Using Macrons on Web Pages
David J. Perry
Member, Educational Computer Applications Committee, ACL
Chairperson, CAES Font Project
Updated
This document is written
for Latin teachers; hence the stress on macrons. However, the principles are applicable to
using a variety of non-English characters on a web page. I assume that readers are creating web pages
and therefore have a basic knowledge of HTML.
If you want to download an Adobe Acrobat (PDF) version to print or read
offline, click on this link [not yet active].
Can you get macrons, breve marks and other special characters on a web page? Yes, you can, with some limitations as explained below. This paper presents some brief background information and then discusses two methods: one that applies to more recent web browsers, and another, more limited one that is supported by any browser.
You need to know about two
concepts: HTML entities and Unicode.
Non-English
characters (e.g., an o with an acute accent or an n with a tilde) and various
symbols (copyright, trademark, etc.) are referred to as HTML entities. They are entered into HTML documents with
the ampersand prefix and with a closing semicolon, as in the following
examples:
&#243; &oacute; (both represent the character ó)
&#241; &ntilde; (both represent the character ñ)
&#174; &reg; (both represent the registered symbol ®)
You can see that there are
two ways of working with these characters: as numeric entities with a
pound sign # followed by a number, or as character entities with a short
descriptive name. The HTML specification
provides names for the characters and symbols most commonly used in European
languages. A list of HTML 4 entities
is available online from http://www.w3.org/TR/1999/REC-html401-19991224/sgml/entities.html#iso-88591
as well as in books that teach HTML.
Other characters—including macrons and breve marks—can be entered by
number only.
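As a quick check that a numeric entity and its named counterpart stand for the same character, here is a minimal Python sketch. It relies on the standard library function html.unescape; the list of pairs and the helper name decode are chosen only for this illustration.

```python
import html

# Pairs of (numeric entity, character entity) that should decode identically.
pairs = [
    ("&#243;", "&oacute;"),  # o with acute accent
    ("&#241;", "&ntilde;"),  # n with tilde
    ("&#174;", "&reg;"),     # registered symbol
]

def decode(entity):
    """Return the character that an HTML entity stands for."""
    return html.unescape(entity)

for numeric, named in pairs:
    # Both spellings must yield the same single character.
    assert decode(numeric) == decode(named)
    print(numeric, named, decode(numeric))
```

A macroned vowel has no name in the HTML 4 list, so only the numeric form is available for it: decode("&#257;") gives a-macron.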
Where do the numbers come
from? The European characters (the ones
with both names and numbers) come from a character set known as Latin-1 or as
ISO 8859-1. The numbers for all other
characters come from the Unicode Standard. Unicode is a character encoding system that
includes almost all the characters found in living languages, plus quite a
number of characters used in historical or scholarly languages. Begun in 1990, Unicode has now emerged as the
standard for character encoding. All
recent operating systems (Windows 95/98/Me/2000/XP as well as Mac OS 8/9/X) and
newer browsers (Netscape Navigator 4.7, Internet Explorer 5, OmniWeb 4, or more
recent versions of these) support Unicode, with the most recent incarnations
having the best Unicode support.
To load a simple test page
that checks whether your browser can display macrons with Unicode, click here.
If you want more information about Unicode, older fonts, and related issues, see the book Word Processing for Classicists, available from http://members.telocity.com/~perryd.
Advantages:
· uses the international standard for characters
· is the way of the future
· supports a huge repertoire of characters for almost all languages
Disadvantages:
· works only with more recent versions of browsers that are written to support Unicode (Netscape Navigator 4.7, Internet Explorer 5, Opera 6, OmniWeb 4, or more recent versions of these)
· works only with Windows 95 or more recent, or with Mac OS 8.1 [8.5??] or more recent
I strongly suggest using
Unicode if at all possible. The only
reason not to is if you want to be absolutely sure that your web pages will be
readable in any version of any browser (however, in order to accomplish this
you have to make certain that the user has the same font on his system that you
used to create the page, which presents another problem). The large majority of users have recent
versions of Netscape Navigator or Internet Explorer that can handle Unicode
just fine. If you are creating web pages
for students to use in your school computer lab or on a school network, then
you know for sure what browser they will use.
If your pages are meant to be accessible to anyone who comes across them
on the web, you need to weigh the benefits of Unicode versus the fact that some
users may still be running older versions of Netscape and IE. For more information about using Unicode on
the web, see Patrick Rourke's excellent page at http://www.stoa.org/unicode/; he has
tested a number of browsers for use with Greek.
The Times New Roman,
You will enter macroned
vowels and other special characters as numeric entities. You therefore need to know the Unicode values
of the characters you want. Most Unicode
charts give the character values in hexadecimal (base 16) notation, but you
must use regular decimal notation when you write the numeric entities. For convenience, the most common characters
of interest to Latin teachers are given in the Appendix at the end of this
paper, in both hex and decimal.
If you ever need to do
this conversion yourself, the Windows Calculator applet can do it for you:
· activate the Start button and choose Programs/Accessories/Calculator
· select View/Scientific from the menu bar
· click the Hex radio button at the upper left (or hit the F5 key) and enter the hexadecimal value
· click the Decimal button (or hit the F6 key) and the hex value will be converted to decimal
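If a scripting language is at hand, the same conversion can be done without the Calculator. The following is a minimal Python sketch; the function names hex_to_decimal and numeric_entity are invented for this illustration.

```python
def hex_to_decimal(hex_value):
    """Convert a hexadecimal code point such as '0101' to a decimal integer."""
    return int(hex_value, 16)

def numeric_entity(hex_value):
    """Build a decimal numeric entity from a hexadecimal code point."""
    return "&#{};".format(hex_to_decimal(hex_value))

print(hex_to_decimal("0101"))  # 257, the decimal value of a-macron
print(numeric_entity("012B"))  # &#299;, the entity for i-macron
```

The hexadecimal values passed in are the ones printed in Unicode charts, with or without leading zeros.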
Once you know the decimal values you need, simply put a numeric entity in the spot where you want the special character to appear. For example, to enter on your web page this phrase:
pilā, with a ball, is different from pīla, javelins
you would type in your HTML editor (code for bold and italic is omitted here for clarity):
pil&#257;, with a ball, is different from p&#299;la, javelins
Note the numeric entities
with opening ampersand and pound sign plus closing semicolon.
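For longer passages it may be easier to type the text with real macrons in a Unicode-aware editor and convert it afterwards. Here is a minimal Python sketch of that idea; the function name to_numeric_entities is invented for this illustration, and it assumes the text uses precomposed vowels with macrons (one character per vowel).

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal numeric entity."""
    return "".join(
        ch if ord(ch) < 128 else "&#{};".format(ord(ch))
        for ch in text
    )

# \u0101 is a-macron and \u012b is i-macron.
phrase = "pil\u0101, with a ball, is different from p\u012bla, javelins"
print(to_numeric_entities(phrase))
# pil&#257;, with a ball, is different from p&#299;la, javelins
```

Python's built-in phrase.encode("ascii", "xmlcharrefreplace") produces the same entities as bytes; the explicit loop is shown only to make the rule visible.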
If you see boxes or
question marks instead of your vowels with macrons when you test your web page,
you may need to adjust the settings in your browser. In Internet Explorer, choose Tools/Internet
options from the menu bar, then click the Fonts button at the bottom of the
window. Change the default font for
Latin-based languages to any of the fonts mentioned above (Times New Roman,
Advantages:
· can work with any browser or operating system, even older ones or ones that do not yet support Unicode
Disadvantages:
· user must have specific font(s) installed, since there is no standard in non-Unicode fonts for macrons and other special characters of interest to classicists
You will enter macroned vowels and other special characters as numeric entities or as character entities. Since older (non-Unicode) Windows and Mac fonts do not normally contain macroned vowels, and because the number of characters in such fonts is limited, the creators of fonts for Latinists replace some standard characters with characters for Latin. There are any number of ways this might be done, but one common method is to replace all vowels with umlauts by vowels with macrons. This is a good system since, if text prepared using such a font ever has to be displayed on a system that lacks the font, the text is still usable; all a-macrons become a-umlauts, and so forth, instead of seemingly random characters that would be hard to interpret.
You must find out from the font documentation the names and numbers of the macroned vowels. You must also be sure to tell those who view your web page what font was used to create the page and how they can get it (or another font with the same character encoding); otherwise, the special characters will not display properly. This might be a workable situation in a school computer lab, for example, which is still running an older browser that does not support Unicode, but where the teacher can make sure that a specific font is available on all machines.
Once you have the information about character names and numbers, simply enter the characters as either numeric or character entities. The following example applies to any font where macrons replace umlauts, including the GaramondLatin font from CAES. To enter this text on your web page:
pilā, with a ball, is different from pīla, javelins
you would type in your HTML editor one of the following (for clarity I omit the code for boldface and italics):
pil&#228;, with a ball, is different from p&#239;la, javelins
pil&auml;, with a ball, is different from p&iuml;la, javelins
In this example we intend
to use a font in which macroned vowels have replaced umlauted ones, so we can
use the HTML names for umlauted vowels (uml is short for umlaut), or the numbers. Remember that all entities begin with an
ampersand and end with a semicolon. You
must use the appropriate names or numbers if your macron font is set up
differently from the ones I have described!
And be sure to specify the font when you prepare your HTML code.
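The substitution can also be automated. The following is a minimal Python sketch that assumes a font in which macrons replace umlauts, as described above; the mapping follows the names listed in the Appendix, and the names MACRON_TO_UMLAUT_NAME and to_umlaut_entities are invented for this illustration.

```python
# Vowel with macron -> HTML name of the umlauted vowel whose slot it
# occupies in a font where macrons replace umlauts (an assumption about
# the font; check its documentation before relying on this table).
MACRON_TO_UMLAUT_NAME = {
    "\u0100": "Auml", "\u0101": "auml",
    "\u0112": "Euml", "\u0113": "euml",
    "\u012a": "Iuml", "\u012b": "iuml",
    "\u014c": "Ouml", "\u014d": "ouml",
    "\u016a": "Uuml", "\u016b": "uuml",
}

def to_umlaut_entities(text):
    """Replace each macroned vowel with the matching umlaut character entity."""
    return "".join(
        "&{};".format(MACRON_TO_UMLAUT_NAME[ch]) if ch in MACRON_TO_UMLAUT_NAME else ch
        for ch in text
    )

print(to_umlaut_entities("pil\u0101, with a ball, is different from p\u012bla, javelins"))
# pil&auml;, with a ball, is different from p&iuml;la, javelins
```

The output shows macrons only when the page names a font built this way and the reader has that font installed; otherwise the vowels appear with umlauts.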
Whether you use Unicode or
an older font, typing can be slow if you need to enter a lot of macroned
vowels. If your HTML editor supports
macros, you can create a macro that will enter the code for each lowercase
vowel with macron and assign the macro to a hot key. Or pick a character that is not used for
anything else (such as HTML tags) and type it after vowels you want to add
macrons to; then do a search and replace.
For example, if you typed a$ wherever you wanted a-macron, then you could search for a$ and replace it with &#257; (257 is the decimal Unicode value for a-macron); do likewise for other vowels. The search-and-replace operation itself could be made into a macro.
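The search-and-replace approach can be sketched in a few lines of Python. This is only an illustration under the assumption that the dollar sign is used as the marker and appears nowhere else in the text; the names MACRON_CODES and expand_markers are invented here.

```python
# Decimal Unicode values for the lowercase vowels with macrons.
MACRON_CODES = {"a": 257, "e": 275, "i": 299, "o": 333, "u": 363}

def expand_markers(text, marker="$"):
    """Replace each vowel followed by the marker with its numeric entity."""
    for vowel, code in MACRON_CODES.items():
        text = text.replace(vowel + marker, "&#{};".format(code))
    return text

typed = "pila$, with a ball, is different from pi$la, javelins"
print(expand_markers(typed))
# pil&#257;, with a ball, is different from p&#299;la, javelins
```

Uppercase vowels could be handled the same way by adding their values from the Appendix to the table.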
If you are using a WYSIWYG
editor (i.e., you see how the actual page will look as you create it, instead
of entering actual HTML codes), you can probably enter the special Latin
characters using the facilities provided by your operating system. On the Mac, you can enter umlauts by typing option-U then a vowel. In Windows, you can turn on num lock, hold down the Alt key, and type on the numeric keypad 0 (zero) plus the three-digit character code.
The documentation for the GaramondLatin font from CAES goes into much
more detail about entering non-Unicode characters, and my web page http://members.telocity.com/~perryd
has more information about entering Unicode.
If you read the list of HTML entities carefully, you will note the presence of &macr; and its equivalent &#175;. This is in fact the macron. However, if you use it in an HTML page, you will get a macron coming after the vowel instead of over it, like this: a¯. Our software is not yet at the point where
we can automatically position accents over base characters, although this may
be possible in a few years. At the
moment we need precomposed characters for each combination of vowel and
diacritic.
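The difference between the stand-alone macron and a precomposed vowel can be seen by decoding both. This is a minimal Python sketch using the standard library modules html and unicodedata; the variable names are chosen only for this illustration.

```python
import html
import unicodedata

spacing = html.unescape("a&macr;")     # the letter a followed by a separate macron
precomposed = html.unescape("&#257;")  # a single a-macron character

# The first string holds two characters, the second only one.
print(len(spacing), [unicodedata.name(ch) for ch in spacing])
print(len(precomposed), unicodedata.name(precomposed))
```

The stand-alone macron occupies its own space on the line, which is why it appears after the vowel rather than over it.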
Macrons are helpful to
beginners as guides to pronunciation as well as to distinguish grammatical
forms and certain words (e.g., iacēre to lie vs. iacĕre
to throw). Most recent books make
use of them, and it is now possible for teachers who are developing web-based
materials to do likewise.
Questions or suggestions for improvements may be sent to perryd2@compuserve.com.
Information about the
GaramondLatin font may be obtained from the home page of the Classical Association of the Empire State. This is the only font specifically designed
for the needs of Latin teachers, and contains many characters in addition to
macroned vowels.
Char. Decimal Hexadecimal
Ā 256 0100
ā 257 0101
Ă 258 0102
ă 259 0103
Ē 274 0112
ē 275 0113
Ĕ 276 0114
ĕ 277 0115
Ī 298 012A
ī 299 012B
Ĭ 300 012C
ĭ 301 012D
Ō 332 014C
ō 333 014D
Ŏ 334 014E
ŏ 335 014F
Ū 362 016A
ū 363 016B
Ŭ 364 016C
ŭ 365 016D
Ȳ 562 0232
ȳ 563 0233
These names and numbers
will work for any other font where vowels with umlauts have been replaced by
vowels with macrons. For instance, some Latin teachers have used fonts designed for Hawaiian that are constructed this way (although these lack Y/y macron).
Note that the names are case-sensitive; that is, Auml is different from auml.
Char. Decimal HTML Name
Ā 196 Auml
ā 228 auml
Ē 203 Euml
ē 235 euml
Ī 207 Iuml
ī 239 iuml
Ō 214 Ouml
ō 246 ouml
Ū 220 Uuml
ū 252 uuml
Ȳ 376 Yuml **
ȳ 255 yuml
** This entity may not be supported in older versions of browsers.