Using Macrons on Web Pages

 

David J. Perry

Rye High School, Rye NY

Member, Educational Computer Applications Committee, ACL

Chairperson, CAES Font Project

Updated January 2, 2002

 

This document is written for Latin teachers; hence the stress on macrons.  However, the principles are applicable to using a variety of non-English characters on a web page.  I assume that readers are creating web pages and therefore have a basic knowledge of HTML.  If you want to download an Adobe Acrobat (PDF) version to print or read offline, click on this link [not yet active].

 

Can you get macrons, breve marks and other special characters on a web page?  Yes, you can, with some limitations as explained below.  This paper presents some brief background information and then discusses two methods: one that works in more recent web browsers, and a more limited one that is supported by any browser.

 

Background

You need to know about two concepts: HTML entities and Unicode.

 

Non-English characters (e.g., an o with an acute accent or an n with a tilde) and various symbols (copyright, trademark, etc.) are referred to as HTML entities.  They are entered into HTML documents with an ampersand prefix and a closing semicolon, as in the following examples:

            &#243;        &oacute;         (both represent the character ó)

            &#241;        &ntilde;         (both represent the character ñ)

            &#174;        &reg;            (both represent the registered symbol ®)

 

You can see that there are two ways of working with these characters: as numeric entities with a pound sign # followed by a number, or as character entities with a short descriptive name.  The HTML specification provides names for the characters and symbols most commonly used in European languages.  A list of HTML 4 entities is available online from http://www.w3.org/TR/1999/REC-html401-19991224/sgml/entities.html#iso-88591 as well as in books that teach HTML.  Other characters, including vowels with macrons and breve marks, can be entered by number only.

 

Where do the numbers come from?  The European characters (the ones with both names and numbers) come from a character set known as Latin-1 or as ISO 8859-1.  The numbers for all other characters come from the Unicode Standard.  Unicode is a character encoding system that includes almost all the characters found in living languages, plus quite a number of characters used in historical or scholarly languages.  Begun in 1990, Unicode has now emerged as the standard for character encoding.  All recent operating systems (Windows 95/98/Me/2000/XP as well as Mac OS 8/9/X) and newer browsers (Netscape Navigator 4.7, Internet Explorer 5, OmniWeb 4, or more recent versions of these) support Unicode, with the most recent incarnations having the best Unicode support.

 

To load a simple test page that checks whether your browser can display macrons with Unicode, click here.

 

If you want more information about Unicode, older fonts, and related issues, see the book Word Processing for Classicists, available from http://members.telocity.com/~perryd .

 

Unicode Method

Advantages:

·        uses the international standard for characters

·        is the way of the future

·        supports a huge repertoire of characters for almost all languages

Disadvantages:

·        works only with more recent versions of browsers that are written to support Unicode (Netscape Navigator 4.7, Internet Explorer 5, Opera 6, OmniWeb 4, or more recent versions of these)

·        works only with Windows 95 or more recent, or with Mac OS 8.1 [8.5??] or more recent

 

Basic Information

I strongly suggest using Unicode if at all possible.  The only reason not to is if you want to be absolutely sure that your web pages will be readable in any version of any browser.  Even then, you have to make certain that the user has the same font on his system that you used to create the page, which presents another problem.  The large majority of users have recent versions of Netscape Navigator or Internet Explorer that can handle Unicode just fine.  If you are creating web pages for students to use in your school computer lab or on a school network, then you know for sure what browser they will use.  If your pages are meant to be accessible to anyone who comes across them on the web, you need to weigh the benefits of Unicode against the fact that some users may still be running older versions of Netscape and IE.  For more information about using Unicode on the web, see Patrick Rourke's excellent page at http://www.stoa.org/unicode/; he has tested a number of browsers for use with Greek.

 

The Times New Roman, Arial, Georgia and Tahoma fonts that ship with recent versions of Windows (and the Mac OS??) are Unicode fonts, and if you use them in your documents you can be sure that almost all users will see the characters properly.  One minor problem: these fonts are based on version 2 of Unicode, and Y-macron and y-macron were added in Unicode 3.  We hope that these fonts will be updated in future OS releases.  Fortunately the number of Latin words requiring y-macron is quite small (peristÿlium and papÿrus are two common examples; since I can't count on your having a font with y-macron, I resorted to an umlaut here).  One freely available Unicode font, designed for classicists and other scholars, is Cardo (see http://members.telocity.com/~perryd); it does include Y/y macron.  See any HTML textbook for the methods of specifying a font to use when displaying a page.  If you use the Unicode values for Y/y macron and the user's font does not support these characters, they will display as question marks or rectangles; no harm is done.

 

Entering the Characters

You will enter macroned vowels and other special characters as numeric entities.  You therefore need to know the Unicode values of the characters you want.  Most Unicode charts give the character values in hexadecimal (base 16) notation, but you should use regular decimal notation when you write the numeric entities.  (HTML 4 also allows hexadecimal references of the form &#x101; but older browsers may not recognize them, so decimal is the safer choice.)  For convenience, the most common characters of interest to Latin teachers are given in the Appendix at the end of this paper, in both hex and decimal.

 


If you ever need to do this conversion yourself, the Windows Calculator applet can do it for you:

·        activate the Start button and choose Programs/Accessories/Calculator

·        select View/Scientific from the menu bar

·        click the Hex radio button at the upper left (or hit the F5 key) and enter the hexadecimal value

·        click the Decimal button (or hit the F6 key) and the hex value will be converted to decimal
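If you would rather script the conversion than use the Calculator, the same arithmetic can be sketched in a few lines of Python.  This is only an illustration: Python is my assumption here, not something the rest of this paper requires, and the function name hex_to_entity is made up for the example.

```python
def hex_to_entity(hex_value: str) -> str:
    """Turn a hexadecimal Unicode value such as '0101' into a decimal numeric entity."""
    decimal_value = int(hex_value, 16)  # interpret the string as a base-16 number
    return f"&#{decimal_value};"        # ampersand, pound sign, decimal number, semicolon


# Hex values taken from the Appendix of this paper
print(hex_to_entity("0101"))  # a-macron
print(hex_to_entity("012B"))  # i-macron
```

The hex values 0101 and 012B correspond to decimal 257 and 299, the same numbers listed in the Appendix.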

 

Once you know the decimal values you need, simply put a numeric entity in the spot where you want the special character to appear.  For example, to enter on your web page this phrase:

pilā, with a ball, is different from pīla, javelins

you would type in your HTML editor (code for bold and italic is omitted here for clarity):

pil&#257;, with a ball, is different from p&#299;la, javelins

Note the numeric entities with opening ampersand and pound sign plus closing semicolon.
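For longer passages, writing the entities by hand gets tedious.  As a rough sketch (again assuming Python, with a hypothetical function name), a short script can replace every non-ASCII character in a piece of text with its decimal numeric entity:

```python
def to_numeric_entities(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric entity."""
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)               # plain ASCII passes through unchanged
        else:
            pieces.append(f"&#{ord(ch)};")  # e.g. a-macron (257) becomes &#257;
    return "".join(pieces)


# \u0101 is a-macron and \u012b is i-macron (see the Appendix)
phrase = "pil\u0101, with a ball, is different from p\u012bla, javelins"
print(to_numeric_entities(phrase))
```

This converts every character outside the basic ASCII range, not just macroned vowels, so it suits text that contains no other special characters you want left alone.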

 

Displaying the Results

If you see boxes or question marks instead of your vowels with macrons when you test your web page, you may need to adjust the settings in your browser.  In Internet Explorer, choose Tools/Internet options from the menu bar, then click the Fonts button at the bottom of the window.  Change the default font for Latin-based languages to any of the fonts mentioned above (Times New Roman, Arial, Georgia, or Tahoma).  If you have a different font on your system which you know has complete Unicode support, such as Cardo, you can choose it also.  In Netscape, choose Edit/Preferences, then Appearance/Fonts.

 

Older Fonts Method

Advantages:

·        can work with any browser or operating system, even older ones or ones that do not yet support Unicode

Disadvantages:

·        user must have specific font(s) installed, since there is no standard in non-Unicode fonts for macrons and other special characters of interest to classicists

 

Basic Information

You will enter macroned vowels and other special characters as numeric entities or as character entities.  Since older (non-Unicode) Windows and Mac fonts do not normally contain macroned vowels, and because the number of characters in such fonts is limited, the creators of fonts for Latinists replace some standard characters with characters for Latin.  There are any number of ways this might be done, but one common method is to replace all vowels with umlauts with vowels with macrons.  This is a good system since, if text prepared using such a font ever has to be displayed on a system that lacks the font, the text is still usable; all a-macrons become a-umlauts, and so forth, instead of seemingly random characters that would be hard to interpret.

 

You must find out from the font documentation the names and numbers of the macroned vowels.  You must also be sure to tell those who view your web page what font was used to create the page and how they can get it (or another font with the same character encoding); otherwise, the special characters will not display properly.  This might be a workable situation in a school computer lab, for example, which is still running an older browser that does not support Unicode, but where the teacher can make sure that a specific font is available on all machines.

 

Entering the Characters

Once you have the information about character names and numbers, simply enter the characters as either numeric or character entities.  The following example applies to any font where macrons replace umlauts, including the GaramondLatin font from CAES.  To enter this text on your web page:

pilā, with a ball, is different from pīla, javelins

you would type in your HTML editor one of the following (for clarity I omit the code for boldface and italics):

pil&#228;, with a ball, is different from p&#239;la, javelins

pil&auml;, with a ball, is different from p&iuml;la, javelins

In this example we intend to use a font in which macroned vowels have replaced umlauted ones, so we can use the HTML names for umlauted vowels (uml is short for umlaut), or the numbers.  Remember that all entities begin with an ampersand and end with a semicolon.  You must use the appropriate names or numbers if your macron font is set up differently from the ones I have described!  And be sure to specify the font when you prepare your HTML code.
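The substitution can also be automated.  The sketch below assumes Python and a font laid out like the ones described here (macrons in place of umlauts); the table and function names are hypothetical, and the names and numbers come from the Appendix.  It will not be correct for a font that is set up differently.

```python
# Unicode value of each macroned vowel -> (HTML name, decimal number) of the
# umlauted vowel whose slot it occupies in a macron-for-umlaut font
MACRON_TO_UMLAUT = {
    0x0100: ("Auml", 196), 0x0101: ("auml", 228),
    0x0112: ("Euml", 203), 0x0113: ("euml", 235),
    0x012A: ("Iuml", 207), 0x012B: ("iuml", 239),
    0x014C: ("Ouml", 214), 0x014D: ("ouml", 246),
    0x016A: ("Uuml", 220), 0x016B: ("uuml", 252),
}


def to_legacy_entities(text: str, use_names: bool = True) -> str:
    """Rewrite macroned vowels as the umlaut entities an older font expects."""
    pieces = []
    for ch in text:
        entry = MACRON_TO_UMLAUT.get(ord(ch))
        if entry is None:
            pieces.append(ch)  # not a macroned vowel: leave unchanged
        else:
            name, number = entry
            pieces.append(f"&{name};" if use_names else f"&#{number};")
    return "".join(pieces)


print(to_legacy_entities("pil\u0101"))                   # character entity
print(to_legacy_entities("pil\u0101", use_names=False))  # numeric entity
```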

 

Speeding Things Up

Whether you use Unicode or an older font, typing can be slow if you need to enter a lot of macroned vowels.  If your HTML editor supports macros, you can create a macro that will enter the code for each lowercase vowel with macron and assign the macro to a hot key.  Or pick a character that is not used for anything else (such as HTML tags) and type it after vowels you want to add macrons to; then do a search and replace.  For example, if you typed a$ wherever you wanted a-macron, then you could search for a$ and replace it with &#257; which is the numeric entity for a-macron (Unicode value 257); do likewise for other vowels.  The search-and-replace operation itself could be made into a macro.
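If your editor has no macro facility, the search and replace can be done by a small script instead.  Here is a minimal sketch, assuming Python, the dollar sign as the marker character, and the Unicode decimal values from the Appendix; the names are hypothetical and you would change the table if you use an older font.

```python
# marker typed while writing -> numeric entity for the vowel with macron
MARKER_TO_ENTITY = {
    "a$": "&#257;",
    "e$": "&#275;",
    "i$": "&#299;",
    "o$": "&#333;",
    "u$": "&#363;",
}


def expand_markers(text: str) -> str:
    """Replace each vowel-plus-marker pair with the matching numeric entity."""
    for marker, entity in MARKER_TO_ENTITY.items():
        text = text.replace(marker, entity)
    return text


typed = "pila$, with a ball, is different from pi$la, javelins"
print(expand_markers(typed))
```

Only lowercase vowels are handled here; capitals could be added to the table in the same way.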

 

If you are using a WYSIWYG editor (i.e., you see how the actual page will look as you create it, instead of entering actual HTML codes), you can probably enter the special Latin characters using the facilities provided by your operating system.  On the Mac, you can enter umlauts by typing option-U then a vowel.  In Windows, you can turn on num lock, hold down the Alt key, and type on the numeric keypad 0 (zero) plus the three-digit character code.  The documentation for the GaramondLatin font from CAES goes into much more detail about entering non-Unicode characters, and my web page http://members.telocity.com/~perryd has more information about entering Unicode.

 

Note for the Observant

If you read the list of HTML entities carefully, you will note the presence of &macr; and its equivalent &#175; .  This is in fact the macron.  However, if you use it in an HTML page, the macron will appear after the vowel instead of over it, like this:  aˉ.  Our software is not yet at the point where it can automatically position accents over base characters, although this may be possible in a few years.  At the moment we need precomposed characters for each combination of vowel and diacritic.

 

Conclusion

Macrons are helpful to beginners as guides to pronunciation as well as to distinguish grammatical forms and certain words (e.g., iacēre to lie vs. iacĕre to throw).  Most recent books make use of them, and it is now possible for teachers who are developing web-based materials to do likewise.

 

Resources

Questions or suggestions for improvements may be sent to perryd2@compuserve.com .

 

Information about the GaramondLatin font may be obtained from the home page of the Classical Association of the Empire State.  This is the only font specifically designed for the needs of Latin teachers, and contains many characters in addition to macroned vowels.

 

 

 

Appendix: Character Codes

 

Unicode Fonts


Char.   Decimal   Hexadecimal
Ā       256       0100
ā       257       0101
Ă       258       0102
ă       259       0103
Ē       274       0112
ē       275       0113
Ĕ       276       0114
ĕ       277       0115
Ī       298       012A
ī       299       012B
Ĭ       300       012C
ĭ       301       012D
Ō       332       014C
ō       333       014D
Ŏ       334       014E
ŏ       335       014F
Ū       362       016A
ū       363       016B
Ŭ       364       016C
ŭ       365       016D
Ȳ       562       0232
ȳ       563       0233


 

 

CL Fonts GaramondLatin

These names and numbers will work for any other font where vowels with umlauts have been replaced by vowels with macrons.  For instance, some Latin teachers have used fonts designed for Hawaiian that are constructed this way (although these lack Y/y macron).  Note that the names are case-sensitive; that is, Auml is different from auml.

 


Char.   Decimal   HTML Name
Ā       196       Auml
ā       228       auml
Ē       203       Euml
ē       235       euml
Ī       207       Iuml
ī       239       iuml
Ō       214       Ouml
ō       246       ouml
Ū       220       Uuml
ū       252       uuml
Ȳ       376       Yuml **
ȳ       255       yuml

  ** This entity may not be supported in older versions of browsers.