TTF_EDIT

A TrueType font table editing tool

User's Guide

Release 0.91, 16 Oct 1998
Copyright 1998
Richard J. Kinch kinch@truetex.com
Unauthorized copying of this documentation or the
accompanying software is prohibited.

Understanding TrueType Font Editing

Editing TrueType encodings is difficult or impossible in the conventional graphical font editor applications (such as Fontographer, Font Lab, Type Designer, or FontMonger). This TrueType font utility takes a different approach to editing font information. While all of the popular Windows and Macintosh font editors provide a graphical user interface, ttf_edit provides a language-based interface. The compelling benefit of this design is that it provides a much simpler and more powerful method for you to manipulate the bulk of a TrueType font's information.

Most of the data in a TrueType font file is tabular data (names, encodings, metrics, kerning), not graphical shapes (glyph outlines). The tables dealing with encoding and naming are especially appropriate for manipulation through a language. Indeed, the only aspect of a TrueType font which is compellingly appropriate for a graphical interface is the glyph shapes (either the outlines or bit maps).

The ttf_edit editor uses a simple stack-based language to read and modify TrueType font files. It is a command-line program, which you run in the console environment of your operating system: either MS-DOS, an MS-DOS window in Windows 3.1, a WIN32 console window in Windows 95/98/NT, or a Unix shell. You control the program with the arguments you give on the command line. You can thus display and modify encodings and other contents of the font in a very powerful and general way. Using ttf_edit, you convert information from the font (such as encodings or names) into textual tables, which you then can edit using a conventional text editor; then you use ttf_edit to insert these modified tables back into the font to create a modified version of the font.

This is not a font design tool: It will not change the shape or hinting of glyphs in a font (not yet, anyway), it will not add or remove glyphs from a font (from encodings, yes, but not from the set of glyphs available in the font), nor will it convert between TrueType and Type 1 (ATM) formats.

The lack of effect on hinting in ttf_edit is positively virtuous: you can change encodings in TrueType fonts, without altering their appearance! If you open, modify, and save a TrueType font in commercial font editors, you typically lose the original TrueType hints and replace them with something greatly inferior.

You can convert TrueType fonts between the so-called Microsoft, Apple, and Macintosh variants of the TrueType format, since these variations are simply configurations of encoding and naming tables (however, if the Macintosh font doesn't contain the full set of Windows glyphs with the proper names, the Windows version will be incomplete). But ttf_edit only supports individual fonts (.ttf files); it will not make TrueType collection (.ttc) files from individual (.ttf) fonts, or vice versa (although Microsoft has free tools to do so), nor does it at present allow you to combine glyphs from several fonts into a new font.

What ttf_edit *will* do better than any visual approach is to report or modify encoding(s) or name(s) of TrueType fonts.

For a quick start, see the EXAMPLES section below.

Notation in This Document

Keywords, constants, and other ttf_edit commands to be typed into the computer are denoted in this document in a typewriter typeface. Values meant to be substituted are in an italic typeface. For example:

This is a keyword in the typewriter face. Many commands take arguments, such as a filename, or a constant value like M, such as decimal 3 or hexadecimal 0x20.

Using ttf_edit

To run ttf_edit, use the operating system's command-line interface (Windows console, Linux shell, etc.) and supply a set of arguments to perform some task:

ttf_edit [stack-command] ...

The ttf_edit interpreter executes the arguments you provide using a stack-oriented model, much like a PostScript interpreter. In other words, you use a language to tell ttf_edit to do what you want. To do simple things to a font, you can just give arguments on the command line. To do more complicated things, you will want to edit a file containing commands, and have ttf_edit take commands from that file, as explained in detail below.

Running ttf_edit without arguments will simply display a copyright and version message:

    ttf_edit: Copyright 1998 Richard J. Kinch (kinch@truetex.com)
    For documentation and help see the file ttf_edit.htm
    For updates and support see http://truetex.com
    Release release-number for operating-system, date

Stack Parsing and Data Types

The command input to ttf_edit comes initially from the command line arguments, which may in turn refer to commands in files. The interpreter removes each argument, one at a time, from the command input, from left to right; pushes this argument as a string on the stack; and "executes the stack contents." "Executing the stack contents" means that if the top item is a string which names a command, then that string is popped from the stack, and the associated command executes; otherwise the stack remains unchanged. When most commands execute, they affect the stack contents in some useful way.
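The push-then-execute cycle described above can be modeled in a few lines of Python. This is only a toy sketch of the parsing model, not the tool's actual implementation: it knows just the pop, dup, and stack commands, and the function name `interpret` is a hypothetical choice made for illustration.

```python
def interpret(args):
    """Toy model of the ttf_edit argument loop (illustration only)."""
    stack = []
    printed = []  # collects what the 'stack' command would display

    def do_pop():
        stack.pop()

    def do_dup():
        stack.append(stack[-1])  # a second reference to the same object

    def do_stack():
        printed.extend(str(item) for item in stack)

    commands = {"pop": do_pop, "dup": do_dup, "stack": do_stack}

    for arg in args:
        stack.append(arg)          # every argument is first pushed as a string
        if stack[-1] in commands:  # "executing the stack contents"
            name = stack.pop()     # the command name itself is removed...
            commands[name]()       # ...and the command then acts on the stack
    return stack, printed


final_stack, shown = interpret(["arial.ttf", "dup", "stack", "pop"])
```

Arguments that do not name a command (here the file name) simply stay on the stack until a later command consumes them.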

While ttf_edit is holding objects on the stack, it retains them as either strings or font objects. This loose typing of objects is much more relaxed than, for example, PostScript. The ttf_edit interpreter coerces strings into integers, commands, or other types as needed to fit the syntax of each command that affects the stack. If an error in your input causes this coercion to fail (for example, if you were to give a non-numeric argument to a command expecting an integer), then ttf_edit will emit various error messages and exit. Note that this simple parsing method does not allow any command name to be used as a file name; this is not a problem since file names usually carry a dot-extension and no command name contains a dot.

Besides affecting the stack, commands can also have side effects such as output to files or the standard output.

Normally each word of input (separated by white space) is considered a separate input token for the stack, either a command or an argument to a command. You can specify strings containing white space by surrounding them with parentheses, similar to the PostScript language syntax (but the parser is still a bit incomplete, so this only works in files executed by the run command, not on the command line arguments themselves).

Commands and Operators

The following is a list of all the commands ttf_edit understands. Before studying these command descriptions, which can be difficult to understand in isolation, you may want to skip below to the "EXAMPLES" section to get a better sense of how you usually run ttf_edit. We describe the effect of each command below in the same style as is used in Adobe's PostScript Language Reference: the command keyword is preceded by the contents of the stack before the command is applied, and followed by the contents of the stack after the command is applied.

item pop -

Pops off and discards the top item on the stack.

item dup item item

Pushes a duplicate reference to the top item onto the stack. Another reference to the object, not a second memory allocation, is then on the stack.

filename font font

Allocates memory storage for a font object, reads a font file into that object, and pushes a reference to the font object onto the stack. A font object consists of both in-memory copies of the font file's information and references to portions of the font file itself. The font file will remain opened for reading by the font object until the font object is free'd.

font list -

Displays a table of font tables, followed by all contents of "name" and "cmap" tables, for all platforms, on the standard output. Useful for discovering what encodings are available in a TrueType font and by what names the font may be known. Displays any Unicode names in ASCII by truncating the high byte. See also listname.

The listing of font tables will give for each entry: the table name, checksum and offset in the original file (not the same as what will be written in a generated font); length of the table in the original file, if there is no imported data, or else the length of the imported data; a pointer to the in-memory elaborated data structure for the table (applies to certain tables only like cmap and name); and any pointer to imported data from an import command. A pointer value of zero means that the respective table has not been elaborated in memory or has no imported data.

After the list of font tables, list shows all of the naming table entries; the usual font will have dozens of such entries. Each entry is displayed as in the following example:

    ===================================
    Name 1 0 0x0000 1:
    Macintosh Roman (English)
    font family name = "Times New Roman"
    ===================================

which gives, respectively, the platform ID, encoding, language, and name codes, followed by a more meaningful interpretation, and then a description of the type of name followed by its value. In the example above, the platform is 1.0 (Macintosh Roman), the language code is 0x0000 (English), the name type is 1 (font family name), and the value of the name is "Times New Roman". The TrueType specification gives complete lists of all the various codes employed.

After this list of names, list dumps the uninterpreted contents of the cmap encoding table(s) ("subtables") present in the font. These subtables can take several different formats. The first few lines of the listing will show the platform ID, how many glyphs are encoded, and the format; this is followed by a dump of the (perhaps lengthy) encoding tables. The encoding tables dumped in this form are difficult to interpret; you should use the afm command to list encodings in a more readable form. A typical entry would appear as:

    cmap subtable 3.1 (Microsoft UGL char set w/ Unicode indexing)
    Maps 334 glyph(s) via glyphIdArray:
    format=4 length=1436 version=0 segCountX2=188 searchRange=128
    entrySelector=6 rangeShift=60

followed by lists of numbers for the endCode, startCode, idDelta, idRangeOffset, and glyphIdArray arrays.
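The searchRange, entrySelector, and rangeShift values in a format 4 header are not independent: the TrueType specification derives all three from the segment count. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the code is not taken from ttf_edit itself.

```python
def format4_search_fields(seg_count_x2):
    """Derive the binary-search header fields of a format 4 cmap subtable."""
    seg_count = seg_count_x2 // 2
    # entrySelector = floor(log2(segCount))
    entry_selector = seg_count.bit_length() - 1
    # searchRange = 2 * (largest power of two not exceeding segCount)
    search_range = 2 * (1 << entry_selector)
    # rangeShift = 2 * segCount - searchRange
    range_shift = seg_count_x2 - search_range
    return search_range, entry_selector, range_shift


fields = format4_search_fields(188)  # segCountX2 from the listing above
```

For segCountX2=188 this yields searchRange=128, entrySelector=6, and rangeShift=60, which agrees with the sample listing. A font whose header disagrees with these derived values is one kind of "benignly faulty" file.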

Next comes horizontal header (hhea) information, which constitutes some horizontal metrics for the font as a whole.

font M N afm -

Outputs the names, metrics, and an encoding table for platform M.N on the standard output, in the form of an Adobe Font Metrics (AFM) file. (See below for the meaning of the numbers M and N.) This produces an ASCII file you can modify with a text editor or by other methods.

Note that a TrueType font can have several encodings, one for each of the various values of M.N, which are associated with various operating system platforms and encoding standards, and that the AFM output will vary significantly depending on which encoding you select.

The afm command generates its output in terms of the PostScript character names given in the font's "post" table, and combines these with the cmap encoding you specify. This form is not directly given in the TrueType font, since ttf_edit is projecting (in the set-theoretic sense) the cmap table with the post table.

In this release, ttf_edit supports post formats 1.0 and 2.5, which implicitly give PostScript glyph names, and format 2.0, which explicitly gives glyph names and is required for encodings used in Windows. (The post format describes how the glyph name list is stored in the font file; it is altogether different from the platform ID.)

A typical AFM output line for a character's metrics looks like:

    CH <0020> ; N space ; WX 278 ; B 86 0 195 716 ;

meaning that hexadecimal character code 0x0020 has the name "space", with an advance width of 278 units, and a bounding box with corners at coordinates (86,0) and (195,716) units, the units being PostScript's 1/1000 em units.

Since the AFM format uses 1/1000 em to measure metric values, versus the TrueType usage of 1/unitsPerEm units, ttf_edit applies scaling and rounding to the TrueType metric values when reporting them as AFM metrics. The conversion is computed as follows:

    AFM_Value = int((TrueType_Value*1000+unitsPerEm/2)/unitsPerEm)
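As a worked illustration, the conversion formula above can be written directly in Python. The function name and the sample values (a 2048-unit em and a width of 569) are assumptions chosen for the example, not values taken from any particular font.

```python
def truetype_to_afm(value, units_per_em):
    """Scale a TrueType metric to AFM units of 1/1000 em, per the formula above.

    Adding units_per_em/2 before dividing rounds to the nearest unit for
    non-negative values; int() truncates toward zero, so negative values
    (such as descenders) may round differently than a strict round-to-nearest.
    """
    return int((value * 1000 + units_per_em / 2) / units_per_em)


width_afm = truetype_to_afm(569, 2048)   # 569/2048 em is about 277.8/1000 em
```

Here 569 TrueType units on a 2048-unit em become 278 AFM units. A font whose unitsPerEm is already 1000 passes through unchanged.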

The afm command assumes the font creator has correctly computed and stored metrics in the font's metric tables, and takes the advance width and bounding-box metrics from tables in the font file, not by analyzing the glyph shapes. If the metrics (such as a bounding box) given in the font are inaccurate, the command output will unwittingly report the inaccurate value.

The encoding information for each character consists of the hexadecimal code paired with the name. The width and bounding box are glyph measurements of no significance to the encoding; however these metric values may be useful as input to other tools that read AFM files.
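If you process the AFM output with your own tools, a character metric line like the one shown earlier can be split into its fields with a short routine. The following Python sketch is one possible approach; the function name and the returned dictionary layout are hypothetical. It accepts both the hexadecimal "CH <...>" form and the decimal "C ..." form that the AFM format defines.

```python
def parse_char_metric(line):
    """Parse one AFM CharMetrics line into a dict (illustrative sketch)."""
    fields = {}
    for part in line.split(";"):
        words = part.split()
        if words:
            fields[words[0]] = words[1:]   # key, then its space-separated values

    if "CH" in fields:                     # hexadecimal code, e.g. CH <0020>
        code = int(fields["CH"][0].strip("<>"), 16)
    else:                                  # decimal code, e.g. C 32 or C -1
        code = int(fields["C"][0])

    return {
        "code": code,
        "name": fields["N"][0],
        "width": int(fields["WX"][0]),
        "bbox": tuple(int(v) for v in fields["B"]),
    }


entry = parse_char_metric("CH <0020> ; N space ; WX 278 ; B 86 0 195 716 ;")
```

Only the code and name matter for re-encoding; the width and bounding box are carried along for the benefit of other tools.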

Besides the per-glyph encoding and metric information, the AFM output will also contain per-font information and any horizontal or vertical kerning information present in the TrueType font. The AFM output should be complete enough for use with other AFM-reading tools. As such it is a useful, standardized abstract of the TrueType font information. The following global font information items are output, to the extent they are available in the TrueType font:

    FontName
    FullName
    FamilyName
    Weight
    CapHeight (yMax of letter "H", if present)
    XHeight (yMax of letter "x", if present)
    FontBBox
    Version
    Notice
    EncodingScheme
    UnderlinePosition
    UnderlineThickness
    ItalicAngle
    IsFixedPitch

The encode operator (see below) accepts AFM files as a specification for re-encoding a TrueType font. Thus you can use the afm command to create an AFM file for a font, modify the encoding specified by that AFM file using an ASCII editor or other tool, and reflect the encoding changes back into the original font with the encode command.

font afm-filename M N encode font

Re-encodes a font object (that is, modifies the TrueType cmap encoding table in memory) according to AFM file afm-filename for platform ID M and platform-specific ID N, where M and N are integers. Only the in-memory version of the font (that is, the font object), and not the underlying file, is modified; the gen operator should follow after if you wish to produce a modified TrueType font file.

The encode command changes only the encoding of the target font. The encode command will not alter any other information in the font, such as names, metrics, or kerning, even though the AFM file may specify such items. Other commands, such as rename and setaw, perform such changes.

Some values for M and N are as follows. Almost all western fonts contain encodings for ID's 3.0 or 3.1 (for Windows) and/or 1.0 (Macintosh).


Encoding Table Platform ID Numbers

M.N  Description

3.1  Microsoft UGL (Unicode or ANSI) encoding (default). (This is the encoding Windows uses for ordinary, textual fonts.) These are the fonts which appear in a Windows font selection dialog which filters out all but textual fonts. For example, Windows 95/98 will display only these fonts in its system font selection list in the screen appearance property settings.

3.0  Microsoft unspecified encoding (that is, a random or indeterminate encoding). This is the encoding id Windows uses for symbol fonts (like Symbol). A font with this encoding will not appear in a Windows font selection dialog which displays only textual fonts, since Windows does not expect the random encoding to be meaningful for text usage. Note that while the Windows font "Symbol" uses a 3.0 cmap, it is not strictly correct to say of all 3.0-cmapped fonts that they use "symbol" encoding, as if that implied that the font contained a certain set of symbols. But it is common terminology to call such fonts "symbol fonts", which really only means that the font could contain any random set of glyphs, but probably does not contain the ANSI or Unicode set. If you think about it, the word "symbol" here is just doubletalk, and is practically a synonym for "glyph". So saying "symbol font" is like saying "glyph font", which is absurd. Sorry for getting so peevish about this.

0.N  Apple Unicode; meaning of N is undefined (typically, N=0).

1.N  Macintosh; N is the script number (typically N=0). What "script" means is not in the TrueType standard, so it must be an issue of Macintosh design.

2.0  ISO 7-bit ASCII

2.1  ISO 10646 (Unicode)

2.2  ISO 8859-1 (Latin-1)

The TrueType fonts supplied with Windows 95/98 and NT (like Times New Roman) actually follow a subset of Unicode that is much bigger than the ANSI set, so it is strange that they use 3.1 cmaps instead of 2.1. But no doubt doing it correctly would have broken something in Windows or in old Windows applications.

TrueType fonts can have a number of encoding tables. Windows fonts typically have the 3.1 (ANSI/Unicode) or 3.0 (Symbol) encoding, plus the Macintosh 1.0 encoding; this is an attempt at cross-platform encoding compatibility.

It is apparently an error (or at least undefined behavior) to install a font in Windows having both a 3.0 and a 3.1 encoding table in cmap, since Windows treats a font differently in one case versus the other, and expects only one or the other to be present. (A correct TrueType font could indeed have both such tables, and there are even good reasons for making such a font.) Moreover, when changing a Microsoft-platform encoding, one must change the OS/2 table usFirstCharIndex and usLastCharIndex entries to match the new encoding's entries, except that (according to the TrueType spec when describing these entries in the OS/2 table specification) Windows pretty much expects that a 3.1 table will (!) have usFirstCharIndex=0x0020 and usLastCharIndex=0xf002. In the 1.66 revision of TrueType 1.0, the OS/2 table has a new version 1 that adds two entries ulCodePageRange1 and ulCodePageRange2, which flag which code pages are supposed to be present in the font (bit 0 indicates the Latin 1 code page 1252, etc.). The interpretation seems to vary depending on whether a 3.1 versus a 3.0 encoding table is present; this is a mess of non-orthogonal considerations since 3.0 versus 3.1 was supposed to indicate Symbol versus ANSI code pages anyway (there is, after all, a separate 2.1 encoding platform to indicate Unicode, instead of 3.1 growing into Unicode).

If the previous paragraph frightens or confuses you, fear not. The encode command in ttf_edit will automatically adjust the OS/2 table values for you, so that Windows will find your new font palatable!

Often an encoding will not cover all of the glyphs defined in the font, and ttf_edit will report:

    Encoding covers X of Y glyph(s) in this font

where X is less than Y. This is normal in a multi-platform font, where the glyph set in the font covers the needs of various platforms. You can see which glyphs are unencoded by generating an AFM encoding file and examining the characters at the end of the CharMetrics section having code -1.

An encoding may map more than one character code to a given glyph; indeed, in many situations this is desirable. Examples: (1) An all-caps font could map both the upper- and lower-case ASCII code for each letter to the upper-case rendering of the glyph. (2) An unfinished font does not contain certain glyphs, and you desire to map the codes for those missing glyphs to print as a blank space instead of the missing-glyph box, so you map those codes to "space". (3) Special codes like space (space character) versus nbspace (non-breakable space) can map to the "space" glyph.

AFM files input to ttf_edit do not need to be sorted by character code, although this sorting is required by the AFM format standard. Any metric or other information in the AFM file will be ignored by ttf_edit when it applies the encode command.

font M N MM NN platform font

With respect to the M.N encoding, alters the encoding platform id from M to MM and the encoding id from N to NN. In other words, the M.N encoding is moved to be the new MM.NN encoding. This alters the encoding id in the cmap entry, and all entries in the name table for that id as well.

The destination encoding MM.NN cannot already exist in the font. To swap two encodings, use 3 platform commands to move the first encoding to a temporary id, such as 99.99, and then move the second to the first id, and finally the temporary to the second id. To re-arrange a larger set of encodings, you must move in a dance of similar fashion. Confused? See the examples below.
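The three-step swap through a temporary id can be pictured with a small Python model. This is only a sketch of the bookkeeping, not of ttf_edit's internals: the dictionary, the table labels, and the function name `move_encoding` are all hypothetical.

```python
def move_encoding(encodings, src, dst):
    """Model of 'M N MM NN platform': move encoding src to the unused id dst."""
    if src not in encodings:
        raise KeyError("no such encoding: %d.%d" % src)
    if dst in encodings:
        # Mirrors the rule that the destination id must not already exist.
        raise ValueError("destination encoding %d.%d already exists" % dst)
    encodings[dst] = encodings.pop(src)


# A font with a Macintosh 1.0 table and a Microsoft 3.1 table (labels invented).
encodings = {(1, 0): "table A", (3, 1): "table B"}
TEMP = (99, 99)

move_encoding(encodings, (1, 0), TEMP)     # first encoding -> temporary id
move_encoding(encodings, (3, 1), (1, 0))   # second encoding -> first id
move_encoding(encodings, TEMP, (3, 1))     # temporary -> second id
```

After the three moves the two tables have exchanged ids, and at no point did a move target an id that was already occupied.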

font ttf-filename gen -

Generate a new font file from a font object, writing to the file named ttf-filename.

If you have edited anything in the font object, the command will generate a new, altered version of that table in the new font file. For example, if you changed the encoding with the encode command, gen will generate a new "cmap" subtable in the new font file.

The command will copy (verbatim) those tables which you have not modified, exactly from the original. For example, the "glyf" table (which ttf_edit has no way of modifying) will be reproduced verbatim in the new font file.

This command will not modify the original file from which the font object came. In fact, the command will not write over any existing file, so you have to make sure that a file named ttf-filename does not already exist.

font free -

Discards a font object by freeing the memory it occupied and closing the associated file. This should not normally be needed unless a script loads many fonts and ttf_edit exits with an "Out of memory!" message, or file pointers are exhausted. The font object must not have any dup'ed copies remaining on the stack. However, another font object on the stack could independently refer to the same font file, if the object were created by repeating the file name and font command.

item ... stack item ...

Writes text representations of every object on the stack to the standard output file, but leaves the stack unchanged. This is useful for debugging scripts of commands.

... force ...

Forces any subsequent font command to continue reading an input file as a TrueType font, even if the file contains errors or inconsistencies. Without force, ttf_edit will exit if it discovers an inconsistency in the header of an input TrueType font file (except bad checksums, which are simply announced). With force, ttf_edit will attempt to make the best interpretation it can, but a truly corrupted file can crash the program.

TrueType fonts are often not "exactly correct" in format. Commercial font-editing tools, font foundries, and even Windows itself have been distributing benignly faulty fonts. In these cases you must give ttf_edit a force command before the font command.

If ttf_edit exits because your input font is corrupted, the font is probably not going to be readable. We provide force just to make sure you can try.

A checksum error alone in an input font will not cause ttf_edit to exit, so you don't need to use force just for that.

Make sure that you specify force before the intended font command.

filename run -

Executes contents of the file identified by filename; in other words, interprets the characters in that file as a ttf_edit language program.

Comments in the file begin with '%' (percent sign), as in PostScript; that is, blank lines and a '%' followed by anything else on a line will be ignored.

String tokens may be surrounded by matched parentheses, which allows you to specify string tokens which contain white space. Use the characters "\)" to put a right parenthesis within a parenthesized string token.

After the file is interpreted, ttf_edit resumes taking commands just after the run command. Files may run other files in a nested fashion.

Although run itself leaves nothing on the stack, the program executed by run may alter the stack arbitrarily.

While files are allowed to have lines of any length, individual tokens in input files are presently limited to 256 characters in length.
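The tokenizing rules for run files described above (white-space-separated words, '%' comments, and parenthesized strings) can be sketched in Python as follows. This is a hedged illustration rather than the program's real parser: the function name is hypothetical, and the handling of nested unescaped parentheses (counted PostScript-style) and of a backslash before any character (taken literally) are assumptions that the text above does not spell out.

```python
def tokenize(text):
    """Split script text into tokens: bare words or (parenthesized strings)."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "%":
            # Comment: ignore everything up to the end of the line.
            while i < n and text[i] != "\n":
                i += 1
        elif ch == "(":
            i += 1
            depth, buf = 1, []
            while i < n:
                ch = text[i]
                if ch == "\\" and i + 1 < n:
                    buf.append(text[i + 1])   # escaped character, kept literally
                    i += 2
                    continue
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                    if depth == 0:            # matching close: string is complete
                        i += 1
                        break
                buf.append(ch)
                i += 1
            else:
                raise ValueError("unterminated parenthesized string")
            tokens.append("".join(buf))
        else:
            start = i
            while i < n and not text[i].isspace() and text[i] not in "(%":
                i += 1
            tokens.append(text[start:i])
    return tokens


script = "% rename the family name\n3 1 0x0409 1\n(Times New Roman)\nrename % done\n"
tokens = tokenize(script)
```

Each resulting token would then be pushed and executed in the same way as a command-line argument.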

font M N Lid Nid NameString rename -

Sets the name for platform M.N, language ID Lid, and name ID Nid to NameString. The four numbers M N Lid Nid are the same codes that appear in the output of the list command.

Some commonly used platform-language combinations are:

    M N   Lid    Nid  Meaning
    ---------------------------------------------------
    1 0  0x0000  0-7  Macintosh 1.0 Roman English
    3 1  0x0409  0-7  Microsoft 3.1 US English

(The full list of language ID's is in the Microsoft TrueType specification.)

The value of Nid ranges from 0 to 7 as follows:

Nid Meaning
---------------------------
 0   copyright notice
 1   font family name
 2   font subfamily name
 3   unique font ID
 4   full font name
 5   version string
 6   PostScript name
 7   trademark notice

The NameString, which you specify in ASCII, will be padded with zero bytes to extend it to Unicode format on those platforms where names are stored as Unicode, namely Microsoft 3.x or Apple Unicode 0.x (but not on Macintosh 1.x or ISO 2.x).

Use the list command to see all of the names defined for a font.

The easiest way to change a name entry for a font is to (1) redirect the output of the listname command (see below) to a file (which will give the existing ID and names), (2) edit the file into a script of renaming commands, that is, by prefixing with commands to create a font object and appending the rename command, and (3) apply the script with the run command.

There is presently no way to delete name entries or to add new entries that do not already exist in a font; in other words, ttf_edit assumes you are editing a font that originally had all the appropriate name slots, such as would be output by a graphical font editor.

There is also presently no way to specify true Unicode strings for names. The ASCII names you give will be converted to Unicode as appropriate, but all characters will be from the ASCII code page of Unicode (that is, the most significant byte of each 16-bit code will simply be padded with a zero).
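The zero-padding just described is easy to picture in Python. The sketch below assumes the big-endian 16-bit layout that TrueType uses for Unicode name strings; the function name is hypothetical.

```python
def ascii_name_to_unicode(name):
    """Widen an ASCII name to 16-bit codes by prefixing each byte with zero."""
    raw = name.encode("ascii")        # fails loudly on non-ASCII input
    widened = bytearray()
    for byte in raw:
        widened += bytes((0, byte))   # high byte zero, low byte the ASCII code
    return bytes(widened)


padded = ascii_name_to_unicode("Times")
```

The result is byte-for-byte the same as Python's own `"Times".encode("utf-16-be")`, which is why only the ASCII range of Unicode can be expressed this way.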

font M N Lid Nid listname -

Displays the name for platform M.N, language ID Lid, and name ID Nid. The four numbers M N Lid Nid correspond to the output from the list command, and specify the platform, language, and name, as described in the rename command.

The output of listname is in a form suitable for re-use as arguments to a rename command:

    % Platform
    % Language
    % Name description
    [M] [N] [Lid] [Nid]
    (Text of name in parentheses)

The comments (starting with %) describe the name; the four numbers [M] [N] [Lid] [Nid] specify the name indexes; and the parenthesized string gives the text of the existing name in the font object.

To alter the value of a name easily, redirect the listname output to a file, edit the string by hand, prefix a dup command and append a rename command, and then give the file as further input to ttf_edit via run.

To list all the names for a font, use the list command.

NOTE

The following two commands, import and export, allow you to arbitrarily read or replace the contents of a TrueType table without any tests or guarantees of the validity of the resulting font. While ttf_edit will recompute checksums and otherwise make a valid overall file when gen'ing a font with import'ed tables, the correctness of the contents of the imported table(s) is your responsibility. It is widely known that corrupted font files can cause severe problems in, or even crash, operating systems such as Windows.

font table-name filename import font

Reads raw data for the table table-name in the font object from file filename. If you later gen a font file from the font object, ttf_edit will recompute the correct checksum for the table and use the imported data for the generated font file.
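For reference, the per-table checksum defined by the TrueType specification is the 32-bit sum of the table's big-endian 32-bit words, with the table treated as zero-padded to a multiple of four bytes. If you want to check a raw table file before importing it, a minimal Python sketch of that arithmetic looks like this (the function name is hypothetical, and the code is not taken from ttf_edit):

```python
import struct


def table_checksum(data):
    """Sum the table as big-endian uint32 words, modulo 2**32."""
    padded = data + b"\0" * (-len(data) % 4)   # zero-pad to a 4-byte multiple
    total = 0
    for (word,) in struct.iter_unpack(">I", padded):
        total = (total + word) & 0xFFFFFFFF    # wrap around at 32 bits
    return total


checksum = table_checksum(b"\x00\x00\x00\x01\x00\x00\x00\x02")
```

The two words in the sample sum to 3; a one-byte table b"\x01" is padded to a single word and sums to 0x01000000.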

Due to a limitation of how ttf_edit uses certain tables (cmap, name, OS/2, and post), you should not use a font object after importing tables into it, except to import more tables or to generate a new font file with the gen command that reflects the imported tables.

If you want to edit or otherwise use a table that has been import'ed, you must first create a temporary font file from the font object with gen and then create a new font object from the temporary file with the font command. For example, if you want to use afm to create an encoding file from an import'ed cmap table, you would do the import, gen a temporary file, font the temporary file (creating a new font object), and finally afm the new font object.

The table must already exist in the font; ttf_edit can replace existing tables with outside data via import but cannot [yet] insert new tables.

Note that after the import command, unlike export, the font object is retained on the stack; you do not have to dup the font object if you will be needing it again.

font table-name filename export -

Writes raw data from table table-name in the font object to file filename. The file filename must not already exist.

The data is exported from the font object, not the original file, so the exported table will reflect any intervening editing that you may have done on the font object. Even an unmodified table may be generated and exported in a slightly different form (but functionally equivalent and without loss of information) compared to the original file's table; for example, exported tables will always be padded to 4-byte multiples.

Note that after the export command, unlike import, the font object is no longer on the stack; you should dup the font object if you will be needing it again.

font glyph-name getmetrics AW LSB xMin yMin xMax yMax

Pushes the six TrueType metric values for the glyph named glyph-name from font onto the stack; these values are: the advance width, the left side bearing, and the bounding box coordinates. Unlike the output from the afm command, the values are given in TrueType units of 1/unitsPerEm, not PostScript units of 1/1000 em.

Since ttf_edit does not (yet) implement any stack arithmetic, the getmetrics command is useful chiefly for manually examining metric values of individual glyphs using the stack command.

font glyph-name AW setaw -

Sets the advance width of glyph glyph-name to AW units of 1/unitsPerEm (that is, in TrueType units), where AW is an integer. The left sidebearing and bounding box metrics of glyph-name are not changed.

Using the setaw command on a monospaced font results in a slightly larger font file, because the font is no longer monospaced and will require a longer table of explicit advance-width metrics.

Changing a glyph's advance width with setaw may also imply a change to the per-font maximum advance width and/or minimum right sidebearing values in the hhea table. To maintain consistency, ttf_edit automatically computes and updates the appropriate hhea maximum values, but it only does so when you create a font file with the gen command. Therefore, the hhea maximums reported by the list command may not be current if you previously changed metric values with setaw; you can ensure correctness of any list or getmetrics command output by first applying gen (even if only to a dummy file) to any font object that undergoes metric editing. These cautions only apply to the aggregate metrics (such as maximum widths over the whole font) in hhea; the individual glyph metric values reported by getmetrics always correctly reflect the cumulative effects of any editing.

Examples

The command language of ttf_edit might seem obscure until you see some examples of how things work.

ttf_edit arial.ttf font free

Loads the font arial.ttf, creating an in-memory font object, and then frees the font object. This doesn't do anything.

Wait, I lied, it does check that file arial.ttf is a valid TrueType font file, in which case you will get no messages. (An invalid file will show messages.) If you omit the free command, you will get a message that the stack was not empty when ttf_edit exited.

Please remember that slightly invalid TrueType font files are very common, and even issue forth from the biggest font foundries, operating system purveyors, and font editing applications. Although ttf_edit will inform you of things like bad checksums and incorrect offset tables, many such errors do not affect the correct operation of the font. In any case, you can use ttf_edit to clean up such troubles. See the force command below.

If an input font contains errors, ttf_edit will complain and exit. Use the force command before the font command in such situations. In the above example we would have done it thusly:

    ttf_edit force arial.ttf font free

ttf_edit arial.ttf font list

Loads font arial.ttf and lists various information about it in text form on the standard output. This is a typical first thing to do with a font file, since you thereby learn, for example, whether a Windows font uses a 3.0 (Symbol) or 3.1 (ANSI) encoding, which you will need to know for commands like afm. You can redirect the standard output of any ttf_edit run to a file to save the output:

    ttf_edit arial.ttf font list > arial.txt

or just read it on the screen, one page at a time:

    ttf_edit arial.ttf font list | more

It is always safe to look at fonts in this way; that is, the font file will not be changed in any way. The only way to write on a font file with ttf_edit is with the gen command, and that will only write new files.

ttf_edit arial.ttf font 3 1 afm > new.afm

Writes an AFM file for arial.ttf in file new.afm.

ttf_edit macfont.ttf font 1 0 afm > mac.afm
ttf_edit blatz.ttf font mac.afm 1 0 encode newblatz.ttf gen

The first command generates a Macintosh-encoding specification file mac.afm from an existing Macintosh font macfont.ttf. (We could also obtain a Macintosh-encoding AFM from the ttf_edit encoding-files kit without resorting to this trick.)

The second command inserts this mac.afm encoding as a new encoding for the 1.0 platform (the encoding accessed by Macintosh platforms) into a second font, blatz.ttf, and creates a new font file newblatz.ttf which contains the old encoding(s) and the new Macintosh encoding.

Thus if blatz.ttf was a Windows-only font, we have converted it for use on a Macintosh.

ttf_edit arial.ttf font new.afm 3 1 encode newarial.ttf dup gen font 3 1 afm > new2.afm

Loads font file arial.ttf, modifies the encoding for Windows according to file new.afm, creates a new font file newarial.ttf, loads this new font, and writes an AFM file for the re-encoded font in file new2.afm. The files new.afm and new2.afm should show the same encoding.

ttf_edit symbol.ttf font 3 0 3 1 platform symbol31.ttf gen

Loads font file symbol.ttf, changes the 3.0 encoding to a 3.1 encoding, and generates a new font file, symbol31.ttf. The new font is not reencoded; it just has its old encoding in the ANSI (or Unicode) platform, so it now appears as an ANSI font instead of a symbol font in your applications. See note below on arbitrary font re-encoding.

ttf_edit arial.ttf font 3 1 99 99 platform 1 0 3 1 platform 99 99 3 1 platform narial.ttf gen

Swaps Arial's 1.0 (Macintosh) and 3.1 (Windows) encodings, producing a modified file narial.ttf. The result would not work then on either system without some more re-encoding steps, which would just bring you back to the start.

ttf_edit arial.ttf font 3 1 afm > new.afm
[Edit new.afm to delete the line "CH <0041> ; N A ;"]
ttf_edit arial.ttf font new.afm 3 1 encode newarial.ttf gen

Makes a new font file newarial.ttf, like Arial but with the letter "A" unencoded (and therefore inaccessible).

ttf_edit myfont.ttf font myfont.cmd run myfont2.ttf gen

Loads the font myfont.ttf, applies the commands in the file myfont.cmd to the loaded font, and generates a new version of the font in file myfont2.ttf. The command file myfont.cmd might contain lines like this, for example:
    %----------------------------------------------------
    %	Change the copyright notice in a TrueType font
    %	(for both Macintosh and Microsoft platforms)
    %
    dup 1 0 0 0 (Copyright 1998 Your Name Here) rename
    dup 3 1 0x0409 0 (Copyright 1998 Your Name Here) rename
    %----------------------------------------------------

These commands would change the copyright notice in myfont2.ttf to read "Copyright 1998 Your Name Here". No other contents would have changed.

ttf_edit myfont.ttf font 3 1 0x409 4 listname

Displays the full font name of font myfont.ttf.

ttf_edit arial.ttf font semicolon getmetrics stack pop pop pop pop pop pop

Pushes the metric values for the glyph named semicolon from font arial.ttf onto the stack, displays the stack contents, and empties the stack. The output will resemble the following:
    stack[5] = integer 1062 <-- top
    stack[4] = integer 387
    stack[3] = integer -290
    stack[2] = integer 170
    stack[1] = integer 170
    stack[0] = integer 569

This indicates an advance width of 569 units, a left sidebearing of 170 units, and a bounding box having corners at coordinates (170,-290) and (387,1062) units; "units" is understood to be the TrueType unit of 1/unitsPerEm.

Contrast this output with the AFM output for semicolon from the afm command on the same font:

    CH <003b> ; N semicolon ; WX 278 ; B 83 -141 189 519 ;

Here the larger AFM units of 1/1000 em yield proportionately smaller values: an advance width of 278 units, a left sidebearing of 83 units, and a bounding box having corners at (83,-141) and (189,519).
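
The relationship between the getmetrics values and the AFM values can be checked with a few lines of Python. This is only a sketch: the unitsPerEm value of 2048 is an assumption about arial.ttf (the font states the real value in its head table), the function name is hypothetical, and the rounding that ttf_edit applies when writing AFM values is not reproduced here.

```python
# Convert TrueType units (1/unitsPerEm) to AFM units (1/1000 em).
# UNITS_PER_EM = 2048 is an assumption; the real value is in the head table.
UNITS_PER_EM = 2048


def to_afm_units(value, units_per_em=UNITS_PER_EM):
    """Scale one TrueType metric to AFM units, without rounding."""
    return value * 1000 / units_per_em


# Values from the getmetrics example above, bottom of the stack first:
# advance width, left sidebearing, then the four bounding box coordinates.
semicolon = [569, 170, 170, -290, 387, 1062]

for v in semicolon:
    print(v, "->", round(to_afm_units(v), 2))
```

Each result lies within one unit of the corresponding field in the AFM line for semicolon.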

Hints and Tips

It is helpful to understand some basic aspects of the TrueType font file standard:

TrueType font files are binary files consisting of a set of tables in various formats, preceded by a table directory listing the tables present in the file and the offsets to the start of each table.
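
As an illustration of this layout, the following Python sketch reads the table directory from the first bytes of a font file. It assumes the layout given in the TrueType specification (a 12-byte header followed by one 16-byte entry per table, all big-endian); the function name is hypothetical, and the sketch has nothing to do with how ttf_edit itself is implemented.

```python
import struct


def read_table_directory(data):
    """Return {tag: (checksum, offset, length)} from the start of a font file."""
    # Header: 4-byte version, 2-byte table count, then three 2-byte
    # binary-search hints that are skipped here.
    version, num_tables = struct.unpack(">IH", data[:6])
    tables = {}
    for i in range(num_tables):
        # Each directory entry is 16 bytes, starting right after the header.
        entry = data[12 + 16 * i: 28 + 16 * i]
        tag, checksum, offset, length = struct.unpack(">4sIII", entry)
        tables[tag.decode("latin-1")] = (checksum, offset, length)
    return tables
```

A typical use would be read_table_directory(open("arial.ttf", "rb").read()), which returns a dictionary keyed by the four-character table names described below.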

The TrueType standard describes formats for about 20 types of tables, some of which are required for a valid font, and some of which are optional. Users may also define their own table formats, and the standard is designed such that TrueType-reading software will correctly ignore or pass-through tables of unknown format or semantics.

Each table has a four-character name. TrueType 1.0 format defines the following tables as required for a valid font:

Name	Description
----	-----------
cmap	Character-to-glyph mapping.  This is where the font encoding
	tables reside.  Internally to the font, all glyphs are known
	by their index (that is, the sequence of their appearance in
	the "glyf" table); the "cmap" table maps codes used by the
	"outside world" to glyph indexes inside the font.  Unlike
	a PostScript Type 1 font, a TrueType font can contain any
	number of encodings in cmap subtables; for example, a font
	might have a 16-bit Unicode encoding for use in Windows and
	an 8-bit Macintosh encoding; each system using the font selects
	the appropriate encoding subtable.
glyf	Glyph data, giving quadratic Bezier contours for each of the
	glyphs.  Composite glyphs have a simple composition program
	based on other glyphs, instead of contours.
head	Font header.  Global information on the font, such as a
	revision number, overall bounding box, smallest readable size,
	etc.
hhea	Horizontal header.  Global information for horizontal layout,
	such as typographic line gap, maximum advance width and
	minimum sidebearings.
hmtx	Horizontal metrics.  Table of advance widths and left
	sidebearings for each glyph.
loca	Index to glyph contour location.  File offsets within the glyf
	table for each glyph index.
maxp	Maximum profile.  Quantities related to the memory requirements
	for the font: number of glyphs, maximum number of contours in
	any glyph, etc.
name    Font naming.  Lots of names for the font, giving various
	combinations of host platform, encoding, and national
	language.  This is very confusing since so many different
	conventions can apply to correctly naming a font, depending on
	the host operating system and page-description language.  Some
	platform conventions are ambiguous or contradictory, further
	adding to the confusion.  And if this were not confusing
	enough, all names in the table can be synonymously reproduced
	in an assortment of national languages (English versus French,
	etc.).  The names are all Unicode strings, to best support the
	multilingual naming.  There are also entries for copyright and
	trademark notices.
post    PostScript information.  These are the important naming and
	global-metric entries that a PostScript Type 1 font would use
	for the font.  Most important for our purposes is a table of
	PostScript character names for the glyphs in the font, which
	allows us to refer to glyphs by textual names.  We can thereby
	re-encode or display encodings using a readable encoding table
	format such as AFM, or apply set-theoretic operations to the
	various encodings.
OS/2    OS/2- and Windows-specific metrics.  Although the standard
	lists this as mandatory, it is not always present in fonts
	that Windows accepts as valid.  This contains a lot of global
	font characteristics in a form peculiar to Windows and OS/2
	(which characterize fonts in an obsolete manner due to their
	bit-mapped ancestry), such as superscript/subscript positions,
	weight (normal, bold, etc.), width (condensed, normal,
	expanded), etc.  Also contains a Panose characterization table.

	Another funny Windows behavior with respect to the OS/2 table:
	instead of testing which characters exist in a font's encoding,
        Windows uses the usFirstCharIndex and
        usLastCharIndex values in the OS/2 table as presumptively
        correct, despite the actual encoding contents.  So if you change the
        encoding, these must be changed accordingly (ttf_edit does
	this).

	The OS/2 table is really the Windows-platform-specific ad hoc
	information.  You might remember that the revision to Windows
	that became Windows NT was originally a joint project of
	Microsoft and IBM, which was known as OS/2.  When the two
	firms had a falling out, IBM retained the OS/2 name for its
	operating system product, and Microsoft renamed their
	derivative operating system to be Windows NT.  The name "OS/2"
	was so imbedded in fonts, applications, and the systems, that
	it was necessary to retain it as a table name within TrueType
	fonts.  What a crazy business, eh?

	As if that were not ironic enough, Apple has adopted the
	OS/2 TrueType table into the GX extensions, chiefly for the
	Panose classification information.

The TrueType 1.0 format also defines the following optional tables:

Name	Description
----	-----------
cvt	Control values.  These are the manifest constants for use
	by the font's hinting instructions, comparable to the
	BlueValues in a Type 1 font.
fpgm	Font program.  Similar to cvt but the manifest constants are
	global to the font, as opposed to per-character values.
hdmx	Horizontal device metrics.  Gives integer advance widths
	pre-computed for various font sizes, typically those associated
	with screen display.
kern	Kerning.  Kerning pairs and values.
LTSH	Linear threshold table, a complementary table to hdmx.
	Gives size thresholds at which advance widths should be
	scaled linearly, instead of being scaled with hinted widths.
prep    Control value program.  An initialization subroutine, called
	at each point size or linear transformation change, and before
	each glyph is rendered.
WIN	Reserved name for future use.  Apparently this was a name
	Microsoft wanted to use instead of "OS/2" when Microsoft and
	IBM parted ways over the OS/2 software.  However it was never
	adopted, probably to avoid incompatibility with old fonts and
	with font editors.
VDMX	Vertical device metrics.  Gives ascender and descender
	maximum heights when hinting is applied, since these
	dimensions can be a bit larger than the linear scaling of the
	font's global dimensions.  This is important since the
	dimensions are typically used by systems to clip characters,
	and a linear (unhinted) value could result in lost pixels.
FOCA	Reserved name for IBM Font Object Content Architecture data.
	This does not have much relevance to today's applications
	and no doubt appears as a result of IBM's early participation
	in the TrueType standard.  This was apparently an attempt to
	provide an interface to EBCDIC and other IBM-proprietary
	encodings.
PCLT	Hewlett-Packard Printer Control Language (PCL 5) information.
	This is a set of dimensions and names for the font which
	would appear when downloading the font to a PCL device.
	This is analogous to the post table minus the character
	names.

Revision 1.65 of the TrueType 1.0 Font File Specification included three new tables for embedded bitmap data:

Name	Description
----	-----------

EBLC    Embedded Bitmap Location Table.  Identifies the sizes and
	glyph ranges of the font's embedded bitmaps and contains
	offsets to the glyph bitmap data.

EBDT    Embedded Bitmap Data Table.  Stores the glyph bitmap data
	in a number of different formats, using Apple's QuickDraw GX
	`bdat' table as the format standard.

EBSC	Embedded Bitmap Scaling Table.  Specifies the bitmap point
	sizes that can be generated by scaling embedded bitmaps up or
	down.

Microsoft promulgated an extension to TrueType, the TrueType Open format, to add the following optional tables to the original TrueType 1.0 standard, in order to better support fonts for non-Latin languages like Arabic, Japanese, or Chinese (the following taken verbatim from the Microsoft TrueType Open specification):

Name	Description
----	-----------
GSUB    Contains information about glyph substitutions to handle
	single glyph substitution, one-to-many substitution (ligature
	decomposition), aesthetic alternatives, multiple glyph
	substitution (ligatures), and contextual glyph substitution.
GPOS    Contains information about X and Y positioning of glyphs to
	handle single glyph adjustment, adjustment of paired glyphs,
	cursive attachment, mark attachment, and contextual glyph
	positioning.
BASE    Contains information about baseline offsets on a
	script-by-script basis.
JSTF    Contains justification information, including whitespace and
	Kashida adjustments.
GDEF    Contains information about all individual glyphs in the font:
	type (simple glyph, ligature, or combining mark), attachment
	points (if any), and ligature caret (if a ligature glyph).

Apple promulgated an extension to TrueType, the TrueType GX format, to add a variety of optional tables to the original TrueType 1.0 standard. The GX extensions cover a hideous set of ad hoc features, with the essential purpose of implementing every typographical feature of every language known to man, as well as patching over all the flawed designs and typographical naivete of the original TrueType format (as an extreme example, there is a "rebus" table, which maps words to glyphs that are equivalent rebus pictures). There is better support for fonts of non-Latin languages like Arabic, Japanese, or Chinese (see http://support.info.apple.com/gx/GXFF/chap0.html). GX tables can also enhance Latin fonts with features like small caps, fractions, ligatures, superiors and inferiors, lowercase numerals, swash alternates, fleurons, borders, and other effects, although Windows does not provide system support for GX-based features. There is no specific requirement for a TrueType font to be a "TrueType GX" font, other than that it possess at least one of the following tables:

Name	Description
----	-----------
acnt    The accent attachment table. This table provides a
	space-efficient method of combining component glyphs into
	compound glyphs.

avar    The axis variation table. This table allows changes in the way
	variation axis values get mapped into a normalized space.

bdat    The bitmap data table. This table provides a collection of
	bitmaps for all of the bitmapped glyphs in the font.

bloc    The bitmap location table. This table provides information
	about the availability of bitmaps at specified point sizes. If
	a bitmap is included in the font, its location in the 'bdat'
	table is included.

bsln    The baseline table. This table sets the primary baseline and
	the positions of other baselines for your font.

cvar    The CVT variations table. This table contains an indexed list
	of control values for your font that can be accessed by
	instructions.

feat    The feature name table. This table allows you to include the
	font's text features, the settings for each text feature, and
	the 'name' table indexes for common (human readable) names for
	the features and settings.

fdsc    The font descriptors table. This table allows applications to
	take an existing run of text and allow the user to specify a
	new font family for that run of text. A new style that best
	preserves the font style information using the new font family
	will be created using data in this table.

fmtx    The font metrics table. This table identifies a glyph and its
	associated control points that can be used to control
	linespacing metrics.

fvar    The font variations table. This table allows you to define
	global information concerning which font variations axes are
	included in your font and the coordinates for named locations
	in the style space.

gvar    The glyph variations table. This table allows you to build
	styles into the font itself, as required to provide font
	variations.

just    The justification table. This table governs the types of
	behavior your font may exhibit when justified, such as how
	much white space is distributed between words and between
	glyphs.

lcar    The ligature caret table. This table contains information that
	allows an application to place a caret through the middle of a
	ligature, which gives the user the ability to edit the glyphs
	that make up a ligature.

mort    The glyph metamorphosis table. This table governs the
	transformations that can be applied to a font, such as
	ligature formation or Indic-style rearrangement.

opbd    The optical bounds table. This table contains the information
	about the optical edges of the glyphs in your font.

prop    The glyph properties table. This table provides information
	about certain glyph properties, such as whether the glyph can
	hang off the edge of the line and its directionality class.

trak    The tracking table. This table governs interglyph kerning
	according to point size and track number, instead of the
	identities of the pair of glyphs.

vhea    The vertical header table. This table contains information
	required for vertical fonts. This information is general to
	the font as a whole.

vmtx    The vertical metrics table. This table contains metric
	information for the vertical layout of each of the glyphs in
	the vertical font.

For Macintosh compatibility, a TrueType 1.0 font, also called a "simple font" in the Macintosh milieu, must contain the following nine basic tables (according to Apple's documentation for the Macintosh "TrueEdit" table-editing application):

    cmap	Character to index mapping 
    glyf	Glyph data 
    head	Font header 
    hhea	Horizontal header 
    hmtx	Horizontal metrics 
    loca	Index to location 
    maxp	Maximum profile 
    name	Name 
    post	PostScript 

These four additional tables (for instruction data) are optional for Macintosh compatibility:

    cvt		Control value 
    fpgm	Font program 
    hdmx	Horizontal device metrics 
    prep	Preprogram 

For most tables, ttf_edit doesn't care about what's in them and just copies them verbatim when doing a gen. So ttf_edit only modifies the table directory and cmap, name, and OS/2 tables, which are the items which change with an encoding change. All other tables and the offset table itself are copied through unmodified.

The current release supports cmap subtable formats 0 (byte encoding table) and 4 (segment mapping to delta values). These two formats cover the encodings of almost all TrueType fonts for Microsoft Windows and Apple Macintosh. Not supported are formats 2 (high-byte mapping through table, typically used for Asian typefaces) and 6 (trimmed table mapping). You will get messages from ttf_edit if you do a "font" command on a font containing an encoding table in an unsupported format, since ttf_edit will at that time attempt to interpret all the cmap subtables. If the unsupported subtables are not the one(s) you are displaying or reencoding, then you can safely ignore the messages.

When using AFM format files or fragments, ttf_edit is concerned only with the StartCharMetrics/EndCharMetrics section. When reading AFM files with the encode operator, ttf_edit will ignore all other sections as well as any "Comment" lines within the metrics section. The afm operator in ttf_edit creates a complete AFM file, including comments describing the output.

The AFM output from ttf_edit generally follows the Adobe Font Metrics File Format Specification, version 4.1, 16 Oct 1995. This document, or a later version of it, should be available on the Adobe FTP site (see below for the URL). Since TrueType fonts can be 16-bit encoded, while Type 1 base fonts can only encode 8 bits, this output is more a convenient representation of the encoding than something that could actually appear in a Type 1 base font. The AFM output shows hex codes via the CH parameter instead of the more common decimal C representation, as the hex is more recognizable vis a vis the usual encoding tables (Adobe added CH in revision 3.0 of the AFM format). If you use the AFM output with an AFM-reading program, make sure the program understands the CH format; almost all AFM files use C exclusively and not the later CH code representation, and some AFM-reading tools are no doubt non-compliant with the AFM specification in this regard.

The AFM output will list as unencoded any glyph in the TrueType post table (PostScript glyph names for the font) which is not encoded in the selected M.N encoding. The AFM format represents an unencoded character with a code of -1. For example, you might see a line like, "C -1 ; N blatz", which means the font has an unencoded glyph named "blatz" (that is, unencoded in the M.N encoding you specified; such unencoded glyphs are typically encoded in some other M.N encoding, otherwise they would be unreachable in the font). Of course, you can encode such unencoded glyphs in your own new encoding.

Encodings in TrueType fonts map character codes to glyph indexes, and so are specific to the glyph table in a specific font. The mapping from glyph indexes to PostScript names may or may not be present in the TrueType font.

It is possible that an encoded character does not have a PostScript name, either because the post table is missing altogether, or the post table does not name all of the glyphs in the font. For example, CJK fonts for Asian characters do not name all the thousands of glyphs. In such cases, ttf_edit will use an invented name for glyphs which do not have an explicit PostScript name in the font file. We call this the "fall-back" name, and it allows you to manipulate glyphs in a font which otherwise would have no name. When reading and writing AFM encoding files, ttf_edit uses the fall-back name for any unnamed glyph. This fall-back name is of the form .glyphindexXXXX, where XXXX is the hex value of the 16-bit glyph index in the font. For example, .glyphindex0014 is the fall-back PostScript name for the glyph with index 20 (hex 0x14 = 20 decimal) in the font. You may use the fall-back name in an input AFM file (such as with the encode command) to specify a glyph by its index, even if the glyph has a PostScript name. However, ttf_edit will output the fall-back name in an AFM file (such as from the afm command) only if a PostScript name is not defined in the font file's post table.
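
If you generate or post-process AFM files with a script, the fall-back naming scheme is easy to reproduce. The Python sketch below is only an illustration: the function names are hypothetical, and the use of lowercase hex digits on output is an assumption, since the example above contains no hex letters.

```python
import re


def fallback_name(glyph_index):
    """Build the fall-back name for a 16-bit glyph index."""
    if not 0 <= glyph_index <= 0xFFFF:
        raise ValueError("glyph index must fit in 16 bits")
    # Assumption: hex digits are written in lowercase.
    return ".glyphindex%04x" % glyph_index


def parse_fallback_name(name):
    """Return the glyph index in a fall-back name, or None for other names."""
    m = re.fullmatch(r"\.glyphindex([0-9A-Fa-f]{4})", name)
    return int(m.group(1), 16) if m else None


print(fallback_name(20))
print(parse_fallback_name(".glyphindex0014"))
```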

Remember that glyph indexes are simply random serial numbers for the glyphs in the font, and have no fixed relationship to the character encoding numbers. In fact, two otherwise identical fonts might have a different ordering of glyphs to represent the same encodings of the same glyphs. Fall-back names are really just a last-resort method to get a "handle" on unnamed glyphs.

The integer argument given in the AFM StartCharMetrics will only be an approximation. The value is actually the number of unique glyphs in the font, and will typically be a few counts short of the actual number of character metric lines due to glyphs that are encoded more than once. You should count the exact number of lines in the AFM fragment and modify the integer value accordingly if the fragment is to be used in a context where the proper value may be critical. Since ttf_edit ignores the StartCharMetrics argument, you can feed AFM fragments from ttf_edit back into ttf_edit without fixing them up at all. Other AFM-reading utilities may not need a correct StartCharMetrics value either; most AFM-reading tools just ignore it.
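
Should you need an exact count, a short script can do the counting for you. The following Python sketch is one way to go about it, under the assumption that every non-blank, non-Comment line between StartCharMetrics and EndCharMetrics is one character metric line; the function name is hypothetical.

```python
def fix_char_metrics_count(afm_text):
    """Rewrite the StartCharMetrics argument to the actual number of lines."""
    lines = afm_text.splitlines()
    start = next(i for i, l in enumerate(lines)
                 if l.startswith("StartCharMetrics"))
    end = next(i for i, l in enumerate(lines)
               if l.startswith("EndCharMetrics"))
    # Count the metric lines, skipping blank lines and Comment lines.
    count = sum(1 for l in lines[start + 1:end]
                if l.strip() and not l.startswith("Comment"))
    lines[start] = "StartCharMetrics %d" % count
    return "\n".join(lines) + "\n"
```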

The file name "-" (hyphen) on the command line means the standard input or standard output. You can thus redirect font file names with "<" or ">" and use them in stack commands as the name "-". However, you cannot use a UNIX pipe for a font file, since ttf_edit uses random access (fseek's) on TTF files. (Actually, piping may succeed in DOS or WIN32 console versions, since these systems simulate pipes with temporary files.)

You may specify integers in decimal, octal (leading "0"), or hexadecimal (leading "0x") formats, when giving arguments to commands such as encode.
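
If a script of your own has to read the same three notations, the rule can be sketched in Python as follows. The function name is hypothetical, and the sketch handles only unsigned values, since nothing is said here about how ttf_edit treats signs.

```python
def parse_int(text):
    """Parse decimal, octal (leading 0), or hexadecimal (leading 0x) text."""
    if text.lower().startswith("0x"):
        return int(text[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


# The same language ID written three ways.
print(parse_int("1033"), parse_int("02011"), parse_int("0x0409"))
```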

In AFM files, C fields must give decimal numbers (like 65 or -1) and CH fields must give hexadecimal numbers (like <0041>). Negative one (-1), indicating unencoded characters, can only be represented in decimal (C) format. Both C and CH can appear in AFM files from ttf_edit.
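
A script that reads AFM files from ttf_edit therefore has to accept both field types. A minimal Python sketch, assuming the code field is always the first field on the line (the function name is hypothetical):

```python
def char_code(line):
    """Return the character code from one AFM character metric line."""
    # The first semicolon-separated field holds the key and its value.
    key, value = line.split(";")[0].split()
    if key == "C":
        return int(value, 10)               # decimal, may be -1
    if key == "CH":
        return int(value.strip("<>"), 16)   # hexadecimal in angle brackets
    raise ValueError("line does not start with C or CH: %r" % line)


print(char_code("CH <003b> ; N semicolon ; WX 278 ; B 83 -141 189 519 ;"))
print(char_code("C -1 ; N blatz"))
```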

The ttf_edit program will not overwrite an existing file with the gen command. You must specify a name for an output file which does not already exist. This protects against inadvertent modification of existing fonts. If you want to modify a given font file, you must either output the modified font file to another name and rename the result, or else rename the input font file and output the modified font file to the original name.

While the ttf_edit stack language is similar to PostScript, it is a very small language which does not implement the control operators, dictionaries, etc., of PostScript; thus it is not a complete programming language. To perform complex, programmed operations, one can write a program in, say, PostScript or AWK, and have an interpreter output a "flat" script of ttf_edit commands. With the run command, you can call upon files containing scripts of ttf_edit commands.
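
As a small illustration of this approach, the Python sketch below writes a flat script of rename commands for use with run. The argument order is copied from the copyright example earlier in this guide; the function name is hypothetical, and the sketch makes no attempt to handle parentheses inside the text.

```python
def rename_script(name_id, text):
    """Return ttf_edit script lines that set one name on both platforms."""
    # (platform, encoding, language) triples as used in the rename example:
    # Macintosh 1.0 with language 0, Microsoft 3.1 with language 0x0409.
    targets = [(1, 0, "0"), (3, 1, "0x0409")]
    return "".join(
        "dup %d %d %s %d (%s) rename\n" % (p, e, lang, name_id, text)
        for p, e, lang in targets)


print(rename_script(0, "Copyright 1998 Your Name Here"), end="")
```

Redirect the output to a file such as myfont.cmd and apply it with the run command.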

The output from the list command truncates the high-order byte from the characters in Unicode name strings to (hopefully) create a single-byte-per-character ASCII string. This is only an approximation, but it will work unless names contain Unicode characters above code 255.
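
The effect of this truncation can be sketched in Python. The sketch assumes the name string is stored as 16-bit big-endian characters and is only an illustration of the idea, not the code used by ttf_edit; the function name is hypothetical.

```python
def truncate_name(utf16_be_bytes):
    """Keep only the low-order byte of each 16-bit character."""
    # Bytes at odd offsets are the low-order bytes in big-endian order.
    return utf16_be_bytes[1::2].decode("latin-1")


print(truncate_name("Arial".encode("utf-16-be")))
```

A character above code 255 loses information: U+0100, for example, comes out as a zero byte.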

Re-encoding a Windows font from a random encoding: Let us say we have a randomly-encoded symbol font, like Symbol, which we wish to re-encode as a text font. Such a font will start with a random 3.0 cmap, which we want to convert to an ANSI 3.1 cmap. There are several steps to such a re-encoding: first, extract the original (random) encoding to an AFM encoding file symbol.afm and an equivalent symbol.cod:

    ttf_edit symbol.ttf font 3 0 afm > symbol.afm
    awk -f afmtocod.awk symbol.afm > symbol.cod

Second, change the codes in symbol.afm to reflect a Windows ANSI encoding, creating a new, joined encoding in AFM file nsymbol.afm:

    joincode unicode.cod symbol.cod synonyms > nsymbol.afm

Finally, move the 3.0 encoding to 3.1, re-encoding the font, and generate the new font file:

    ttf_edit
	symbol.ttf font		% load the font object
	3 0 3 1 platform	% change the platform
	nsymbol.afm 3 1 encode	% reencode for new encoding
	nsymbol.ttf gen		% generate the new font file

Now this new font will appear to be an ANSI or Unicode font to Windows (a 3.1 versus 3.0 encoding being the crucial difference as far as Windows is concerned). Since Windows and Windows applications assume that a 3.1-encoded font is fully and correctly populated, if you don't re-encode or re-encode improperly or re-encode only partially or the font is not fully populated, the font will have different or missing characters from what Windows expects.

The following changes a symbol-encoded font (symbol.ttf) to an ANSI-encoded font, but instead of using joincode to re-encode to the ANSI platform, you simply shift all the codes down by 0xf000 by hand-editing the AFM-format encoding specification:

ttf_edit symbol.ttf font 3 0 afm > _.afm

[ Now hand-edit _.afm to change codes CH <f0??> to CH <00??> ]

ttf_edit symbol.ttf font 3 0 3 1 platform _.afm 3 1 encode new.ttf gen
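
If the hand-editing becomes tedious, the shift can be scripted. The Python sketch below assumes the AFM file gives its codes in CH format and that the symbol codes lie at 0xf000 and above; the function name is hypothetical.

```python
import re


def shift_symbol_codes(afm_text, offset=0xF000):
    """Shift every CH code at or above offset down by offset."""
    def repl(match):
        code = int(match.group(1), 16)
        if code >= offset:
            code -= offset
        return "CH <%04x>" % code
    return re.sub(r"CH\s*<([0-9A-Fa-f]+)>", repl, afm_text)


print(shift_symbol_codes("CH <f041> ; N Alpha ;"))
```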

Understanding the AFM metrics

A frequently-asked question is:

The AFM spec says "All measurements in AFM ... files are given in terms of units equal to 1/1000 of the scale factor (point size) of the font being used ...." How do we know what the scale factor of the font is?

We typically think of the scale factor as the "point size", but strictly speaking it is a dimensionless quantity. That is, a scale factor of 10 results in glyphs that are the 10-point size; the scale factor itself in this case is a dimensionless 10, not 10 points. The scale factor is, among other things, the argument to the scalefont operator you would choose if you were to render the font in PostScript at a desired size.

The unscaled units (that is, a scale factor of 1, which means the font is scaled to a tiny 1-point size) in an AFM file for the coordinates and metrics in the font are defined to be the physical value of 0.001 points (1/72000 inch). If you scale the font by 10 to get the "10-point size" then the coordinate units are 10 times 0.001 pt, namely, 0.01 pt (1/7200 inch).

If you have an "M", say, that the AFM specifies to have an advance width of 800 units, at a scale of 10 it has an advance width of 800 * 1/1000 * 10 = 8 pt, or 8/72 inch. A "point" (pt) here always means the PostScript sense of exactly 1/72 inch.
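
The arithmetic in the preceding paragraphs can be sketched in Python; the function names are hypothetical.

```python
def afm_to_points(afm_units, scale_factor):
    """Convert an AFM measurement to PostScript points at a given scale."""
    return afm_units * scale_factor / 1000


def afm_to_inches(afm_units, scale_factor):
    """Same measurement in inches, with exactly 72 points to the inch."""
    return afm_to_points(afm_units, scale_factor) / 72


# The "M" from the text: 800 units at a scale factor of 10.
print(afm_to_points(800, 10))
```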

The rules for TrueType are slightly modified in that the PostScript 1/1000 becomes 1/unitsPerEm, where unitsPerEm is the TrueType measure of granularity for coordinates and distances in the font, typically 2048 (FUnits per em), which is a value the font specifies in the head table. Thus metrics in TTF files, or reported by Windows, are proportionately scaled unitsPerEm/1000 times the metrics in AFM units, so that ttf_edit applies the inverse 1000/unitsPerEm proportion when writing the AFM values.

Understanding Messages

Warning: filename: invalid checksum; file possibly not a TrueType font

TrueType font files contain checksums and magic numbers to allow programs like ttf_edit to test whether a file is a TrueType font file and whether the file is intact. This message appears when ttf_edit detects that a font file has an incorrect checksum, indicating that the file may have been corrupted. However, ttf_edit will press on to try to read the file as a font, which will fail or cause ttf_edit to hang if you have given it a file which is not a font.

Have you inadvertently supplied the name of a file which is not actually a TrueType font?
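
For reference, the TrueType specification defines a table checksum as the sum of the table's contents taken as unsigned 32-bit big-endian words, with the table padded to a multiple of four bytes. The Python sketch below computes that sum; it is an illustration of the specification's rule, not a statement of exactly which checks ttf_edit performs, and the function name is hypothetical.

```python
import struct


def table_checksum(data):
    """Sum a table as big-endian 32-bit words, modulo 2**32."""
    # Pad with zero bytes to a multiple of four, as the specification requires.
    padded = data + b"\0" * (-len(data) % 4)
    total = 0
    for (word,) in struct.iter_unpack(">I", padded):
        total = (total + word) & 0xFFFFFFFF
    return total
```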

Warning: filename: invalid offset table (<reason>)

A TrueType font file begins with an offset table that is crucial to reading the rest of the font file. You will get several of these messages and a checksum message if you apply the font command to a file that is not actually a TrueType font.

The possible reasons are:

When you see this message, ttf_edit will exit with an error, unless you use the force command to direct it to continue.

Warning: filename does not contain TrueType 1.0 magic number

File filename has a proper TrueType header, but does not have the expected "magic number" in the "head" table. If the TrueType format were to be extended to a later version that ttf_edit does not anticipate, you would see this message. You may need a more up-to-date edition of ttf_edit.
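
The TrueType specification places a magicNumber field with the fixed value 0x5F0F3CF5 at byte offset 12 of the "head" table. Assuming that this is the field the message refers to, a check might look like the following Python sketch; the function name is hypothetical.

```python
import struct

HEAD_MAGIC = 0x5F0F3CF5  # magicNumber value from the TrueType specification


def head_magic_ok(head_table):
    """Return True if the raw "head" table bytes carry the expected magic."""
    if len(head_table) < 16:
        return False
    # Bytes 0-11 hold version, fontRevision, and checkSumAdjustment.
    (magic,) = struct.unpack(">I", head_table[12:16])
    return magic == HEAD_MAGIC
```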

Warning: on exit, stack was not empty

The program exited with items left on the stack. This may indicate that your commands were not all correctly positioned.

Warning: encoding duplicates char code N

The encoding specified in an encode command gives more than one character name for a given code.

Warning: This font contain more PostScript names (32,767) than I can reliably encode

The font contains more PostScript character names than ttf_edit can handle. Commands such as encode will not work completely.

ttf_edit: exiting due to error

A command failed, so ttf_edit stopped executing the stack.

open_file: file name "-" must have mode "r" or "w"

You used the standard input for a file name where not allowed.

open_file: writing to existing file filename not allowed

You attempted to write to a file, such as with the gen command, that already exists.

Cannot open "filename" in mode "mode"

The file could not be found or you do not have permission to read or write it.

Cannot seek to loc N!

The file was too short, or otherwise corrupted.

Cannot cmap character code N!

The TrueType font probably has a corrupted encoding.

No "name" table in font "filename"!

The table does not exist in the font file.

No table number N in font "filename"!

The table having index N is missing from the font file.

Cannot find encoding M.N in cmap table

The encoding you specified (such as in an afm command) is not present in the font.

Cannot handle cmap subtable format N

The font file contains an encoding in a format which ttf_edit does not support.

build_encoding: N char(s) in "encoding-file" do not appear in the font

The encoding file you supplied contains N characters which have names which are not in the font's post (PostScript names) table.

do_post: warning: empty 3.0 "post" table, fallback names will be used

The "post" (PostScript glyph names) table in the font was deliberately made empty by the font creator; ttf_edit will use fallback names instead for any operations requiring glyph names.

warning: the "post" table is empty or in an unsupported format
warning: the "post" table is missing or in an unsupported format

To operate on a font, ttf_edit must know the number of glyphs in the font. This number is specified redundantly in a font by the maxp table and (by implication) in the post table (since each glyph must have a name). If these two enumerations disagree, or if the post enumeration is missing or unusable, then ttf_edit will have trouble manipulating the font with any operators that use glyph names.

do_post: warning: post format 2.5 not supported

The "post" table uses a format not supported by ttf_edit, namely a subset of the Macintosh set of 258 glyphs.

Warning: No PostScript font name for AFM output

The TrueType font does not contain a PostScript font name, so the name will be missing from the AFM file output, which ttf_edit will nevertheless attempt to complete.

Warning: ignoring kern format [number]

The kerning subtable(s) in the TrueType font are in a format other than the most common Format 0 (used by Windows and OS/2). Only Format 0 subtables are used by ttf_edit.

fontfile: warning: numGlyphs is 0

The font contains no glyphs.

fontfile: unknown hhea metricDataFormat: num

The horizontal header table specifies a metric data format number num, which is not defined in the TrueType specification on which ttf_edit is based.

fontfile: no glyph in this font has a contour!

According to the TrueType specification, only glyphs with contours should be considered when ttf_edit finds the maximum and minimum metric values for a font. If the font contains no glyph with a contour, then ttf_edit cannot properly compute metric values for the gen command.

read_glyph: glyph num: warning: composite glyph(s) ignored

The current version of ttf_edit does not edit glyph shapes, although it does load and interpret them, unless they are composite glyphs, in which case the shapes are ignored. Since ttf_edit does not (yet) do anything with the shape information, this message is of no consequence. (Encoding and metric data for composite glyphs are completely handled.)

Priority Wishlist

AFM-to-TrueType metric conversion in afm should use proper rounding for negative values.

A fallback name may specify a glyph index in an encoding given to encode. The encode semantics do not check that the indexed glyph actually exists in the file; that is, the glyph index may exceed the number of glyphs present in the font.

Does encode require a post table, when fallback names would be sufficient?

A way to edit the post table; that is, a way to replace a given post name for a glyph with another name. Since the old name might be implied from the Mac glyph ordering, and the new one not, this requires extensibility and convertibility in the post data structure.

A trick M.N value for afm that reports glyph indexes vs post names.

Example of using rename

[All other current priority items have been implemented.]

Long-term Wishlist

Binary editing commands. These would work on tables which are otherwise uneditable, but still loaded and stored by ttf_edit, and which are simply vectors of various values, like the OS/2 table. The commands would allow you to read and write table values, in all the varieties of TTF primitive data types, at a given offset into a table. Logical primitives (or, and, not, etc.) and arithmetic should also be available. Enhancing the PostScript-like interpreter with /def abilities would allow use of symbolic offsets. One might even envision a general binary file editing tool which subsumed ttf_edit, which would use a file format specification language to understand the syntactical format of binary files and methods for altering them.

Provide a way to set the OS/2 table's (Version 1) ulCodePageRange1 and ulCodePageRange2 bit vectors. Perhaps even a subset-matching method that presumes a valid Unicode encoding and deduces the presence of populated code pages from it (which would require an enormous and probably difficult-to-construct set of glyph lists associated with each code page).

Document (check existing text above first) the conventions for encodings, viz., platform 3.1 implies codes 0x20-0xfe and 0x20XX, 3.0 implies 0xf020-0xf0fe, etc. Explain how this affects the appearance of the font in the Win95 interface, e.g., only 3.1 with 0x20 fonts show up in the window properties selections; and how this affects the magic mapping behavior in 8-bit applications. See TrueType spec re the OS/2 table.

Check that with Corel Draw we can have a complete font editing application. Note apparent Corel bugs like invalid checksums. Document Corel behavior on various encodings and platform combinations. Explain how Corel is a pretty lame all-around font tool, since it only edits one glyph at a time, and is unable to do kerning or hinting; but might make a workable poor-man's font editor with ttf_edit.

Documentation example on how to re-encode in a two-pass fashion, using only absolute Unicode values, by exporting the existing encoding in an AFM file, using an awk script, and importing the AFM encoding.

A way to combine glyphs from two or more fonts into a new font; and a way to delete glyphs altogether from a font (not just remove them from encodings).

If we recognized strings versus commands (such as with the PostScript syntax of parenthesized strings) we could avoid the problem that no file name can have a name which is one of our commands.

Nice item would be an encoding "correlator" which would determine the closest encoding based on its matching contents, considering synonyms. How to configure the "table of encoding tables"? How to make it fuzzy so that set-matching is stable and forgiving?

A way to edit glyphs and hinting, perhaps checking them out of a file and using an accessory program (GUI) as a simple graphic editor. Would require defining a text format for representing the TrueType geometric language.

An importer for TeX PK files which can be inserted as embedded bitmap (EBLC/EBDT/EBSC) tables. This would make TrueTeX compatible with PK fonts, while solving the device-independence and soft font problems in doing so. But does Windows support bitmapped TrueType fonts?

A way to specify which format a new encoding takes.

More stack operators.

Should map the code pages and Unicode pages present into the bit vectors in the OS/2 tables, both versions 0 and 1.

Needs to show new OS/2 info in list.

A way to delete glyphs from the font, making the font smaller. This implies quite a bit of work, since the encoding and post tables must all be commensurately updated.

The ability to delete old names or insert new names in the "name" table via the rename command, instead of just changing existing values. Also the ability to specify any Unicode character in a Unicode-platform name.

A better file parser, without the limits on line length, or the restriction on strings crossing input lines.

A way to redirect error messages (stderr) into a file on non-Unix platforms (does Win95/98/NT shell now allow this like Unix always has?).

A way to read the binary fork from Mac font resource files directly.

A discussion of all the differences between a "Macintosh" versus "Windows" font. See the "refont" and "TT converter" shareware that allegedly convert between them.

OS/2 table twiddling.

Upgrade the import command to allow insertion of new tables, and to re-elaborate tables such as cmap, removing the caveat in the import command description. The best way to do this would be to reorganize the do_font() function as a sequence of do_import()'s on the elaborated tables.

Acknowledgments

Thanks to 290 testers as of June 19, 1996:

a8507122, administrator, Royden Akerley, Alaa, Ian Alexander, Amazon, Heresh Ariai, arniemon, Arturo, David Asaph, Simon Barber, Ted Barrett, baynes, Joe Beda, Nelson H. F. Beebe, Vladimir Benko, Michael Benson, Murray Bent, David L. Bergart, Neil Beshoori, J. Albert Bickford, Sandro Boege, Boris, Kaspar Brand, Wolfgang Brandes, Bjorn Brox, bruno, Bob Buckland, Bill Burton, Douglas Busch, Dave Cantelon, Lee Carter, Albert Chang, Len Charlap, Chi-Yang Cheng, chetty, Ruhul Chowdhury, Curtis Clark, Carl Clarke, Bernard F. Collins, Elio Corbolante, Mary E. Cosaboom, Francesco Cosentino, cOSmO, Olinsky, Craig, Larry Craven, Martin Crosley, Play With Daemon, Greg Dale, david gibbins, Simon Daykin, Stefan Decuypere, stewart dejournett, Dr. Dimitris Dellis, Printed Circuit Board Factory - Engineering Departement, Detwiler_D, Luc Devroye, Marc van Dijk, Hans Dinsen-Hansen, Hans Doering, Brian J. Doyle, Mariusz Drewniak, duong, Karen Dupre, Miguel Duran, duruz, E-Signature, Alan Edwards, Efrem, Henry Ekweani, Emilio, Eureka, EuroFONT, Ste Eurotour, Karl F. Everitt, faculty, Philip J. Ferguson, fidel, FRINK, Alexander Frink, Joe Fugate, Christopher Fynn, Dennis D. Gaskill, Peter J. Gentry, John M. Gibby, Michael Goldberg, Val Golding, Bill Good, Avrum Goodblatt, Global Graphics, Carlos Roberto Guilherme, Lucans, Gunars, Jeff Hamilton, Jon Hanauer, Pierre HANSER, David Harmanec, Eric Hedman, Gregor Heinrich, Henry, Ahmed Hindawi, Richard Hodges, Roy Hong, Charles Hughes, Chen Hung-Yih, Andrew Hunt, Lonn Hunter, Neil Hunter, Erik Ingenito, TMA Japan, Jason, Byrial Jensen, Xiaofei (Geoffrey) Jiang, jun.gu, k3079e3, Ove Kaaven, Kedar Karmarkar, Bernhard Kaulfuss, Akihiro KAYAMA, Konstantin Kazarnovsky, Damon Kelly, ZHOU KEMING, Ian Kemmish, Kerim, John Kiapecos, Kkirou, klamp, Jeff Klassen, Sergei M. Komarov, Peter Ulrich Kopper, korn, Denis Koudrjavtsev, Rob Kramer, Mikhail Krasnov, Paul Kravchenko, Sergej Kravchuk, Marcin Krolak, kryon, Henry Kubacki, Tomasz Jan Kudrewicz, Douglas de Lacey, Dr DR de Lacey, Chung-Pang Lai, Rafi Latowicz, Antoine Leca, Franz Lehner, Werner Lemberg, Seaborg Leo, Gen S. Lin, Joerg Lindner, Wolfgang Lipp, Dr. Liping Liu, N. R. Liwal, Gunars Lucans, M.EFE, Sairan M.Kikkarin, Jakub Marchwicki, Marilene, marin, MediaPlan Marke_Robinson, Allen Marshall, Sam Mathew, matt, Mike Meagher, mikunis, Nicholas Milas, Tiro Technical / Wm Ross Mills, misawa, Pavel Janik ml., Alladdin Mohsin, Michael Morris, Micheal D. Morrow, multi-presses, Philippe Mussi, Heiko Möller, naskrent, Gwidon S. Naskrent, Mark Nelson, Matthias Neumann, Panyrak Ngamsritragul, Sung Nho, Alex Nicolaou, Joe O'Dowd, Ronald Ogawa, Myongho Oh, osmik, Vijay K. Patel, Jonathan Paterson, Von Marsoner Paul, Pawelek, pdaliu, Jonathon Pearce, Jeff Pek, Laurence Penney, Ramón Contreras Peñalver, Pedro A Pereira, Thomas Pfohe, Frank Pichardo, Paul Pietquin, John Pigott, Bogusz Piliczewski, PKiel57, pmb, Sergio Pokrovskij, vadim polonichko, paul n. price, Andreas Prilop, profirst, Medusa project, Fyodor A. Prokhorov, Tomek Przechlewski, psiepen, Andy Putnins, Andrea Quitt, Dr. R.Kalyanakrishnan., Alexander Razumov, Art Du Rea, Rudi Reichmann, Juergen Richter, Rickey456, Benjamin Riefenstahl, Robert.Wilhelm, Robomark, Lindsay Rollo, Chuck Rowe, Matt Rowe, Roberts Rozis, Deyan Rudev, Mike Russell, Christopher Russo, Andrey Rzhetsky, Stefan Salbach, Zdenek Salvet, Maciej Samsel, Hilmar Schlegel, Arnd Schmidt, scribble, Dmitry Smirnov - SUN/CIS Novosibirsk SE, Shrinath Shanbhag, Wei Sheng, Mortaza Shiran, Bo Sibbmark, Παναγιώτης Σιδηρόπουλος, Christoph Singer, skastholm, Shauni So, Wlad T. Sobol, songtan, Martin Stein, Jon Stenerson, Jerome Stern, Ron Stewart, Kevin Strietzel, Mindaugas Strockis, Phil Sturgeon, Kuo-Chun Su, Mail Delivery Subsystem, hannes sulzenbacher, João Carlos Teixeira, Judie Thalacker, Klaus Thaler, Han The Thanh, Don Thorpe, tmptmp, Mike Todd, Tom, Bo-Ming Tong, TPSham, David Truax, Larry Tseng, Adam Twardoch, Tiro TypeWorks, Martin Villwock, vsajko, Norman Walsh, Eric Wang, Peter K. Ward, Trish Ward, Charles Whalen, Glen Wilcox, Karen J Wilgenhof, willadams, Christian Wittern, Frederick J. Wysocki, xingyou, Daniel Yacob, Itzchak Yosef, Yummy, Amr Zaki, Mike Zemina, Mike Zmuda

History

[First beta release]
28 May 1996:
Descriptions in list give language names instead of hex codes. Fixed error where missing post table was called missing name table. Added warning message when stack is not empty at program exit. Added error message for afm when selected cmap table is missing. Added platform command. Bad integer arguments now stop execution rather than just complain.
28 May 1996:
Added OS/2 table export so encode sets us{First,Last}CharIndex.
10 June 1996:
Added validation to glyph index lookup in format 4 encodings, so that defective font tables do not cause a stray pointer reference, and to encode 0xffff in any case to missingGlyph. Fixed list to display Apple Unicode platform names as truncated ASCII.
11 June 1996:
Inserts a new encoding id (cmap subtable) when encode calls for it. Checks for sanity on input files. Fixed output tables to align on 4-byte multiples, with zero bytes padding between. Checksums now checked with font, computed and written with gen. Handles cmap format 6 (sometimes appears on Macs).
13 June 1996:
Added M.N to encoding version-unknown message.
[Second beta release]
18 June 1996:
Fixed bad/stray checksum in gen. Added additional input-file sanity-checking and the force command.
[Third beta release]
21 June 1996:
Added fall-back PostScript glyph naming convention to provide a handle for un-named glyphs. Corrected OS/2 table output in gen when input contained a version 1 format OS/2 table.
25 June 1996:
Fixed format 6 cmap subtable generation. When doing gen, writes correct offset table values even if input font was bad (that is, used force).
21 Oct 1996:
Added run, listname, and rename commands.
8 Nov 1996:
Added import and export commands.
25 Nov 1996:
[Third beta release]
25 Apr 1997:
Changed import to leave the resulting font on the stack. (Although dup'ing the object before importing would have worked.)
7 Jan 1998:
Now reads (read-only) the maxp table to determine numGlyphs, instead of taking the quantity from the post table, allowing manipulation of fonts without post tables (using fallback names for glyph names). The maxp information appears in the list command output.
10 Jun 1998:
Per-font metrics, names, and kerning table(s) output with afm. (Cross-stream, override, minimum, and reserved1 coverage not handled.)
12 Jun 1998:
list shows OS/2 table values.
02 Sep 1998:
HTML-ized documentation.
15 Sep 1998 (Version 0.9):
Added getmetrics command.
Conversion of TrueType to PostScript metrics in AFM output was possibly off by one PostScript unit when rounding very close to 1/2 unit; this was corrected.
16 Oct 1998 (Version 0.91):
Corrected metric values, which were not properly scaled to unitsPerEm from the head table; the value of 1024 used in CONV should have been unitsPerEm, typically 2048, typically resulting in metrics twice as large as they should have been. Apparently earlier versions of many fonts used with ttf_edit used a unitsPerEm of 1024, so that the hard-wired value of 1024 inadvertently produced correct results in the tests.

References and Links

The official Web page for ttf_edit is http://truetex.com. Support and updates are available there.

Microsoft on typography: http://www.microsoft.com/typography

ftp://ftp.microsoft.com/developr/drg/TrueType contains useful TrueType font development tools. The very useful TTFDUMP program seems to have disappeared, however.

The TrueType Specification Version 1.0 is a difficult but important document for serious TrueType font development. It will help you understand the file organization, which we have tried to introduce above. The specification is available on the MSDN disks and on the Microsoft FTP site above.

As we have followed a simplified PostScript syntax, the PostScript language references themselves are helpful. The authoritative versions from Adobe are only available as printed books.

ftp://ftp.ifcss.org/pub/software/fonts/unicode/ms-win/uwpstj.exe etc. for some big fonts.

Adobe's papers on font standards: ftp://ftp.adobe.com/pub/adobe/devrelations/devtechnotes/pdffiles