All of the qwertz document "styles", except bibliographies,
are defined in a single SGML document type
definition (DTD), called
qwertz. It is essentially an SGML reconstruction of Lamport's LaTeX
[Lamport86]. We have not attempted to include every feature
of LaTeX in this DTD, but have included the features we use
regularly. Others may of course find that something they deem
important is missing. We welcome suggestions for improvements or
extensions.
We will be making use of several parameter entities in this DTD:
<!entity % emph
" em | it | bf | sf | sl | tt " >
<!entity % xref
" label | ref | pageref | cite | ncite " >
<!entity % inline
" (#pcdata | f | x | %emph; | sq | %xref)* " >
<!entity % list
" list | itemize | enum | descrip " >
<!entity % par
" %list; | comment | lq " >
<!entity % mathpar " dm | eq " >
<!entity % thrm
" def | prop | lemma | coroll | proof | theorem " >
<!entity % litprog " code | verb " >
<!entity % sectpar
" %par; | figure | tabular | table | %mathpar; |
%thrm; | %litprog; ">
These are just macros used in the definitions of various elements,
to avoid retyping and to ease maintenance. The emph parameter
lists the various kinds of emphasis. The inline parameter is for
the elements which may be used anywhere within the document. The
list parameter is for various kinds of lists. The par parameter lists
several basic kinds of elements at the level of paragraphs. The
mathpar parameter includes the elements for displayed
mathematical formulas. The thrm parameter is for the set of
elements used to represent such things as definitions, theorems and
proofs. The litprog parameter is for literate programming
elements. Finally, the sectpar parameter lists the elements
which may occur at the level of paragraphs within sections (or
chapters). Notice that this parameter uses other parameters.
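As a small illustration of how these macros nest, the following sketch shows what the par parameter amounts to once the parser has substituted its reference to list. The second form of par is written out by hand for comparison only; it is not a declaration from the DTD.

```sgml
<!-- As declared above: par refers to list -->
<!entity % list " list | itemize | enum | descrip " >
<!entity % par  " %list; | comment | lq " >

<!-- What par stands for once the reference to list has been
     substituted (for illustration only, not part of the DTD): -->
<!entity % par  " list | itemize | enum | descrip | comment | lq " >
```

Changing the list parameter therefore changes every parameter and element declaration that refers to it.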
Several kinds of documents may be written using LaTeX: articles,
reports, books, letters and slide (or transparency) presentations. The qwertz DTD
includes two others as well: notes, for documents such as notes to yourself which do not
require a title, sections, footnotes and the like; and manpage, for Unix manual pages.
<!element qwertz o o
(sect | chapt | article | report |
book | letter | telefax | slides | notes | manpage ) >
Notice that sections (sect) and chapters (chapt) may
also be processed separately, before being put together into an
article, report or book.
LaTeX also includes BibTeX, a program for creating
bibliographies whose entries can be easily cited in LaTeX documents.
The qwertz document type for this purpose is described in Chapter 5.
This section describes the SGML entities and elements available in
all qwertz documents.
<!entity % general system -- general purpose characters -- > %general;
Most characters are created just by typing the character wanted on the keyboard. This simple method does not suffice when the character wanted isn't in the character set available, or at least is not associated with a key on the keyboard, or when the character currently has special meaning to SGML or, perhaps, TeX. In this section, a fairly large number of general purpose character entities will be presented. Symbols and characters which may be used only in mathematical formulas will be discussed separately, in section math.
When may it be necessary to use an entity reference to produce some character? There are three cases to watch out for:
Although the SGML standard allows alternative concrete syntaxes to
be defined, we use the so-called reference concrete syntax in
the qwertz document types. In this reference syntax, < is
the start tag open character, and </ is the end
tag open delimiter. The other SGML delimiter authors should be
aware of is &, the entity reference open delimiter of the
reference syntax.
The appropriate entity to use to generate these characters depends
on the context. Normally, use lt to represent < and
amp to get &, when these appear in strings which might
otherwise be interpreted as starting tags or entity references.
However, within the code or verb elements for literate
programming, described in section litprog, use the
ero entity to represent & and the etago entity for
the sequence </.
<!entity lt sdata "<" >
<!entity amp sdata "&" >
<!entity ero sdata "&" >
<!entity etago sdata "</" >
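To make the two contexts concrete, here is a hedged sketch; the sentence and the C fragment are invented for illustration. In ordinary text, the strings AT&T and <b would be read as an entity reference and a start tag, so amp and lt are used. Inside verb, the C expression &x; would look like an entity reference and the string </p> like an end tag, so ero and etago are used instead.

```sgml
<p>
The AT&amp;T compiler accepts the test a &lt;b without complaint.
<verb>
p = &ero;x;
puts("&etago;p>");
</verb>
</p>
```

The start tag open character alone needs no entity inside verb; only the sequence </ does.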
In SGML document types short reference maps may be defined which allow single characters to be interpreted as arbitrarily complex sequences of characters, including SGML tags and entity references. Thus, to know precisely when a certain character will be interpreted literally or as a short reference (i.e. macro) for something else, one has to know which map is in effect in the context of the current element. Just about all punctuation characters which are not used as delimiters in the concrete syntax can be used as short reference delimiters:
" # % ' ( ) * + , - : ; = @ [ ] ^ _ { | } ~
For each of these characters, there is an SGML entity which may be used to generate the ASCII character in the printed document, listed in table GPC. Usually, it will not be necessary to use these entities; the character can simply be typed and will be interpreted literally. However, if the results are not as expected, check to see if there is a map in effect at that point in the document in which the character has been redefined. As maps are associated with elements, the section in this manual describing an element will also direct you to a description of the applicable map, if there is one.
As it turns out, one important use of character maps is to generate
exactly the character typed in the printed document. That is, the map
is used to hide the special meaning of the character to the underlying
formatter (e.g. TeX), replacing the character with the formatting
instructions for generating the character. This has been the main use
of maps in our qwertz document type definitions.
<!entity dquot sdata '"' >
<!entity num sdata "#" >
<!entity percnt sdata "%" >
<!entity quot sdata "'" >
<!entity lpar sdata "(" >
<!entity rpar sdata ")" >
<!entity ast sdata "*" >
<!entity plus sdata "+" >
<!entity comma sdata "," >
<!entity hyphen sdata "‐" >
<!entity colon sdata ":" >
<!entity semi sdata ";" >
<!entity equals sdata "=" >
<!entity commat sdata "@" >
<!entity lsqb sdata "[" >
<!entity rsqb sdata "]" >
<!entity circ sdata "ˆ" >
<!entity lowbar sdata "_" >
<!entity lcub sdata "{" >
<!entity verbar sdata "|" >
<!entity rcub sdata "}" >
<!entity tilde sdata "~" >
Ideally, it should be possible to hide the conventions of the
underlying formatting system completely. In fact, SGML parsers which
implement the full ISO standard have a feature which makes this
possible. However, the SGML parser we are using does not include this
feature: the only characters which can serve as short references are
the characters allowed for this purpose by the reference concrete
syntax. Unfortunately, this reference syntax does not allow &,
$ and \ to be used as short references, which are all
special TeX characters. Thus, the entities for these three
characters (amp, dollar and bsol) must usually be used
to produce them. (The $ and \ characters may be used
directly within the verb and code elements, discussed
below in section litprog. Also, within these elements use
the ero entity to represent & in strings which might
otherwise be interpreted as entity references.)
<!entity bsol sdata "\" >
<!entity dollar sdata "$" >
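For instance, a sentence mentioning a price and a TeX command (both invented for this sketch) would be typed as follows in ordinary paragraph text:

```sgml
<p>
The book costs &dollar;20, and the TeX command &bsol;alpha prints a Greek letter.
</p>
```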
The meaning of the ordinary space character is context sensitive.
Sometimes there is a space within a single word. Such spaces
can be typed using the nonbreakable space (nbsp) entity
to avoid breaking the word at that point at the end of a line. There
are also contexts where one wants a certain amount of space to appear,
without it being regarded by the formatter as space which may be
shrunk in order to clean up the arrangement of words or characters on
the line. There are three entities for this purpose: emsp
denotes the amount of horizontal space required for the character "M".
An ensp is just half as wide as an emsp, and a thin
space (thinsp) is 1/6 of an emsp. Notice that
these are relative amounts, depending on the font being used.
There are also three different kinds of dashes: hyphen, which
was already mentioned above, is to be used for intra-word dashes, as
in the word "intra-word".
(The hyphen entity
was not actually necessary here, however, as the - character was not being used
in this context as a short reference.)
The ndash entity is to be
used for number ranges, such as "23–56", and mdash is an
alternative delimiter for parenthetical comments — certainly
you've seen them used this way — perhaps to avoid too frequent
use of commas or parentheses.
<!entity nbsp sdata "~" > <!entity emsp sdata " " > <!entity ensp sdata " " > <!entity thinsp sdata " " > <!entity mdash sdata "—" > <!entity ndash sdata "–" > <!entity hellip sdata "…" >
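A short made-up sentence shows these entities in use. It is only a sketch and assumes ordinary paragraph text; hellip, declared above alongside the others, produces an ellipsis.

```sgml
<p>
See pages 23&ndash;56 of volume&nbsp;2. The proof&mdash;as one might
expect&mdash;is left to the reader&hellip;
</p>
```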
There is a large set of entities for other Western European languages. Altogether, there are entities for almost all of the foreign language characters in ISO 8859, the Latin 1 character set for Western European languages. Only the four Icelandic characters are missing. Conveniently, these entities are all available in the usual Adobe PostScript fonts, as well as in TeX. Thus, all of the entities defined here can be printed in TeX, on PostScript printers, or displayed on any Latin 1 device. Depending on the computer and editor, it may also be possible to type these Latin 1 characters directly, instead of having to use these entities. A simple filter could translate Latin 1 files into ASCII files, replacing non-ASCII characters by entity references. The entity names chosen here for these characters conform to the SGML standard.
<!entity aacute sdata 'á' >
<!entity Aacute sdata 'Á' >
<!entity acirc sdata 'â' >
<!entity Acirc sdata 'Â' >
<!entity agrave sdata 'à' >
<!entity Agrave sdata 'À' >
<!entity aring sdata 'å' >
<!entity atilde sdata 'ã' >
<!entity Atilde sdata 'Ã' >
<!entity auml sdata 'ä' >
<!entity Auml sdata 'Ä' >
<!entity aelig sdata 'æ' >
<!entity AElig sdata 'Æ' >
<!entity ccedil sdata 'ç' >
<!entity Ccedil sdata 'Ç' >
<!entity eacute sdata 'é' >
<!entity Eacute sdata 'É' >
<!entity ecirc sdata 'ê' >
<!entity egrave sdata 'è' >
<!entity Egrave sdata 'È' >
<!entity euml sdata 'ë' >
<!entity Euml sdata 'Ë' >
<!entity iacute sdata 'í' >
<!entity Iacute sdata 'Í' >
<!entity icirc sdata 'î' >
<!entity Icirc sdata 'Î' >
<!entity igrave sdata 'ì' >
<!entity Igrave sdata 'Ì' >
<!entity iuml sdata 'ï' >
<!entity Iuml sdata 'Ï' >
<!entity ntilde sdata 'ñ' >
<!entity Ntilde sdata 'Ñ' >
<!entity oacute sdata 'ó' >
<!entity Oacute sdata 'Ó' >
<!entity ocirc sdata 'ô' >
<!entity Ocirc sdata 'Ô' >
<!entity ograve sdata 'ò' >
<!entity Ograve sdata 'Ò' >
<!entity oslash sdata 'ø' >
<!entity Oslash sdata 'Ø' >
<!entity otilde sdata 'õ' >
<!entity ouml sdata 'ö' >
<!entity Ouml sdata 'Ö' >
<!entity szlig sdata 'ß' >
<!entity uacute sdata 'ú' >
<!entity Uacute sdata 'Ú' >
<!entity ucirc sdata 'û' >
<!entity ugrave sdata 'ù' >
<!entity Ugrave sdata 'Ù' >
<!entity uuml sdata 'ü' >
<!entity Uuml sdata 'Ü' >
<!entity yacute sdata 'ý' >
<!entity Yacute sdata 'Ý' >
<!entity yuml sdata 'ÿ' >
The qwertz document types were developed in a German research
center, so we have included entities for the German characters with
shorter names than the entity names used in the SGML standard. Notice
that these are just synonyms for the standard entities, which are also
included.
<!entity Ae 'Ä' >
<!entity ae 'ä' >
<!entity Oe 'Ö' >
<!entity oe 'ö' >
<!entity Ue 'Ü' >
<!entity ue 'ü' >
<!entity sz 'ß' >
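As a sketch of the difference in typing effort, the same three German words (chosen here only as examples) are written first with the standard names and then with the shorter synonyms:

```sgml
<p>
Stra&szlig;e, M&uuml;nchen and K&ouml;ln are typed with the standard names.
Stra&sz;e, M&ue;nchen and K&oe;ln are the same words typed with the synonyms.
</p>
```

Both forms should produce identical output, since the short names are declared with the same characters.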
Finally, there are entities for a few miscellaneous symbols, such as §, ¶, (c), ¬, ÷, ±, ×, and μ. All of these entities name symbols in the Latin 1 character set. They may be used anywhere within a document. (In particular, the mathematical symbols shown here need not be within one of the formula elements described below, in section math.) The entity names for these, and all the other character entities discussed above, are listed in table GPC. A document which does not include mathematical formulas or graphics and which uses only the character entities defined in this chapter can be displayed or printed using a single Latin 1 font.
<!entity gt sdata ">" >
<!entity sect sdata "§">
<!entity para sdata "¶">
<!entity copy sdata "(c)">
<!entity iexcl sdata "¡" >
<!entity iquest sdata "¿" >
<!entity cent sdata "¢" >
<!entity pound sdata "£" >
<!entity not sdata "¬" >
<!entity divide sdata "÷" >
<!entity plusmn sdata "±" >
<!entity times sdata "×" >
<!entity mu sdata "μ" >
AElig  Æ       Aacute Á       Acirc  Â       Ae     Ä
Agrave À       Atilde Ã       Auml   Ä       Ccedil Ç
Eacute É       Egrave È       Euml   Ë       Iacute Í
Icirc  Î       Igrave Ì       Iuml   Ï       Ntilde Ñ
Oacute Ó       Ocirc  Ô       Oe     Ö       Ograve Ò
Oslash Ø       Ouml   Ö       Uacute Ú       Ue     Ü
Ugrave Ù       Uuml   Ü       Yacute Ý       aacute á
acirc  â       ae     ä       aelig  æ       agrave à
amp    &       aring  å       ast    *       atilde ã
auml   ä       bsol   \       ccedil ç       cent   ¢
circ   ˆ       colon  :       comma  ,       commat @
copy   (c)     divide ÷       dollar $       dquot  "
eacute é       ecirc  ê       egrave è       emsp   (space)
ensp   (space) equals =       euml   ë       gt     >
hellip …       hyphen ‐       iacute í       icirc  î
iexcl  ¡       igrave ì       iquest ¿       iuml   ï
lcub   {       lowbar _       lpar   (       lsqb   [
lt     <       mdash  —       mu     μ       nbsp   (space)
ndash  –       not    ¬       ntilde ñ       num    #
oacute ó       ocirc  ô       oe     ö       ograve ò
oslash ø       otilde õ       ouml   ö       para   ¶
percnt %       plus   +       plusmn ±       pound  £
quot   '       rcub   }       rpar   )       rsqb   ]
sect   §       semi   ;       sz     ß       szlig  ß
thinsp (space) tilde  ~       times  ×       uacute ú
ucirc  û       ue     ü       ugrave ù       uuml   ü
verbar |       yacute ý       yuml   ÿ
Table GPC: General Purpose Characters
Sentences need not be marked up with tags. There is no
sentence element as such. Rather, these are marked implicitly
using the usual conventions for beginning and ending sentences.
Paragraphs are delimited with the p tag. Both the starting tag
and ending tag are optional.
<!element p o o ( %inline | %sectpar )+ >
<!entity ptag '<p>' >
<!entity psplit '</p><p>' >
<!shortref pmap
"&#RS;B" null
"&#RS;B&#RE;" psplit
"&#RS;&#RE;" psplit
'"' qtag
"[" ftag
"~" nbsp
"_" lowbar
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar >
<!usemap pmap p>
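The effect of the psplit entries in this map can be sketched with two made-up paragraphs. Assuming the map behaves as declared, the empty line below is replaced by the psplit entity, so no tag has to be typed between the paragraphs.

```sgml
<p>
This is the first paragraph.

This is the second paragraph.
</p>
```

With the tags written out, the same text would read <p>This is the first paragraph.</p><p>This is the second paragraph.</p>.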
Sentences or phrases within paragraphs can be emphasized in a number
of ways. The em tag is used to choose the default form of
emphasis, which is usually italic type, but depends on the
style of the background text. If the background text is formatted in
italic type, as it usually is in definitions, for example, then
emphasized text will be formatted using a plain, roman typeface.
However, various forms of emphasis can be explicitly chosen. These
include: bold face (bf), italics (it),
sans serif (sf), slanted (sl), and
typewriter (tt) styles.
<!element em - - (%inline)>
<!element bf - - (%inline)>
<!element it - - (%inline)>
<!element sf - - (%inline)>
<!element sl - - (%inline)>
<!element tt - - (%inline)>
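A made-up paragraph illustrates how these tags are typed. Since both the start tag and the end tag are required, each element is closed explicitly; the short end tag </> used elsewhere in this manual is shown for the last one.

```sgml
<p>
This result is <em>important</em>, and the warning is <bf>very important</bf>.
The file <tt>paper.sgml</tt> is named in typewriter type, and this
phrase is <sl>slanted</>.
</p>
```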
The tt element simulates a "typewriter". That is, with a
couple of exceptions, characters are printed exactly as they appear on
the display. This is useful for including small segments of computer
code within paragraphs. See the section on literate programming, litprog, for
more information.
Sentences within paragraphs can be quoted using the short
quote (sq) tag, as in <sq>The rain in Spain falls
mainly on the plain.</>, but this is usually not necessary. In
most contexts where one will want to use quotations, there is a map
allowing the " symbol to be used as a short reference for both
the starting and ending sq tags. So one can just type:
"The rain in Spain falls mainly on the plain."
Quotations extending over a number of paragraphs are marked using the
long quote (lq) element. Long quotes are formatted in
LaTeX by indenting the left and right margins. For example,
[Lamport86, p. xiii]:
The LaTeX document preparation system is a special version of Donald Knuth's TeX program. TeX is a sophisticated program designed to produce high-quality typesetting, especially for mathematical text. …
LaTeX represents a balance between functionality and ease of use. Since I implemented most of it myself, there was also a continual compromise between what I wanted to do and what I could do in a reasonable amount of time. …
<!element sq - - (%inline)>
<!entity ftag '<f>' -- formula begin -- >
<!entity qendtag '</sq>'>
<!shortref sqmap
"&#RS;B" null
'"' qendtag
"[" ftag
"~" nbsp
"_" lowbar
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar >
<!usemap sqmap sq >
<!element lq - - (p*)>
Four types of lists are supported, which differ according to the
type of label used to mark each item in the list. Use itemize
to create a list in which each item is marked with some symbol such as
a dash or bullet. The enum tag is used to create an
enumeration, i.e. a list in which each item is labelled with a number
(or letter) indicating its rank or position in the list. The
list type of list does not label the items at all. Finally, use
descrip to create a list in which each item is labelled by some
tag of your own choice. Lists of various types can be nested. For
example:
<itemize>
<item>
A level one item.
<item> Here's level two:
<enum>
<item> A level two item.
<item> Here's level three:
<enum>
<item> A level three item.
<item>Here's level four:
<descrip>
<tag/Red./ Is the color of my true love's hair.
<tag/Blue./ Is a property of some movies.
<tag/Yellow./ Characterizes some forms of journalism.
</descrip>
<item>A last level three item
</enum>
<item>A last level two item
</enum>
<item>A last level one item.
</itemize>
This is formatted by LaTeX as:
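• A level one item.
• Here's level two:
  1. A level two item.
  2. Here's level three:
     (a) A level three item.
     (b) Here's level four:
         Red. Is the color of my true love's hair.
         Blue. Is a property of some movies.
         Yellow. Characterizes some forms of journalism.
     (c) A last level three item
  3. A last level two item
• A last level one item.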
<!element itemize - - (item+)>
<!element list - - (item+)>
<!element enum - - (item+)>
<!element descrip - - ((tag?, (%inline; | %sectpar;)*, p*)+) >
<!element item o o ((%inline; | %sectpar;)*, p*) >
<!element tag - o (%inline)>
<!usemap global (list,itemize,enum,descrip)>
For reasons having to do with our translation into LaTeX, line feeds
within tag elements are translated into spaces, using the
oneline short reference map:
<!entity space " ">
<!entity null "">
<!shortref oneline
"&#RS;&#RE;" null
"&#RS;B&#RE;" null
'"' qtag
"[" ftag
"~" nbsp
"_" lowbar
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar>
<!usemap oneline tag>
Figures and tables are floating elements; they may appear at a
different location in the printed version of the document than in the
input file. There is a location (loc) attribute, which can be
used to influence the location chosen by the formatter. The value of
the loc attribute is a string of up to four letters, where each
letter declares a location at which the figure or table may appear, as
follows:
h. At the same relative location as it appears in the SGML input file (i.e. here).
t. At the top of a page.
b. At the bottom of a page.
p. On a separate page containing only figures and tables.
The default value of the loc attribute is tbp.
A figure is a graphic combined with an optional caption. Two
types of figures are currently supported. The first, and easiest, is
to use the eps tag to include an Encapsulated PostScript file
in the document. Encapsulated PostScript files are centered
horizontally on the page. The size of the graphic is its "natural"
size; i.e. the size it would have if printed directly on a PostScript
printer. You need only know the name of the file containing the
graphic.
Encapsulated PostScript graphics can be created using a variety of
different editors. If you are using Unix with an X11-based graphical
user-interface, you may want to try idraw, which stores its
documents directly as Encapsulated PostScript files. Other interesting
X11-based drawing programs are xfig and tgif.
For example, to include the graphic contained in an Encapsulated
PostScript file named issues.ps, you would type:
<figure>
<eps file="issues">
<caption>An <tt>idraw</> Drawing </>
</figure>
This would then appear as in figure issues.
Notice that the ".ps" extension is not to be included in the
file attribute of the eps element, but that the actual file
must include the ".ps" extension.
The second possibility is to use the placeholder (ph)
tag to leave space in which to later paste the graphic, in the old,
reliable manner. For example, to leave 10 cm space for
some graphic, type:
<figure>
<ph vspace="10cm">
</figure>
Be sure not to leave a space between the number and the unit of
measurement used, which may be cm, mm or in.
<!element figure - - ((eps | ph ), caption?)>
<!attlist figure
loc cdata "tbp">
<!element eps - o empty >
<!attlist eps
file cdata #required>
<!element ph - o empty >
<!attlist ph
vspace cdata #required>
<!element caption - o (%inline)>
<!usemap oneline caption>
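To show the loc attribute together with the elements just declared, here is a sketch that reuses the issues file from the example above; the caption text is made up. The value ht allows the figure to appear either at its position in the input file or at the top of a page.

```sgml
<figure loc="ht">
<eps file="issues">
<caption>The same drawing, placed here or at the top of a page</>
</figure>
```

Leaving out the attribute gives the default value declared in the attribute list.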
Next, there is a tabular element. Using LaTeX, tabulars
must be small enough to fit on a single page. The current
tabular element has been kept quite simple. It certainly does
not (yet) offer all the flexibility of LaTeX. However, it may well
be that it is sufficient for most users. More complex tables can,
depending on your choice of formatters, be created using LaTeX or
Unix's tbl program, with the x element, or with any
program capable of generating Encapsulated PostScript, which can then
be included using an eps element.
A tabular consists of a number of rows, separated by the
rowsep element, each of which consists of a number of columns
separated by the colsep element.
The format of the tabular is controlled by the column
alignment (ca) attribute. For each column in the tabular
there is a letter in the ca attribute: 1) c for
centered; 2) l for flush left; or 3) r for flush right.
In addition, | can be used to insert vertical lines running the
complete height of the table. This will be made clear in the example
which is coming shortly.
First, however, let me describe the short reference map defined for
tabulars. Rather than typing <colsep> and
<rowsep> explicitly, one can just type | to separate
columns, and @ to separate rows. Also, within tabulars, [ can be used to start a mathematical formula, and " starts short
quotes as usual. (The other short references just hide any special
meaning the character may have to TeX.)
<!entity % tabrow "(%inline, (colsep, %inline)*)" >
<!element tabular - -
(%tabrow, (rowsep, hline?, %tabrow)*, caption?) >
<!attlist tabular
ca cdata #required>
<!element rowsep - o empty>
<!element colsep - o empty>
<!element hline - o empty>
<!entity rowsep "<rowsep>">
<!entity colsep "<colsep>">
<!shortref tabmap
"&#RE;" null
"&#RS;&#RE;" null
"&#RS;B&#RE;" null
"&#RS;B" null
"B&#RE;" null
"BB" null
"&#SPACE;" null
"&#TAB;" null
"@" rowsep
"|" colsep
"[" ftag
'"' qtag
"_" thinsp
"~" nbsp
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub >
<!usemap tabmap tabular>
The hline element can be used to draw a horizontal line along
the length of the table, to separate rows.
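A small made-up tabular may help to show these pieces together. It assumes the short references described above: | separates columns and @ separates rows, while the | in the ca attribute draws a vertical line between the left-aligned and the right-aligned column. The hline element follows a row separator, as the content model requires.

```sgml
<tabular ca="l|r">
Item  | Price @ <hline>
Paper | 12 @
Ink   | 7
</tabular>
```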
A table element consists of a tabular followed by an
optional caption. Unlike a tabular, a table is a floating
"body", like a figure. It may be moved to another (near) location
within the formatted document. A tabular, however, appears at
the same place in the formatted document as in the SGML source file.
<!element table - - (tabular, caption?) >
<!attlist table
loc cdata "tbp">
Here is how table GPC was typed:
<table>
<tabular ca="ll|ll">
ae | &ae | Ae | &Ae @
oe | &oe | Oe | &Oe @
ue | &ue | Ue | &Ue @
sz | &sz | amp | &amp @
bsol | &bsol | circ | &circ @
.
.
.
Dagger | &Dagger | sect | &sect @
para | &para | copy | &copy @
mdash | &mdash | tilde | &tilde
</tabular>
<caption><label id="GPC">
General Purpose Characters
</caption>
</table>
The original motivation behind the development of these document types was to create an environment for literate programming in an arbitrary programming language similar to Donald Knuth's WEB system for literate programming in Pascal [Knuth84]. The basic idea is to include the source code of a program inside of its documentation, instead of the other way around: including comments within the source code.
The features offered here to support literate programming, or merely
the documentation of existing programs, have been kept to a minimum.
Snippets of code can be mentioned within sentences using the tt
tag. These are formatted using a typewriter font suitable for
program code, but the spacing and indentation of the code is not
retained. Within tt elements, the only characters which may not
be literally interpreted are $, \, &, and </.
For the $ and \ symbols, always use the dollar and
bsol entities. For the & and < symbols, use the
amp and lt entities if the string in which they occur
could be mistaken for an entity reference, an element start tag or an
element end tag.
To include larger segments of code, retaining its line breaks,
tabulation and spacing, use the code tag or the verb
tag. Within these tags just about all characters are interpreted
literally. The exceptions are:
1) Within verb and code elements, use the ero entity to represent the & symbol in strings which might otherwise be mistaken for entity references. (Notice that the amp entity is not used to represent & in this context.)
2) Use the etago entity to represent </ in strings which might otherwise be interpreted as end tags. (Do not use the lt entity for this purpose here.) Start tags can be typed literally in this context, without using entities.
3) The string \end{verbatim} may not occur within code or verb elements. Presumably this will not often be a problem.
For example, to include the "hello world" C program in a document, just type:
<code>
main ()
{
/* This is the famous hello world program */
printf("hello world\n");
}
</code>
When formatted, spaces and line breaks are preserved:
main ()
{
/* This is the famous hello world program */
printf("hello world\n");
}
Notice that no entities were required in this code.
With few exceptions, it should be possible to just wrap verb or
code tags around existing pieces of code without change.
The idea of literate programming is that the documentation is the program, so there must be some way of extracting the source code from the SGML document. Just how to do this is described in a later chapter.
The user must have a means of indicating which pieces of code are
to be included in the source code, and in which order. Our solution
to this problem is very simple: Only code elements are to
be extracted, and they are extracted in the same order as they appear
in the document. That is, verb elements are not
extracted, and may be used, e.g., for examples or draft versions of
the code included for explanatory or tutorial purposes.
code and verb elements may be formatted differently.
Using our translation into LaTeX, for example, code elements
are distinguished by being bracketed by lines the width of the page.
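The distinction can be sketched with a made-up fragment. Under the rule just described, only the second block would end up in the extracted source; the first is shown to the reader but left out of the program.

```sgml
<p>
A first attempt, kept for explanation only:
<verb>
printf("hello\n");
</verb>
The version that becomes part of the program:
<code>
printf("hello world\n");
</code>
</p>
```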
<!element code - - rcdata>
<!element verb - - rcdata>
<!shortref ttmap
"&#RS;B" null
'#' num
'%' percnt
'~' tilde
'_' lowbar
'^' circ
'{' lcub
'}' rcub
'|' verbar >
<!usemap ttmap tt>
The qwertz document types include elements for describing
mathematical formulas completely within SGML, similar to the system
described in [daphne89]. To start, there is a fairly large
number of entities for mathematical symbols. (The set of entities
chosen is for the symbols available in both TeX and in the
PostScript Symbol font.) Although this may be a minor irritation for
seasoned TeX users, we have decided to follow the naming conventions
for mathematical symbols adopted in the SGML Standard
[Smith88]. The complete set of mathematical symbols currently
defined, including the Greek alphabet, is listed in
tables mathsym and greek, in alphabetical order.
<!entity % math system -- math symbols -- > %math;
Prime ″ aleph ℵ and ∧ ang ∠
ap ≈ arr ↓ bottom ⊥ bull •
cap ∩ cir ○ clubs ♣ congr ≅
cup ∪ diams ♦ divide ÷ dot ˙
empty ∅ equiv ≡ exist ∃ forall ∀
ge ≥ hArr ⇔ harr ↔ hearts ♥
image ℑ infin ∞ isin ∈ lArr ⇐
lang 〈 larr ← le ≤ mid ∣
minus − nabla ∇ ne ≠ nequiv ≢
not ¬ notin ∉ nsub ⊄ nsube ⊈
nsup ⊅ nsupe ⊉ nvDash ⊭ nvdash ⊬
oplus ⊕ or ∨ otimes ⊗ part ∂
plusmn ± prime ′ prop ∝ rArr ⇒
rang 〉 rarr → real ℜ setmn ∖
spades ♠ square □ sub ⊂ sube ⊆
sup ⊃ supe ⊇ times × uArr ⇑
uarr ↑ vDash ⊨ vdash ⊢
alpha α beta β gamma γ
Gamma Γ delta δ Delta Δ
epsi ε zeta ζ eta η
thetas θ Theta Θ iota ι
kappa κ lambda λ mu μ
nu ν xi ξ Xi Ξ
pi π Pi Π rho ρ
sigma σ sigmav ς Sigma Σ
tau τ upsi υ Upsi ϒ
phis φ Phi Φ chi χ
psi ψ Psi Ψ omega ω
Omega Ω
TeX symbols not in table 2 may nonetheless be generated, by
defining an entity using the mc element. For example, to print
the \leadsto symbol (⇝), you could first define an entity,
perhaps using the name adopted for this symbol in the SGML standard:
<!entity rarrw "<mc/<x/\leadsto//">
Of course, this approach is TeX dependent. But this dependency is clearly noted at the beginning of the document, and it would be an easy matter to replace the TeX command for such entities with the appropriate commands for some other formatter.
The mc tag used in this entity definition is for math
characters. The entity could have been defined using only the x
tag described in section misc, but it is "safer" to use the
mc tag when defining entities which are only to be used within
formulas, as the SGML parser will complain if they are used elsewhere.
If x were used instead, such errors would first be caught by
TeX.
<!element mc - - cdata >
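Once declared, such an entity is used like any other symbol entity inside a formula. The following line is a sketch that assumes the rarrw declaration shown above is in effect; the surrounding sentence is invented, and f is the element for in-line formulas.

```sgml
<p>
The relation <f>&alpha &rarrw &beta</f> holds for every step.
</p>
```

Using the entity outside a formula element should be reported by the parser, which is the safety benefit described above.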
There are a number of parameters for formulas. These will most likely be of little interest to most users, but are stated here for the sake of completeness.
<!entity % sppos "tu" >
<!entity % fcs "%sppos;|phr" >
<!entity % fcstxt "#pcdata|mc|%fcs;" >
<!entity % fscs "rf|v|fi" >
<!entity % limits "pr|in|sum" >
<!entity % fbu "fr|lim|ar|root" >
<!entity % fph "unl|ovl|sup|inf" >
<!entity % fbutxt "(%fbu;) | (%limits;) |
(%fcstxt;)|(%fscs;)|(%fph;)" >
<!entity % fphtxt "p|#pcdata" >
There are three elements for representing formulas: f, for
ordinary short formulas appearing "in-line"; dm for
displayed formulas to be centered on a line (or lines) by
themselves; and eq for displayed formulas which are to be
numbered sequentially throughout the document (i.e. so-called
"equations").
<!element f - - ((%fbutxt;)*) -(footnote) >
<!entity fendtag '</f>' -- formula end -- >
<!shortref fmap
"&#RS;B" null
"&#RS;B&#RE;" null
"&#RS;&#RE;" null
"_" thinsp
"~" nbsp
"]" fendtag
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar>
<!usemap fmap f >
<!element dm - - ((%fbutxt;)*) -(footnote)>
<!element eq - - ((%fbutxt;)*) -(footnote)>
<!shortref dmmap
"&#RE;" space
"_" thinsp
"~" nbsp
"]" fendtag
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar>
<!usemap dmmap (dm,eq)>
Usually it is not necessary to type the starting and ending tags of
the f element explicitly: [ and ] are short
reference delimiters, allowing one to simply type, for example,
[&alpha &rarr &beta], instead of
<f>&alpha &rarr &beta</f> to represent
α → β.
TeX users will appreciate that this
notation is no more verbose than TeX.
The only characters of interest in fmap are _, ~ and ]. _ is a short reference for thinsp, which adds a
little extra horizontal space. ~ means nbsp, which in
turn denotes a non-breaking space. TeX will not start a new line at
an nbsp. Finally, ] is used to end the formula. The other
characters in this map just protect us from any special meaning TeX
gives them.
The dmmap is much the same as the fmap. There are
just two differences: 1) ] is not a short reference for the f
closing tag (and instead has its literal meaning), and 2) carriage
returns and new lines are replaced by spaces, for reasons having to do
with the way TeX formats formulas. Use the tu element,
defined a bit later, to force line breaks in formulas.
Of course, formulas consist of more than just a string of math
symbols. There are elements for representing fractions (fr),
products (pr), integrals (in), sums (sum), roots
(root) and arrays (ar). Each of these will be described
next.
A fraction consists of a numerator (nu) and a denominator
(de). For example, 12/37 can be written as:
[<fr><nu>12<de>37</fr>]
Of course, this is rather lengthy. For simple fractions such as
this, you may prefer to just type [12/37], which is
formatted by LaTeX in the same way.
On the other hand, if
you are an SGML purist, you may prefer not to do this, as it makes
assumptions about the formatting system being used.
<!element fr - - (nu,de) >
<!element nu o o ((%fbutxt;)*) >
<!element de o o ((%fbutxt;)*) >
Products, integrals and sums all have similar structure,
consisting of a lower limit (ll), an upper limit
(ul) and an optional operand (opd).
<!element ll o o ((%fbutxt;)*) >
<!element ul o o ((%fbutxt;)*) >
<!element opd - o ((%fbutxt;)*) >
<!element pr - - (ll,ul,opd?) >
<!element in - - (ll,ul,opd?) >
<!element sum - - (ll,ul,opd?) >
So, for example, the displayed formula
∑_{i=1}^{n} x_i = ∫_0^1 f
was typed as:
<dm>
<sum><ll>i=1<ul>n<opd>x<inf>i</></sum> =
<in><ll>0<ul>1<opd>f</in>
</dm>
This example also shows how to represent subscripts, using the
inf tag. There is also a sup tag for superscripts.
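A further sketch in the same notation (a hypothetical example of our
own): a product with both limits, whose operand carries a subscript
and a superscript, might be typed as:
<dm>
<pr><ll>i=1<ul>n<opd>a<inf>i</><sup>2</></pr>
</dm>
This is meant to denote the product of the squares of a_1 through
a_n.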
For operators with upper and lower limits other than products, sums
or integrals, use the lim element.
<!element lim - - (op,ll,ul,opd?) >
<!element op o o (%fcstxt;|rf|%fph;) -(tu) >
For example, the displayed formula
⋃_{i=0}^{n} {α_i → β}
was typed as
<!entity bigcup "<mc>&bigcup</>">
...
<dm>
<lim>&bigcup<ll>i=0<ul>n</>
<opd>{&alpha<inf>i</> &rarr &beta}</>
</lim>
</dm>
Notice that it isn't necessary to type the op tag here.
Roots can be represented using the, what else, root element.
By default, root produces square roots. The n attribute
of root can be used for other roots. For example, type
[<root n=3/x+y/] to get the cube root of x+y.
<!element root - - ((%fbutxt;)*) >
<!attlist root
n cdata "">
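For comparison, here are two more hypothetical roots, one relying on
the default and one written with full start and end tags instead of
the short form used above:
[<root/2/]
[<root n=4>x+y</root>]
The first should give the square root of 2, the second the fourth
root of x+y.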
Arrays, or matrices, consist of a sequence of rows, each of which
contains a sequence of columns. Every row in the array must contain
the same number of columns. Rows are separated by the
arr tag; columns by the arc tag. The array itself is
delimited by the ar tag.
<!element col o o ((%fbutxt;)*) >
<!element row o o (col, (arc, col)*) >
<!element ar - - (row, (arr, row)*) >
<!attlist ar
ca cdata #required >
<!element arr - o empty >
<!element arc - o empty >
This is a place where an SGML short reference map has proven useful:
<!entity arr "<arr>" >
<!entity arc "<arc>" >
<!shortref arrmap
"&#RE;" space
"@" arr
"|" arc
"_" thinsp
"~" nbsp
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub >
<!usemap arrmap ar >
Columns can be separated using the | character; rows with the @ character.
For example, this matrix
a+b+c  uv     x-y     27
 a+b   u+v     z     134
  a    3u+vw  xyz  2,978
was typed as:
<ar ca=clcr>
a+b+c | uv | x-y | 27 @
a+b | u+v | z | 134 @
a | 3u+vw | xyz | 2,978
</ar>
The column alignment of an array must be specified using the
ca attribute, as shown in the example. For each column in the
array, there is a letter in the ca attribute. There are three
alternatives: 1) c for centered; 2) l for flush left;
and 3) r for flush right.
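A smaller hypothetical array shows the correspondence between the
letters of ca and the columns. With two columns, the first flush
left and the second flush right, one would type something like:
<ar ca=lr>
x | 1 @
x+y | 10
</ar>
Each row has two columns, matching the two letters in the ca
attribute.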
There remain a few miscellaneous math elements to describe.
sup and inf, for superscripts and subscripts, were
mentioned above. unl and ovl can be used to
underline or overline formulas. rf is used for
identifiers, such as function names (e.g. cos or sin)
within formulas. Similarly, phr is used to delimit phrases of
ordinary text within formulas. (Both of these are necessary, as
strings of characters within formulas denote sequences of variables,
not words.) The v tag can be used to denote a vector,
as in x. Calligraphic characters, such as L, can be
denoted using the fi tag. Finally, line breaks can be inserted
into formulas using the tu element.
<!element sup - - ((%fbutxt;)*) -(tu) >
<!element inf - - ((%fbutxt;)*) -(tu) >
<!element unl - - ((%fbutxt;)*) >
<!element ovl - - ((%fbutxt;)*) >
<!element rf - o (#pcdata) >
<!element phr - o ((%fphtxt;)*) >
<!element v - o ((%fcstxt;)*)
-(tu|%limits;|%fbu;|%fph;) >
<!element fi - o (#pcdata) >
<!element tu - o empty >
<!usemap global (rf,phr)>
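A few hypothetical inline formulas using these elements, written
with the same short tag notation as the fi examples elsewhere in
this manual (we assume here that rf, v and phr accept this short
form as well):
[<rf/sin/ x + <rf/cos/ y]
[<v/x/ = <ovl>a+b</ovl>]
[x = 0 <phr/if and only if/ y = 0]
Without rf, the letters of sin would be set as the three
variables s, i and n.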
There are a number of elements useful for representing
definitions (def), propositions (prop),
lemmas (lemma), corollaries (coroll),
proofs (proof), and theorems (theorem).
<!element def - - (thtag?, p+) >
<!element prop - - (thtag?, p+) >
<!element lemma - - (thtag?, p+) >
<!element coroll - - (thtag?, p+) >
<!element proof - - (p+) >
<!element theorem - - (thtag?, p+) >
<!element thtag - - (%inline)>
<!usemap global (def,prop,lemma,coroll,proof,theorem)>
<!usemap oneline thtag>
With the exception of proof, these all have the same
structure: an optional thtag followed by some paragraph level
elements. Here is an example:
Alexander's Theorem
Let G be a set of nontrivially achievable subgoals and < an order on G. < is abstractly indicative if and only if it is a linearization of <_G^*.
This was typed as:
<theorem><thtag>Alexander's Theorem</>
Let [<fi/G/] be a set of nontrivially achievable subgoals and < an
order on [<fi/G/]. < is abstractly indicative if and only if it is a
linearization of [<lim>< <ll> <fi/G/ <ul> &ast </lim>].
</theorem>
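The other elements are used in the same way. As a hypothetical
sketch (the lemma and its wording are ours, chosen only for
illustration), a lemma followed by its proof might be typed as:
<lemma><thtag>A Hypothetical Lemma</>
If [x] is even, then [x<sup>2</>] is even.
</lemma>
<proof>
Write [x = 2k]; then [x<sup>2</> = 4k<sup>2</>].
</proof>
Note that the proof has no thtag, in keeping with its
declaration.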
The global short reference map, which is the default map in
effect within qwertz documents, allows the " symbol to be
used to start a short quote (sq) and [ to start a
formula (f). Also, ~ is used for non-breaking
spaces. The rest of the short references just serve to hide any
special meaning TeX gives these characters, allowing them to be
directly typed without having to use entity references.
<!entity qtag '<sq>' >
<!shortref global
"&#RS;B" null -- delete leading blanks --
'"' qtag
"[" ftag
"~" nbsp
"_" lowbar
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar>
<!usemap global qwertz>
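To illustrate (with a made-up sentence), under the global map
ordinary text such as the following can be typed directly, without
entity references for % or the non-breaking space:
About 50% of the proof is due to Dr.~Smith, who showed
[&alpha; &rarr; &beta;].
Here ~ keeps "Dr." and "Smith" on one line, % is passed on
safely, and [ starts the formula.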
Places within a document can be marked using the label
element. Labels have an id attribute for naming the label.
The SGML parser will check that these identifiers are unique within
the document, and that they are referenced. That is, the parser will
complain if there is no reference to a label. For this reason, labels
should probably be created on demand, rather than in anticipation of
the need for a reference to the element.
There are two kinds of references: ref for
references to the number of some element, such as a section, figure or
theorem, and pageref, for references to the number of the page
on which the text around the label occurs when the document is
printed. Both types of references have an id attribute for
stating the identifier of the label being referenced. The number of
the element or page will be printed at the place of the ref or
pageref.
<!element label - o empty>
<!attlist label id cdata #required>
<!element ref - o empty>
<!attlist ref
id cdata #required>
<!element pageref - o empty>
<!attlist pageref
id cdata #required>
For example, a reference to the section on miscellaneous elements of this manual, section misc , would be typed as:
... section <ref id=misc>, would be ...
The label itself was typed as:
<sect><heading><label id="misc">
Miscellaneous Elements</>
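A page reference to the same label works analogously. As a
hypothetical example, one could type:
... as described on page <pageref id=misc>, these elements ...
The page number on which the label ends up is filled in by the
formatter.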
There are just a couple of general-purpose elements remaining to be discussed, which don't seem to have found a suitable home yet elsewhere in this manual.
Editorial comments and reminders to oneself can be marked with the
comment tag. These comments will be printed using a different
type style than the body of the text. In the qwertz mapping
into TeX, they are printed using the slanted type style.
If you do not want the comment to be printed, use the standard
SGML notation for comments instead: <!-- … -->.
Finally, there is an "escape" element, allowing you to include raw
formatting code at any place in your document, the x element. This
code will be passed on to the formatter, such as TeX, inline, at the point it appears in
your document. Of course, this "feature" should be used judiciously, as it limits the formatter
independence of the document.
<!element comment - - (%inline)>
<!element x - - ((#pcdata | mc)*) >
<!usemap #empty x >
Notice that math character (mc) elements may appear within
x elements. This allows you to use SGML entity references for
math characters, to help avoid having to remember both the SGML and
the formatter's names for these symbols. Other entities may also be used, so
long as they expand to character data.
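As a hypothetical sketch of both elements, assuming LaTeX is the
formatter in use (the \vspace command is LaTeX's, not part of
qwertz):
<comment>Check the date of this reference.</comment>
<x>\vspace{1cm}</x>
Because no short reference map is in effect inside x, the
backslash and braces are passed through to the formatter unchanged.
A document containing this x element can no longer be
formatted by anything other than TeX.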
Articles, reports and books are structurally very similar. They may be formatted differently, of course, but this is of little importance during the writing phase of primary interest to authors. Seen abstractly, each type of document consists of a title page, for such information as the title of the document, the names of the authors and so on, followed perhaps by an abstract, and then by a sequence of chapters or sections. There may be citations, which are references to documents listed at the end, in a bibliography. Perhaps there are one or more appendices. Finally, these documents may also contain footnotes.
Let us first precisely describe the overall structure of these document types, before moving on to describe their various components. The article element is defined as:
<!element article - -
(titlepag, header?, abstract?,
toc?, lof?, lot?, p*, sect*,
(appendix, sect+)?, biblio?) +(footnote)>
<!attlist article
opts cdata "null">
The options attribute (opts) of article
provides a place to state formatting options, which are passed
on to LaTeX. The particular options available depend on the
installation of LaTeX being used, but the following should
always be available:
11pt, 12pt. Set the "normal" font size to eleven, or twelve, point, instead of the default 10 point size.
twoside. Formats the document for printing on both sides of a page.
twocolumn. Formats the document with two columns per page, as is common in the proceedings of scientific conferences, for example.
titlepage. Causes the title page and abstract to be printed on a separate page.
Other options which may be supported include:
dina4. Formats the document for printing on DIN A4 size paper. (As this is the size paper used at our installation, this option is included automatically during the translation.)
german. Causes the TeX hyphenation algorithm to "think German", and sections, bibliographies and such to be labelled using the appropriate German terms.
times, bookman, palatino, … Causes the "main" font to be the selected PostScript font, instead of the standard TeX font, Computer Modern, and maps all other type faces to some suitable PostScript font or type style.
For example, the starting tag for some article might be:
<article opts="bookman,11pt">
Reports are just like articles, except that they consist of a
sequence of chapters (chapt), instead of sections
(sect):
<!element report - -
(titlepag, header?, abstract?, toc?, lof?, lot?, p*,
chapt*, (appendix, chapt+)?, biblio?) +(footnote)>
<!attlist report
opts cdata "null">
Books are similar to reports, except that they may not include an abstract:
<!element book - -
(titlepag, header?, toc?, lof?, lot?, p*, chapt*,
(appendix, chapt+)?, biblio?) +(footnote) >
<!attlist book
opts cdata "null">
The options attribute (opts) for report and
book elements is the same as that for articles, just described,
except for the titlepage option, which is applicable only to
articles.
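Putting the content model together, the skeleton of a report might
look as follows. This is only a hypothetical sketch: the title,
author, chapter names and the bibliography file name "ai" are
placeholders of our own.
<report opts="twoside,12pt">
<title>A Hypothetical Report
<author>A. N. Author
<abstract>
<p>A one-paragraph summary.
</abstract>
<toc>
<chapt>Introduction
<p>The first chapter starts here.
<appendix>
<chapt>Supplementary Material
<p>The first appendix starts here.
<biblio files="ai">
</report>
The elements appear in the order required by the declaration:
title page, abstract, table of contents, chapters, appendices and
finally the bibliography.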
The rest of this chapter describes the common elements of articles, reports and books, starting with title pages.
A title page (titlepag) consists of a title, a number of
authors (author) and an optional date (date). The title
may refer to a footnote and may also include a subtitle. If
the date element is omitted, today's date will be printed by default.
To avoid having a date printed, include an empty date element.
<!element titlepag o o (title, author, date?)>
<!element title - o (%inline, subtitle?) +(newline)>
<!element subtitle - o (%inline)>
<!usemap oneline titlepag>
The author element includes the name and, optionally,
institution (inst) of the author. If there are multiple
authors, these are separated with the and tag. Also,
acknowledgements can be expressed using the thanks element.
These are formatted by LaTeX as footnotes on the title page.
<!element author - o (name, thanks?, inst?,
(and, name, thanks?, inst?)*)>
<!element name o o (%inline) +(newline)>
<!element and - o empty>
<!element thanks - o (%inline)>
<!element inst - o (%inline) +(newline)>
<!element date - o (#pcdata)>
<!usemap global thanks>
Within the titlepag, the title, subtitle,
author and inst elements can be broken into multiple lines
using the newline element or, if you prefer, the nl
entity.
<!element newline - o empty >
<!entity nl "<newline>">
The title page of this manual was typed as:
<title>The <tt/qwertz/ SGML Document Types
<subtitle>(Version 1.1 Reference Manual)
<author>Tom Gordon
<inst> Institute for Applied Information Technology (F3) &nl&nl
German National Research Center &nl
for Computer Science (GMD)
Notice the titlepag tags are optional. The simplest title
page would include a title and author:
<title> A Very Short Title Page
<author> Snoopy
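A title page with two authors and an explicit date could be typed
along the following lines. The names, institutions and date are
hypothetical, and the layout of one element per line follows the
title page of this manual:
<title>A Hypothetical Joint Paper
<author>Ann Author
<inst>First Institute
<and>Bob Writer
<inst>Second Institute
<date>March 1992
The and tag separates the two authors, each of whom has a name
followed by an optional inst.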
Articles and reports, but not books, may have an abstract, which consists of one or more paragraphs, including the various kinds of lists, mathematical formulas and elements for literate programming:
<!element abstract - - (p+)>
There are three elements for stating whether or not a table of contents, list of figures or list of tables should be included in the document. These tables and lists are generated by LaTeX, so the elements themselves are empty; they are only used to specify that the list or table should be included.
<!element toc - o empty>
<!element lof - o empty>
<!element lot - o empty>
A header element specifies what should be printed at the top
of each page. It consists of a left heading (lhead) and a
right heading (rhead). Both elements are required, if a heading is
used at all, but either may be left empty, so that the effect of
having only a left or right heading can be achieved easily enough.
<!element header - - (lhead, rhead) >
<!element lhead - o (%inline)>
<!element rhead - o (%inline)>
As we will see, an initial header can be given after the title page. Afterwards, a new header can be given for each new chapter or section. The header printed on a page is the one in effect at the end of that page, so it will be that of the last section starting on the page.
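As a hypothetical sketch, a header with both parts, followed by one
with only a right heading, might be typed as:
<header>
<lhead>The qwertz Document Types
<rhead>Draft
</header>
<header><lhead><rhead>Draft</header>
In the second form the lhead tag is still present, as the
declaration requires, but its content is empty.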
The naming scheme we have adopted for sections is a bit different
from that of LaTeX, because the names of SGML identifiers may be
at most eight characters long. But we think the scheme we have
chosen has its advantages. In books and reports, the top-level
sectional unit is the chapter (chapt). In articles, it
is the section (sect). The lower sectional units are
sect1, sect2, sect3, and sect4, in that
order.
Each section (or chapter) consists of a heading, followed by
an optional header, a number of paragraphs (including such things as
graphics), and then sections of the next lower level.
<!entity % sect "heading, header?, p* " >
<!element heading o o (%inline)>
<!element chapt - o (%sect, sect*) +(footnote)>
<!element sect - o (%sect, sect1*) +(footnote)>
<!element sect1 - o (%sect, sect2*)>
<!element sect2 - o (%sect, sect3*)>
<!element sect3 - o (%sect, sect4*)>
<!element sect4 - o (%sect)>
<!usemap oneline (chapt,sect,sect1,sect2,sect3,sect4)>
Don't confuse the headers with headings. The heading is
just the text printed at the point where the section begins, naming
the section. The header changes the text printed at the top of
each page.
If there are cross references to the section, put the
label in the heading. For example, you could type:
<sect><heading><label id=mysect>My First Section</>
If a label isn't required, you can leave the heading tag
implicit:
<sect>My First Section
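Nesting follows directly from the declarations. A hypothetical
section with two lower levels (the headings and text are
placeholders) might be typed as:
<sect>My First Section
<p>Some introductory text.
<sect1>A Subsection
<p>Text of the subsection.
<sect2>A Sub-subsection
<p>Text at the third level.
<sect1>Another Subsection
<p>More text.
No end tags are needed: each new sectional tag closes the units of
the same or lower level which precede it.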
The appendix element marks the beginning of a sequence of
appendices. These are chapters or sections, depending on whether the
document is an article, report or book, and differ from ordinary
chapters or sections only in the way they are numbered, and of course
their placement at the end of the document.
<!element appendix - o empty >
The tag for footnotes is, simply enough,
footnote.
To be sure the marker for the footnote is
formatted properly, do not leave a space between the
character after which the footnote marker is to appear and the
beginning of the footnote element itself.
<!element footnote - - (%inline)>
<!usemap global footnote>
Footnotes can appear anywhere within a section (or chapter). The
usemap declaration is required to cancel the oneline map
used in title pages.
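For instance, a footnote attached directly to a word, with no
intervening space, could be typed like this (the sentence is our own
illustration):
SGML<footnote>Standard Generalized Markup Language.</footnote>
separates structure from formatting.
The footnote marker is then printed immediately after "SGML".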
Literature references can be made using the cite and
ncite elements. The only difference between them is that the
ncite allows a short note to be included in the
reference, for such things as page numbers.
<!element cite - o empty>
<!attlist cite
id cdata #required>
<!element ncite - o empty>
<!attlist ncite
id cdata #required
note cdata #required>
For example, one might type
<ncite id="Bryan88" note="pg.68">
to refer to page 68 of Martin Bryan's
book on SGML. This would appear, using LaTeX, as
[Bryan88, pg. 68] in the printed document.
The id attribute of a cite or ncite is a
reference to an identifier of a BibTeX bibliography file. There is a
qwertz SGML document type for creating such bibliographies,
described below.
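A plain citation, without a note, is typed in the same way. For
example, assuming the bibliography file contains an entry with the
identifier Lamport86:
LaTeX is described in <cite id="Lamport86">.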
The bibliography itself, or list of references, is generated by
including a biblio element at the end of the document, after
any appendices.
<!element biblio - o empty>
<!attlist biblio
style cdata "qwertz"
files cdata "">
The files attribute of biblio is a list of the names
of the bibliographies used, separated by commas. The names should not
include any file suffixes, such as ".bib" or ".sgml". For example,
to cite publications on artificial intelligence and cognitive
science, where the bibliographies are maintained in two files,
ai.sgml and cogsci.sgml, you would type:
<biblio files="ai,cogsci">
The style attribute determines how the bibliography is
formatted. Five styles are supported:
plain. Entries are sorted alphabetically and labeled with numbers.
unsrt. The same as plain except the entries are ordered as they
appear in the document, rather than alphabetically.
alpha. The same as plain, except that labels are made from the author's
name and the year of publication.
abbrv. The same as plain except that first names, month names, and
journal names are abbreviated.
qwertz. The same as plain except that all words of the entry are
capitalized exactly as they appear in the source file of the
bibliography. The plain style applies capitalization rules
which are inappropriate, e.g., for German titles.
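For example, to use the two bibliography files mentioned above with
labels made from author names and years, rather than the default
qwertz style, one would type:
<biblio style="alpha" files="ai,cogsci">
If the style attribute is omitted, the qwertz style is
used, as stated in the attribute declaration.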
The slides element is for making a series of slides or, more
commonly, overhead transparencies. Although you may often prefer to
use some other program for preparing presentations, this approach has
its advantages when you want to include parts of an existing article
or book on your transparencies. You can just "cut and paste" the SGML
source from an article onto a slide. You may also prefer this
approach if your presentation includes mathematical formulas, to be
able to take advantage of TeX's excellent mathematics typesetting.
<!element slides - - (slide*) >
<!attlist slides
opts cdata "null">
Each slide consists of an optional title, followed by one or more
paragraphs (p):
<!element slide - o (title?, p+) >
Notice that not every element available in an article or book is also available here. In particular, there are no sectioning elements, cross references, footnotes or a bibliography. Our translation into TeX does not use SliTeX, so as to allow slides to include tables and figures.
The title element will be centered on the line. You can break
up the title into multiple lines with newline elements. The
various type style elements, such as em and bf, can also
be used here; indeed anywhere on a slide.
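A hypothetical presentation of two slides might be typed as follows.
The titles and text are placeholders; we rely on the start tag of
the first paragraph to end the title:
<slides>
<slide><title>Why <em>SGML</em>?
<p>Structure is kept separate from formatting.
<slide><title>A Formula
<p>[&alpha; &rarr; &beta;]
</slides>
The end tag of each slide is omitted, since the next slide
tag, or the end of slides, closes it.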
The letter element is for making letters and e-mail
messages. Just how a letter is formatted may depend on whether it is
a business or personal letter. If it is a business letter, it may be
printed to appear as if the company's letterhead stationery had been
used.
The structure of a letter can be quite complex, but most of the elements to be described here are optional. Using an example from [Lamport86], a simple letter would be typed like this:
<letter>
<from>
R. (Ma) Dillo
<address> 1234 Ave.~of the Armadillos &nl
Gnu York, G.Y. 56789
<to>
Dr.~G. Nathaniel Picking
<address> Acme Exterminators &nl
33 Swat Street &nl
Hometown, Illinois 62301
<cc> Jimmy Carter &nl
Richard M. Nixon
<opening> Dear Nat,
<p>
I'm afraid that the armadillo problem is still
with us. I did everything ...
... and I hope we can get rid of the nasty beasts
this time.
<closing> Best regards,
</letter>
The from and to elements are for the sender's and
receiver's names and addresses, respectively. The address may be
either a street address, using address, or an electronic mail
address, using email, or both. You may also include a telephone
number, using the phone element. (If you are using your company's
letterhead stationery, it may be that you should type only your
extension, rather than your complete telephone number.) Finally, a
telefax number can be provided, using the fax element.
Notice that in the closing you must type a comma yourself,
if you want one. Also, do not type your name again after the closing;
the name of the sender will be printed after the closing as
expected.
There are several optional elements which may be of interest:
subject. For the purpose or, well, subject of the letter. If you would like this subject line to appear as "re: …", for example, you must type the "re: " yourself, as part of the subject.
sref, rref, rdate. These are tags for the sender's reference, receiver's
reference and receiver's date where you can include whatever
code is used by your, or the recipient's, company or institution to
uniquely identify letters. For example, if this letter is a response
to some other letter, you may use the rref and rdate
elements to identify the original letter. There is no sdate
tag, as the date this letter is printed will be included in the letter
at some appropriate place by the formatter.
cc. This used to be an acronym for "carbon copies", which were to be
sent to persons other than the principal recipient of the letter. The
cc tag can be used to list these other recipients, even though
the copies they receive today are perhaps printed by a laser printer
on recycled paper. As in the above example, you can separate the
names of these recipients with newline elements (using the
nl entity if you prefer).
encl. Use this tag to list enclosures. These can also be separated
with newline elements, or simply with commas, if you prefer.
ps. A postscript, not to be confused with PostScript, can be included
with this tag. Any kind of element which can appear in the body
of the letter (i.e. sectpar elements) can also be used here.
To summarize, here are the relevant SGML declarations:
<!entity % addr "(address?, email?, phone?, fax?)" >
<!element letter - -
(from, %addr, to, %addr, cc?, subject?, sref?, rref?,
rdate?, opening, p+, closing, encl?, ps?)>
<!attlist letter
opts cdata "null">
<!element from - o (#pcdata) >
<!element to - o (#pcdata) >
<!usemap oneline (from,to)>
<!element address - o (#pcdata) +(newline) >
<!element email - o (#pcdata) >
<!element phone - o (#pcdata) >
<!element fax - o (#pcdata) >
<!element subject - o (%inline;) >
<!element sref - o (#pcdata) >
<!element rref - o (#pcdata) >
<!element rdate - o (#pcdata) >
<!element opening - o (%inline;) >
<!usemap oneline opening>
<!element closing - o (%inline;) >
<!element cc - o (%inline;) +(newline) >
<!element encl - o (%inline;) +(newline) >
<!element ps - o (p+) >
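To show where the optional elements go, here is a second,
hypothetical letter using several of them, in the order required by
the declaration of letter. All names, addresses and reference
codes are invented:
<letter>
<from>
A. Sender
<address> 1 Example Street &nl
Sampletown
<to>
B. Receiver
<email> receiver@example.org
<subject> re: Your inquiry
<rref> XY-123
<rdate> 1 March 1992
<opening> Dear Ms. Receiver,
<p>
Thank you for your letter. The material you asked for is enclosed.
<closing> Sincerely,
<encl> Price list, order form
<ps>
<p> Please note our new address.
</letter>
Note that encl and ps come after the closing, while
subject, rref and rdate come before the opening.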
The structure of a telefax message is the same as for letters and
e-mail messages, except that the fax number of the recipient is,
of course, required, rather than optional.
<!element telefax - -
(from, %addr, to, address, email?,
phone?, fax, cc?, subject?,
sref?, rref?, rdate?,
opening, p+, closing, ps?)>
<!attlist telefax
opts cdata "null"
length cdata "2">
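A minimal hypothetical telefax message, with the required address
and fax number of the recipient (both invented here), might be
typed as:
<telefax>
<from>
A. Sender
<to>
B. Receiver
<address> 2 Sample Road &nl
Exampleville
<fax> 555-0100
<opening> Dear B.,
<p>
The corrected figures follow on the next page.
<closing> Regards,
</telefax>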
The notes element is a new top-level document "style", like
articles, books and letters. It is useful for miscellaneous purposes,
such as jotting down notes to oneself, where the complex structure of
the other styles is unnecessary. Notes are simply a sequence of
section paragraphs (i.e. paragraphs, lists, comments, long quotations,
figures, tables, displayed mathematical formulas, and program code).
An optional title is also available. The contents of a notes document
can be copied and pasted into a section or chapter of a book or article.
<!element notes - - (title?, p+) >
<!attlist notes
opts cdata "null" >
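A short hypothetical notes document, with the optional title, might
be typed as:
<notes>
<title>Things To Do
<p>Check the references in the second chapter.
<p>Ask about the formula [&alpha; &rarr; &beta;].
</notes>
As the title is optional, the document could equally well begin with
the first paragraph.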
The manpage element is for Unix manual pages. Here we see again
an advantage of SGML. Using this element, the very same manual
page can be viewed on just about every terminal, using nroff,
or be included as a section of an article, report or book to be
formatted by TeX.
<!element manpage - - (sect1*)
-(sect2 | f | %mathpar | figure | tabular |
table | %xref | %thrm )>
<!attlist manpage
opts cdata "null"
title cdata ""
sectnum cdata "1" >
A manpage consists of a sequence of sections. There are two SGML
attributes, for the command name and manual section number,
respectively. Each section of the manual page is delimited by a
sect1 element. Notice that these sections may not contain
further subsections. Sections are represented as sect1
elements, rather than sect, to allow the manual page to be
easily cut and pasted into a sect section of an article, report
or book. (Of course, if the manual page is to be used as a chapter of a
book, then these sections of the manual page will need to be replaced
with sect elements.)
Notice that many elements, such as tables, figures and
mathematical formulas, cannot be used within manual pages, because of
limitations of ASCII terminals, or the Unix man macro package for
nroff.
There is a short reference map in effect within the scope of the
manpage. With the exception of [, which is not used here
to start formulas, this map has the same effect as the
global map.
<!shortref manpage
"&#RS;B" null
'"' qtag
"[" lsqb
"~" nbsp
"_" lowbar
"#" num
"%" percnt
"^" circ
"{" lcub
"}" rcub
"|" verbar>
<!usemap manpage manpage >
For detailed information about the conventions for Unix manual pages, see your Unix documentation. But here is a brief summary. The typical manual page has the following sections, in this order:
NAME. The name, or list of names, by which the command or function is called, followed by a dash and then a one-line summary of its purpose.
SYNOPSIS. For the syntax of the command and its arguments. (The Sun
documentation suggests that literals be formatted using boldface type,
and that variables be formatted using italic type. Use the tt
and em elements, respectively, here for this purpose.)
DESCRIPTION. An overview of the command or function's purpose, effects and use.
OPTIONS. A list and description of all command-line options.
FILES. A list of files associated with the command which may be of interest to users.
SEE ALSO. A comma-separated list of related Unix commands, and references to other relevant publications.
DIAGNOSTICS. A list and explanation of any diagnostic messages the command may write to the standard error output file.
BUGS. A description of any known bugs, problems, or limitations.
Some of you may be asking yourselves why manpage wasn't
designed so that each of these conventional sections of a manual page
is represented by its own SGML element. That certainly would have
been possible, but on the other hand the approach taken has the
advantage that users can simply cut and paste sections between manual
pages and articles, reports and books. Of course it would have been
easy to write a filter to convert between these formats, but it was
felt that the benefits of a special manpage format would be too
small to warrant even this limited effort. After all, unless one is
using an SGML structure editor, users must refer to the SGML document
type definition to know what is expected in the manual page. It is
just as easy to check this documentation to see what sections
conventionally appear in manual pages. There is also a file which can
be used as a template or form for writing manual pages. See the Unix
Commands chapter for details.
The only reason there is a manpage document type, instead
of just another translation of, say, the article document type
into nroff is that the man macros used for the Unix
documentation are not powerful enough to format all of the features
available in our latex document type. Having this separate
manpage document type provides a means of checking whether the
manual page can be formatted by nroff using these man
macros. Again, as this document type is designed to be a subset of
the latex document type, the sections of a manual page can also
be included within instances of the latex document type.
Here is how the manual page for the cd command could have
been typed using this document type definition:
<manpage title="CD">
<sect1> NAME
<p>cd &mdash change working directory
<sect1> SYNOPSIS
<p> cd [ <em>directory</> ]
<sect1> DESCRIPTION
<p> <em>directory</> becomes the new working directory. The process
must have execute (search) permission in <em>directory</>. If cd is
used without arguments, it returns you to your login directory.
...
<sect1> SEE ALSO
<p> csh(1), pwd(1), sh(1)
</manpage>
This is the end of the qwertz document type definition.
<!-- end of qwertz dtd -->