
2. Very Loose and Very Basic SGML (Not A GoodBook)

An SGML document instance is a fully parenthesized or strictly hierarchically structured document. Components called elements are tagged at the beginning and end. Inside each element is a (perhaps empty) string of content and elements.

A Document Type Definition (DTD, the ADT of this language) describes the syntax of a document through declarations of elements, entities and notations. Elements and notations may have attributes with limited sets of value types. Each element is defined by a content model, which describes a production of the allowable patterns of constituent elements. In effect the content model is a finite state automaton; it must be unambiguous, so the parser never needs look-ahead. A valid SGML document instance conforms to the grammar of a DTD.

A document type declaration takes the form:

<!DOCTYPE document_type_name optional_external_identifier [optional_declaration_subset] >
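
As a loose illustration, a document type declaration that points at an external DTD file and adds no local declarations might look like the line below. The file name badbook.dtd is an assumption made up for this sketch.

<!DOCTYPE badbook SYSTEM "badbook.dtd" >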

A notation declaration takes the form:

<!NOTATION notation_name notation_identifier >
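
For example, a notation for image data might be declared as follows. The notation name and the system identifier are invented for illustration; a real system identifier depends on your tools.

<!NOTATION gif SYSTEM "gifviewer" >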

An element type declaration takes the form:

<!ELEMENT name_of_element tag_minimization_indicators content_model_or_declared_content content_exceptions >

An element declaration may be followed by an attribute definition list declaration:

<!ATTLIST name_of_associated_element(s) name_of_attribute allowable_values default name_of_attribute allowable_values default etc. >

An attribute list may be associated with many elements. An element may be associated with one and only one attribute list.
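
A loose sketch of one attribute list shared by two elements, assuming hypothetical elements named section and appendix and a made-up status attribute:

<!ATTLIST (section | appendix)
                id      ID               #REQUIRED
                status  (draft | final)  draft >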

In a document instance, an element and its attributes look like this:

<MyName myatt="***" my2att="888">content, other markup, etc.</myname>

There are lots and lots of rules that modify that look. Get a GoodBook.

Element type declarations are modified by data types, keywords and element content expressions. Here again are loose examples.

2.1 Some data types for declared content

2.2 Some keywords are

2.3 Element content expressions are

Occurrence indicators modify a group or an individual element. If none is present, the element or group must occur exactly once.

Ordering connectors are used within a group to specify order. Only one type of connector may occur in a group.
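
Assuming the reference concrete syntax (the SGML Declaration can change these characters), the occurrence indicators are ? (optional), + (one or more) and * (zero or more), and the connectors are , (all, in this order), | (one of) and & (all, in any order). The element names below are made up for illustration:

<!ELEMENT memo    - - (to+, from, subject?, body) >
<!ELEMENT body    - - (paragraph | list)* >
<!ELEMENT address - - (street & city & zip) >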

2.4 Others

2.5 Minimization: Rules for omitting or including begin and end tags
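
As a rough sketch, the two tag minimization indicators in an element declaration refer to the start tag and the end tag, in that order. A hyphen means the tag is required and the letter o means it may be omitted. This assumes the OMITTAG feature is enabled in the SGML Declaration; the element names are illustrative.

<!ELEMENT section   - - (paragraph)* >
<!ELEMENT paragraph - o (#PCDATA) >

With these declarations, an instance may leave off the paragraph end tags:

<section>
<paragraph>First paragraph.
<paragraph>Second paragraph.
</section>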

2.6 Here is a brain-dead example of a DTD and document instance

<!SGML -- declaration goes here.  You know its name. Look up the number. -- >
<!DOCTYPE badbook PUBLIC "-//Not a GoodBook//DTD BadBook//EN"
[
<!ELEMENT badbook - o (front, section+, appendix+) +(footnote) >
<!ATTLIST    badbook
                id      ID      #REQUIRED
                ISBN     CDATA   #IMPLIED >
<!ELEMENT section - - (section | paragraph)* >
<!ATTLIST  section
                id      ID      #REQUIRED >
<!ELEMENT paragraph - o (#PCDATA) -(footnote) >
<!-- Placeholder declarations, assumed here so every element used above is declared. -->
<!ELEMENT front - o (#PCDATA) >
<!ELEMENT appendix - - (paragraph)* >
<!ELEMENT footnote - - (#PCDATA) >
]>
<badbook id="mybook.doc" ISBN="09283333">
   <front>A Bad Book
   <section id="firstsection">
     <section id="firstfirst" >
       <paragraph>Oh I wish I was an Oscar Mayer Wiener!
      </section>
   </section>
   <appendix>
     <paragraph>Nothing to add.
   </appendix>
</badbook>

2.7 Note some points about the badbook

The actual characters used for these declarations and indicators are specified in the SGML Declaration. The SGML Declaration precedes the DOCTYPE declaration and declares, among other things, the character sets used for the syntax of SGML, the features supported, and the capacity requirements for a system that will process the SGML document. Study this in a GoodBook, because this part tells the parser the nitty-gritty of how the DOCTYPE that follows is to be interpreted.

2.8 Entities

Entities and entity management are the heart of SGML organization, especially for modular structuring. Two types of entities can be declared:

1. General entity

Used as an include by reference to insert text and/or markup into a document instance. Can be used to store such things as boilerplate text in an external file. Takes the declaration form:

<!ENTITY entityname "stuff to be stuffed into somewhere">

Used in the document instance, this form will include the stuff:

&entityname;

2. Parameter entity

Used only to include markup within SGML markup declarations (for example, to modularize a DTD itself). Takes the declaration form:

<!ENTITY % entityname "declaration to be stuffed into markup" >

Used in a declaration, this form will include the stuff:

%entityname;

There are sets of keywords that can be in an entity declaration that modify its meaning. Look them up in a GoodBook.
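
A loose sketch of both kinds of entity working together. The entity names, the file name legal.sgm and the element names are assumptions for illustration. SYSTEM is one such keyword; it marks the entity as stored in an external file.

<!ENTITY % blocks "paragraph | list" >
<!ENTITY legal SYSTEM "legal.sgm" >
<!ELEMENT appendix - - (%blocks;)* >

In the document instance, the general entity is then referenced where the boilerplate should appear:

<appendix>
<paragraph>&legal;</paragraph>
</appendix>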

