<?xml version="1.0" encoding="utf-8"?>

<?xml-stylesheet type="text/css"
    href="../docbook.css"?>

<!--
<?xml-stylesheet type="text/css" alternate="yes"
    title="ORA DocBook lite"
    href="http://WWW.sabi.co.UK/style/docbook-lite.css"?>
<?xml-stylesheet type="text/css" alternate="yes"
    title="Brian Lalonde for full DocBook"
    href="http://WWW.sabi.co.UK/style/docbook-brian.css"?>
<?xml-stylesheet type="text/css" alternate="yes"
    title="Brian Lalonde for Simplified Docbook"
    href="http://WWW.sabi.co.UK/style/docbook-simple-brian.css"?>
-->

<!--
<?xml-stylesheet type="text/xsl" alternate="yes"
    title="XSLT to XHTML"
    href="http://WWW.sabi.co.UK/style/docbook.xsl"?>
-->

<!DOCTYPE book
  PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
  [
    <!-- Mozilla handles these canonical namespaces specially -->
    <!ENTITY svgns    "http://www.w3.org/2000/svg">
    <!ENTITY xlinkns  "http://www.w3.org/1999/xlink">
    <!ENTITY xhtmlns  "http://www.w3.org/1999/xhtml">
    <!ENTITY mathmlns "http://www.w3.org/1998/Math/MathML">

    <!-- ArborText, Inc., 1988-1995, v.4001 -->

    <!NOTATION drw SYSTEM "DRW">

    <!ENTITY markups SYSTEM "markups.eps" NDATA EPS>
    <!ENTITY generic SYSTEM "generic.eps" NDATA EPS>
    <!ENTITY sgmlexa SYSTEM "sgmlexa.drw" NDATA DRW>
    <!ENTITY atilogo SYSTEM "atilogo.gif" NDATA GIF>
    <!ENTITY gloss SYSTEM "gloss.sgml">
    <!ENTITY www "World Wide Web">

    <!-- Mozilla comes with two lists of entities from XHTML 1.1 and
     MathML 2.0, and they must be explicitly included with relative
     system identifiers. -->
    <!ENTITY % MozillaXHTML11 SYSTEM "xhtml11.dtd"> %MozillaXHTML11;
    <!ENTITY % MozillaMathML  SYSTEM "mathml.dtd">  %MozillaMathML;
]
>

<book>
  <bookinfo>
    <title>Getting started with SGML</title>

    <subtitle>A guide to the Standard Generalized Markup Language and
      its role in information management</subtitle>

    <authorgroup><corpauthor>ArborText, Inc.</corpauthor>
      <othercredit><authorblurb>
	  <para>Ann Arbor, Michigan</para></authorblurb></othercredit>
    </authorgroup>

    <pubdate>18 October 1995</pubdate>

    <copyright>
      <year>1992</year>
      <year>1995</year>
      <holder>Arbortext, Inc.</holder>
    </copyright>

    <legalnotice>
      <para>This file may be redistributed electronically as long as
	it remains wholly intact, including this notice and copyright.
	This file must not be redistributed in hard-copy form.
	Arbortext will freely distribute this document in its original
	published form on request.</para>
    </legalnotice>

    <abstract>
      <para>As the world standard for textual information, SGML has
	gained prominence in many industries. Hundreds of companies have
	adopted SGML and thousands are considering it. If your
	organization produces a high volume of technical or business
	information of significant value, and if that information lends
	itself to a regular structure, then SGML probably offers
	significant benefits to you and your organization.</para>
      <para>This white paper examines the factors that led to the
	development of SGML, the basic knowledge you need to understand
	SGML, and the reasons for adopting SGML. It also lists the
	industries where SGML use is already widespread and the
	resources available for more information and training.</para>
    </abstract>
  </bookinfo>

  <chapter>
    <title>The Business Challenge</title>

    <para>The explosive success of the Internet is an obvious example of
      an information revolution that's well under way. Companies that
      realize the tremendous cost and value of information management
      are reengineering their processes for creating, distributing and
      accessing information. The opportunities in each of these areas
      can be enormous:</para>

    <sect1>
      <title>Information Creation</title>

      <para>By some estimates, 20% of our GNP is spent on generating
	new information.  And over 90% of that information is in
	documents, not databases. When was the last time you took a
	close look at how much your organization invests in the creation
	of information?</para>
      <para>In conventional word processing and desktop publishing
	systems, your authors spend up to 30% of their time searching
	for information, and another 30% of their time applying styles
	and squeezing paragraphs so that each printed page looks
	nice. Plus, nearly every 18 months, technology changes
	completely, so you're continually paying for data conversions as
	software and hardware become obsolete.</para>
    </sect1>

    <sect1>
      <title>Information Distribution</title>

      <para>A few years ago, you could provide your information on paper
	alone.  Then CD-ROM technology became low-cost and widespread,
	so you've either already faced or soon expect to face the
	massive re-publishing effort needed to make all your information
	available electronically. And in just the last year, the &www;
	has thundered out of nowhere, creating yet another new format
	for your information.</para>
      <para>At the same time, your customers want your information
	tuned to their needs: they don't want to wade through huge
	technical manuals that describe all system variations and all
	possible uses for all possible users; they want information
	tailored to their own needs, so they can get to it and use it
	fast.</para>
    </sect1>

    <sect1>
      <title>Information Access</title>

      <para>In the U.S. alone, businesses produce 92 billion documents
	every year, and that number is skyrocketing. Can your
	people easily access the information you create in your own
	company? How about the information you receive from other
	companies?</para>
      <para>An organization's future can depend on how effectively
	it identifies, manages, and uses its information. The latest
	thinking in information management takes an enterprise-wide
	approach to the creation, distribution and maintenance of
	information. Organizations that have taken this broad view have
	realized enormous improvements in the cost, accuracy,
	timeliness, accessibility, and variety of the information they
	create and use.</para>
      <para>As part of this movement, companies in some industries
	are joining together to develop standards for exchanging
	information with each other and with their customers. Companies
	that keep up-to-date with these standards will be able to do
	business more efficiently and compete more effectively in global
	markets.  This white paper describes how one such standard, the
	Standard Generalized Markup Language (SGML), works as part of an
	overall information management strategy.</para>
    </sect1>
  </chapter>

  <chapter>
    <title>Unleashing the Power of Information</title>

    <para>Traditional documents and the methods for handling them
      suffer many limitations. The printed document is often the result
      of a sophisticated information process. Once it's printed,
      however, the document represents a dead-end in the information
      flow because it has no link to the electronic information
      base.</para>
    <para>Raw data may start in the form of technical specifications or
      engineering data. This information must be gathered, sorted,
      organized, and then manually assembled into hard copy
      documents. With each step in the documentation process, the
      information may be changed by mistake. The further removed the
      result is from the original source of information, the greater the
      risk of erroneous data. The problem can become so large that a
      majority of documents go out of date as soon as they are
      printed.</para>
    <para>A systematic approach to information management treats text
      and graphics as part of an organization's electronic information
      base. This gives everyone access to the information. By taking a
      broad view of the information creation and delivery process, you
      can see a document as any composition of information: the
      output from a database query, a printed document, an on-line
      diagnostic manual, an illustrated parts catalog, a collection of
      video clips, or a home page on the Internet's &www;.</para>
    <para>SGML allows you to manage information as data objects instead
      of characters on a page. Rather than a stream of indistinguishable
      bits and bytes, the data is <quote>chunked</quote> into
      identifiable discrete elements of information.  This technology
      enables you to store and reuse the information efficiently, share
      it with many users, and maintain it in a database.</para>
  </chapter>

  <chapter>
    <title>Getting to Know SGML</title>

    <para>This white paper provides an introduction to existing SGML
      technology, its advantages and benefits, as well as an overview of
      some related standards and how they fit into an overall approach
      to managing information. We also define some of the terminology
      and acronyms to familiarize you with the language associated with
      SGML. While SGML is a fairly recent technology, the use of
      <quote>markup</quote> in computer-based documents has existed for
      some time. Let's first look at the earlier markup schemes that led
      to SGML.</para>

    <sect1>
      <title>What is markup?</title>

      <para>Markup is everything in a document that is not content.
	Markup originally referred to the handwritten notations that a
	designer would add to typewritten text; these notations
	contained instructions to a typesetter about how to lay out the
	copy and what typeface to use. This kind of markup is known as
	<firstterm>procedural markup</firstterm>.</para>

      <sect2>
	<title>Procedural markup</title>
	<graphic entityref="markups"/>

	<para>Most electronic publishing systems today, such as word
	  processing software and desktop publishing software, use
	  procedural markup. Procedural markup is typically unique to a
	  specific software package such as
	  <trademark>Microsoft</trademark> Word or
	  <trademark>QuarkXPress</trademark>. Each has its own set of markup codes that
	  make sense only to itself. This markup usually takes the form
	  of formatting codes that are mixed in with the text of the
	  document.  Procedural markup codes apply to a single way of
	  presenting the information, such as a printed page, and
	  provide no capability to define appearance for other media,
	  such as CD-ROM and the Internet.</para>
      </sect2>

      <sect2>
	<title>Descriptive markup</title>
	<graphic entityref="generic"/>

	<para>Descriptive markup, also known as
	  <quote>generic markup</quote>,
	  describes the purpose of the text in a document, rather than
	  its physical appearance on the page. The basic concept of
	  descriptive markup is that the content of a document should
	  remain separate from its style. Descriptive markup is based on
	  the <firstterm>structure</firstterm> or
	  <firstterm>content</firstterm> of a document and identifies
	  elements accordingly (such as a chapter, a section, or a
	  table of contents), using notations that describe what the
	  element is, not how it appears. By separating presentation
	  information (<foreignphrase>i.e.</foreignphrase>, style) from
	  the structure and content, descriptive markup allows for
	  multiple presentations of the same information.  For example,
	  you can publish on paper, on-line, on CD-ROM and on the &www;
	  (Internet), all from the same set of source files with
	  descriptive markup.
	</para>
      </sect2>

      <sect2>
	<title>Drawbacks of procedural markup</title>

	<para>Producers of technical documentation increasingly prefer
	  descriptive markup over procedural markup. Procedural markup
	  is tedious and expensive; authors can spend 15% to 50% of
	  their time on the appearance of each page.  If style
	  guidelines change, or if you need to present the same
	  information in a different format, massive re-formatting is
	  usually required. When a company changes software or hardware
	  systems, enormous data translation tasks arise, often
	  resulting in errors. Because procedural markup is tied to one
	  final printed product, you cannot change formats
	  easily. Interchanging documents based on procedural markup
	  works easily only if both parties have the same hardware and
	  software system.</para>
      </sect2>
    </sect1>

    <sect1>
      <title>What is SGML?</title>

      <para>The Standard Generalized Markup Language, or SGML, is an
	international standard (ISO 8879) published in 1986. SGML
	prescribes a standard format for embedding descriptive markup
	within a document. More importantly, and crucial to its real
	value and power, SGML also specifies a standard method for
	describing the structure of a document.</para>
      <para>In other words, SGML allows you to set up structural rules
	for each type of document you produce. SGML ensures that each
	element, which is labeled with descriptive markup such as
	<quote>chapter</quote>, <quote>title</quote>, and
	<quote>paragraph</quote>, fits in the logical, predictable
	structure of your document type.</para>
      <para>SGML supports an infinite variety of document structures.
	Users typically create a different document structure for each
	category of information they produce: information bulletins,
	technical manuals, parts catalogs, design specifications,
	reports, letters and memos.</para>
      <para>SGML allows you to create documents that are independent
	of any specific hardware or software. Since SGML documents
	conform to an international standard, they are portable. You can
	exchange them seamlessly with users who have different
	systems.</para>
      <para>The world of photography demonstrates the power of
	standards: SGML is to documents as standardized film speed is to
	cameras. Today you can purchase a roll of film marked <quote>ISO
	100</quote>, put the film in your camera, set the camera's film
	speed to 100 (which many cameras do automatically), and you're
	ready to shoot. You don't have to worry that the brand of film
	is not compatible with your particular make of camera. The film
	and camera manufacturing industries, through the
	International Organization for Standardization (ISO) and the
	American Standards Association (ASA), have agreed on
	standards for film speeds. Many industries plan to use SGML so
	that their documents work as easily on different computers as
	film works in different cameras.</para>
    </sect1>

    <sect1>
      <title>How does SGML work?</title>

      <para>To understand SGML we must look at the three layers of a
	typical document: structure, content, and style. SGML separates
	these three aspects, but deals mainly with the relationship
	between structure and content.</para>

      <sect2>
	<title>Structure</title>

	<para>At the heart of an SGML application is a file called the
	  <firstterm>DTD</firstterm>, or <firstterm>Document Type
	    Definition</firstterm>.
	  The DTD sets up the structure of a document, much like a
	  database schema describes the types of information it
	  handles. A DTD provides a framework for the types of elements
	  (such as chapters and chapter headings, sections, and topics)
	  that constitute a document.</para>
	<para>A DTD also specifies rules for the relationships between
	  elements; for example,
	  <quote>a chapter heading must be the first element after the
	    start of a chapter</quote>, or <quote>each list must contain
	    at least two items</quote>.
	  These rules, which the DTD defines, help ensure that documents
	  have a consistent, logical structure. A DTD accompanies an
	  SGML document wherever it goes. A
	  <quote>document instance</quote> is a document whose content
	  has been tagged in conformance with a particular DTD.</para>
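	<para>As a minimal sketch, the two example rules above could be
	  written in a DTD roughly as follows. The element names
	  (<sgmltag class="element">chapter</sgmltag>,
	  <sgmltag class="element">heading</sgmltag>,
	  <sgmltag class="element">list</sgmltag>, and so on) are
	  hypothetical and chosen only for illustration; a real DTD would
	  declare many more elements:
	  <programlisting>&lt;!-- A chapter must begin with a heading. -->
&lt;!ELEMENT chapter - - (heading, (par | list)*)>
&lt;!ELEMENT heading - - (#PCDATA)>
&lt;!ELEMENT par     - - (#PCDATA)>

&lt;!-- A list must contain at least two items. -->
&lt;!ELEMENT list    - - (item, item+)>
&lt;!ELEMENT item    - - (#PCDATA)></programlisting></para>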
      </sect2>

      <sect2>
	<title>Content</title>

	<para>Content is the information itself: titles, paragraphs,
	  lists, tables, graphics, and audio. The
	  method for identifying the content's position within the DTD
	  structure is called <quote>tagging</quote>. Creating an SGML
	  document involves inserting tags around content. These tags
	  mark the beginning and end of each part of the structure and
	  identify the type of content they enclose. In the following
	  example, <sgmltag class="starttag">par</sgmltag> indicates the
	  start of a paragraph, and <sgmltag class="endtag">par</sgmltag>
	  indicates the end of the paragraph:
	  <programlisting>&lt;par>Paragraph content.&lt;/par></programlisting></para>

	<para>You can nest elements within other elements; in the
	  following example, the paragraph
	  (<sgmltag class="element">par</sgmltag>)
	  is an element within the topic
	  (<sgmltag class="element">topic</sgmltag>):
	  <programlisting>&lt;topic>&lt;par>Content.&lt;/par>&lt;/topic></programlisting></para>
	<para>The structure of a particular document is revealed
	  by the nesting of tags:
	  <programlisting>&lt;section>&lt;subhead>Content&lt;/subhead>
&lt;par>Content is the information itself.&lt;/par>&lt;/section></programlisting></para>
	<para>Fortunately, human beings usually don't have to deal
	  with manually typing in tags and checking to make sure all the
	  tags are there. Some SGML-based authoring software programs
	  make it easy to enter tags by clicking on pull-down menus that
	  guide you by listing only those tags that are valid at the
	  cursor's current position in the document. These programs rely
	  on a software module called a <quote>parser</quote> that
	  verifies that the document follows the rules of the DTD. (The
	  parser also verifies that the DTD itself is structurally
	  correct.) The following illustration shows how an SGML-based
	  authoring program would display the tags for the previous
	  ASCII example:</para>
	<graphic entityref="sgmlexa"/>
      </sect2>

      <sect2>
	<title>Style</title>

	<para>SGML itself has nothing to do with setting standards for
	  style, so most systems still rely on proprietary methods of
	  setting style. It is the style that determines the final
	  appearance of the document information. Some efforts are being
	  made to develop standards-based style sheets; two of these
	  efforts have resulted in the mature OS standard and the still
	  unreleased DSSSL standard.
	</para>
	<para>The U.S. Department of Defense CALS initiative developed
	  its own style standard, known as the Output Specification
	  (OS). The OS is in the form of a particular DTD that allows
	  the user to create a Formatting Output Specification Instance,
	  or FOSI (usually pronounced <quote>fossy</quote>), that is
	  well suited to both print and electronic output.</para>
	<para>A FOSI is essentially a powerful style sheet that
	  specifies the formatting for each tag in a DTD. With the FOSI,
	  the document, and the DTD, you have a complete interchange
	  package for printed documents that maintains its format and
	  style as it is interchanged among systems.</para>
	<para>In early 1995, an
	  ISO committee released a draft of the Document Style Semantics
	  and Specification Language (DSSSL), which is on its way to
	  becoming an international standard for presenting SGML-based
	  documents. Official release is expected later this
	  year.</para>
	<para>The complete DSSSL standard covers a broad scope, so
	  subsets are being developed to handle varying levels of
	  functionality. A subset whose functionality is approximately
	  equivalent to FOSIs is expected, and work on tools to convert
	  FOSIs to and from DSSSL is under way.</para>
	<para>Many military contracts currently require FOSIs, and
	  many non-defense firms have also embraced the Department of
	  Defense's OS standard because it's a mature and supported
	  standard. It is expected that both DSSSL and FOSIs will remain
	  important standards for the foreseeable future.</para>
      </sect2>
    </sect1>
  </chapter>

  <chapter>
    <title>What Does SGML Give Me?</title>

    <para>SGML has become mainstream technology that you can use with
      confidence.  Your adoption of SGML will allow your organization to
      gain the maximum value from your generation and use of
      information:</para>

    <sect1>
      <title>Increased productivity</title>

      <para>A structured approach to documents helps writers organize
	the information as they are creating it, and keeps content
	separate from style. This separation enables you to set up
	centrally-controlled style guidelines, so authors can focus on
	generating the content rather than adjusting each document's
	appearance.  That change alone can as much as double your
	authors' productivity.</para>
      <para>You can also improve efficiency by keeping a central
	information base so that authors don't have to recreate the same
	information in order to use it. This also ensures that the most
	current information is available to all. A single
	update to the information base ensures that all documents
	created from that information base will automatically be
	updated.</para>
    </sect1>

    <sect1>
      <title>Reusability</title>

      <para>A printed document is just one of many possible products
	from SGML-based information. For example, a technical
	publications group can use tags to identify a procedure as a
	sequence of steps. In this case, you identify the beginning and
	end of the procedure, and each step within the procedure. The
	same procedure can now appear in several forms: maintenance and
	operational manuals, on-line technical manuals, training guides,
	etc. More importantly, since the tags are machine-readable, the
	computer can manage and maintain the many different uses of the
	same single source of information, so no re-keying is required
	to produce this information in new document formats.</para>
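      <para>As an illustration only, such a procedure might be tagged
	as in the following sketch. The element names and the step text
	are hypothetical and do not come from any particular DTD:
	<programlisting>&lt;procedure>
&lt;step>Remove the access panel.&lt;/step>
&lt;step>Disconnect the power cable.&lt;/step>
&lt;step>Replace the filter.&lt;/step>
&lt;/procedure></programlisting></para>
      <para>Because each step is identified explicitly, one output
	could number the steps in a printed manual while another
	presents them one at a time on screen.</para>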
    </sect1>

    <sect1>
      <title>Information longevity</title>

      <para>SGML is a simple, standard file format with an indefinite
	shelf life; you'll never again have to convert your documents
	when a hardware or software system becomes obsolete. Once you
	set up your SGML information base, the information will always be
	available, because it carries everything needed to create a
	document. So even when your hardware or software becomes
	obsolete, your information remains usable, portable, and
	available.</para>
    </sect1>

    <sect1>
      <title>Improved data integrity</title>

      <para>Defining a document's structure helps ensure that the right
	information is in the right place, which improves the
	organization of your information.  Because SGML eliminates the
	need to convert data as it passes between systems, you
	reduce the risk of losing information when data is filtered
	from one format to another.</para>
    </sect1>

    <sect1>
      <title>Better data control</title>

      <para>With SGML, you can define and manipulate information
	elements at any level of detail. A tagged element can have
	attributes that provide characteristics or properties about the
	element. This attribute information is useful for managing and
	manipulating the information elements. For example, an ID
	(identifier) attribute can uniquely identify a single paragraph,
	a whole section, a legal notice, an illustration, a task, or any
	element that you may want to use repeatedly.  The following
	example shows a paragraph with an ID attribute:
	<programlisting>&lt;para id="p431">Content.&lt;/para></programlisting></para>
      <para>By simply referencing the ID, you can include this
	information into your document in as many places as you
	need. This eliminates re-typing and ensures that the information
	is identical in every instance.</para>
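      <para>A minimal sketch of how this might look: the DTD fragment
	below declares an ID attribute on
	<sgmltag class="element">para</sgmltag> and a hypothetical empty
	element, <sgmltag class="element">include</sgmltag>, whose
	<sgmltag class="attribute">refid</sgmltag> attribute must match
	an existing ID. SGML itself checks only that the reference
	resolves; pulling the referenced content into place is assumed
	to be the job of the application:
	<programlisting>&lt;!ATTLIST para    id    ID    #IMPLIED>
&lt;!ELEMENT include - O EMPTY>
&lt;!ATTLIST include refid IDREF #REQUIRED>

&lt;para id="notice1">Content.&lt;/para>
&lt;include refid="notice1"></programlisting></para>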
      <para>Plus, the IDs you set are machine readable so that the
	computer can find and link related information. This allows you
	to use IDs for a variety of information management
	controls. These controls can help you:

	<itemizedlist>
	  <listitem><para>Manage the security of information by allowing
	      only certain people to view or change information with
	      selected IDs.</para>
	  </listitem>
	  <listitem><para>Automate the information flow; for
	      example, updating the data in one place can trigger the
	      update of the same information in other places within the
	      same document and in other documents.</para>
	  </listitem>
	</itemizedlist></para>
    </sect1>

    <sect1>
      <title>Shareability</title>

      <para>Since SGML is aware of the individual components of a
	document, you can easily build entirely new documents out of
	existing information. This capability enables users to share the
	latest information without duplicating it. An example of this
	might be a standard legal notice or copyright statement
	appearing in documents throughout a company. The legal
	department maintains this module of information, updating it on
	occasion. A single tag in your document can pull in the current
	legal notice each time you access or output your document,
	eliminating needless duplication of information and ensuring the
	accuracy of your information.</para>
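      <para>One way to sketch this in SGML is with an external entity.
	In the hypothetical example below, the file name
	<filename>legal.sgml</filename>, the entity name, and the
	document type are all assumptions made for illustration:
	<programlisting>&lt;!DOCTYPE manual SYSTEM "manual.dtd" [
  &lt;!ENTITY legal SYSTEM "legal.sgml">
]>
&lt;manual>
&amp;legal;
&lt;chapter>...&lt;/chapter>
&lt;/manual></programlisting></para>
      <para>Each time the document is parsed, the reference
	<sgmltag class="genentity">legal</sgmltag> is replaced by the
	current contents of that file.</para>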
    </sect1>

    <sect1>
      <title>Portability of information</title>

      <para>Today, information networks proliferate where different
	computers, operating systems, and applications must share
	information. In this sort of network, portability becomes the
	key to making sure that all who need the information can access
	it. Thanks to the hardware and software independence of
	SGML, you can easily exchange SGML documents among different
	environments.</para>
    </sect1>

    <sect1>
      <title>Flexibility beyond traditional publishing</title>

      <para>The information you create today may be used a year from now
	in ways you haven't yet anticipated. Just last year, the need to
	publish on the &www; did not even exist! The spectacular growth of
	the Web serves as dramatic proof that we simply cannot anticipate
	all the purposes for which our information may eventually be
	used.</para>
      <para>SGML permits you to use your information for
	applications beyond traditional publishing. For
	example:

	<itemizedlist>

	  <listitem><para>&www; pages</para></listitem>
	  <listitem><para>information databases</para></listitem>
	  <listitem><para>diagnostic/expert systems</para></listitem>
	  <listitem><para>electronic mail</para></listitem>
	  <listitem><para>hypermedia and hypertext documents</para></listitem>
	  <listitem><para>database publishing</para></listitem>
	  <listitem><para>CD-ROM publishing</para></listitem>
	  <listitem><para>Interactive Electronic Technical Manuals (IETMs)</para>
	  </listitem>
	  <listitem><para>electronic review</para></listitem>
	</itemizedlist></para>
    </sect1>
  </chapter>

  <chapter>
    <title>Is SGML Right for Me?</title>

    <para>In the life cycle of a product, the cost of gathering,
      producing, and maintaining the necessary technical information can
      exceed the initial hardware cost. For many industries, technical
      information is part of a deliverable product, or a product in
      itself. Any industry whose product line is heavily dependent on
      information can benefit from SGML.</para>
    <para>In evaluating how SGML can help your organization, you may
      wish to consider some strategic business issues to help in your
      information management plan.  A strategic approach should prompt
      you to examine your current information needs and your current
      document management methodology. Some questions to consider
      include:

      <itemizedlist>
	<listitem><para>Does your information require a long life-span?
	    (For example, technical information related to airplanes
	    often needs to be maintained for over 20 years.)</para>
	</listitem>
	<listitem><para>Do you need to exchange documents across mixed
	    hardware environments?</para>
	</listitem>
	<listitem><para>Do you need to produce large documents with a
	    disciplined structure?</para>
	</listitem>
	<listitem><para>Do your documents contain information common to
	    other documents within a department, across corporate
	    divisions, or even across separate organizations?
	  </para>
	</listitem>
	<listitem><para>Do you have information that's used for
	    different purposes?  (For example, a part number may appear
	    in a maintenance manual as well as a parts inventory
	    database.)</para>
	</listitem>
	<listitem><para>Does your information change frequently and
	    get used often?</para>
	</listitem>
	<listitem><para>Do you produce information that needs to comply
	    with industry or company guidelines?</para>
	</listitem>
      </itemizedlist></para>
    <para>By examining your requirements, you can evaluate how SGML fits
      into your information management strategy. Standardizing on SGML
      doesn't mean you need to use it for all documents; SGML is most
      useful for documents with a definable structure. Since SGML
      handles documents as collections of distinguishable data elements,
      it is useful to think in terms of modules of information, rather
      than complete printed documents.</para>
    <para>SGML is most useful as a tool in an integrated information
      management strategy. A company's high-level management should
      make this strategic choice and plan its
      implementation. There will be initial implementation costs in moving
      to SGML. But the payback comes from benefits that accrue over time
      and enhance your investment in information. Any organization that
      exchanges information between systems, applications, departments,
      and companies will realize these benefits.</para>
  </chapter>

  <chapter>
    <title>What Is a Good SGML System?</title>

    <para>By design, SGML applications are meant to be customized.
      Just as there's no out-of-box database application that can serve
      all the needs of an organization, there is no one-size-fits-all
      SGML application. Since each organization's information
      requirements are different, there are many DTDs. More
      organizations are also looking at industry-wide information needs
      and developing standards for handling that information.</para>
    <para>A number of products on the market handle SGML to some
      degree. But not all products handle all the features of the SGML
      standard. The sections that follow describe some basic
      requirements.</para>

    <sect1>
      <title>Provides real-time interactive parsing</title>
      <para>An invaluable feature in an SGML system is real-time,
	interactive SGML validation. This feature allows the software to
	provide context-sensitive editing assistance based on the
	cursor's current position in the document.  For example, if the
	cursor is immediately after the beginning tag for a section, and
	all sections must have a section heading, the software allows
	you to insert only a section heading tag. This feature ensures
	that the author tags correctly at all times and therefore
	creates a valid SGML document the first
	time.</para>
      <para>By contrast, systems that use batch parsing allow authors
	to insert tags and text without checking each action against the
	DTD. In this approach, authors create documents in one format,
	then filter parts of the document into SGML, and then run the
	SGML through a validating parser. When the parser finds errors,
	the author must correct the original document, then filter and
	parse the changes again. The author must repeat this cycle until
	the entire document parses successfully. This approach adds
	steps to the publishing process that add no value. Time saved by
	authoring in a familiar format is lost in the filtering and
	validating process. A system that creates native SGML
	information eliminates the costly, time-consuming, and often
	error-prone process of retrofitting documents into valid
	SGML.</para>
    </sect1>

    <sect1>
      <title>Uses real SGML</title>

      <para>If your authoring software merely produces SGML as output,
	then your information is still tied to a proprietary format, and
	still at the mercy of software and hardware obsolescence. A
	publishing system that uses SGML as its native file format
	allows your information to remain accessible and usable
	regardless of hardware and software changes. If you need your
	information to remain accessible as you grow into new systems
	and new technologies, then using a native SGML file format
	provides a distinct advantage over a system that filters the
	data into SGML. Here's an acid test to identify a real SGML
	system: can the software accept any SGML document, display that
	document, and then save that document, leaving it
	unchanged?</para>
    </sect1>

    <sect1>
      <title>Supports any DTD</title>

      <para>To be fully usable, an SGML product must let you create
	new document types as well as accept the existing DTDs used in
	some industries. This feature is sometimes called the ability
	to handle <firstterm>arbitrary</firstterm> or user-defined
	DTDs. With arbitrary DTDs you are free to create any document
	type.</para>
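      <para>As a rough illustration of a user-defined document type,
	the sketch below defines a simple memo. The
	<sgmltag>memo</sgmltag> document type and all of its element
	names are hypothetical. A product that supports arbitrary DTDs
	should be able to load a DTD like this one and let you author
	against it.</para>

      <programlisting>&lt;!-- A memo has exactly one of each header field, then a body --&gt;
&lt;!ELEMENT memo    - - (to, from, subject, body)&gt;
&lt;!ELEMENT to      - - (#PCDATA)&gt;
&lt;!ELEMENT from    - - (#PCDATA)&gt;
&lt;!ELEMENT subject - - (#PCDATA)&gt;
&lt;!-- The body holds one or more paragraphs --&gt;
&lt;!ELEMENT body    - - (para+)&gt;
&lt;!ELEMENT para    - - (#PCDATA)&gt;</programlisting>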
    </sect1>

    <sect1>
      <title>Supports SGML features</title>

      <para>The developers of SGML built into the standard a number
	of features that facilitate automated publishing and document
	reuse. A fully-featured SGML publishing package should support
	this functionality. Some of the basic features to look for
	include:

	<itemizedlist>
	  <listitem><para><firstterm>Marked sections</firstterm>. Marked
	      sections let you create multiple versions from a single
	      master document using regions of conditional text that
	      only appear in specified versions. For example, you might
	      want to build a single source document that describes two
	      variations of your product. You simply write the source
	      document with marked sections for the areas that
	      differ. The system can then identify these areas and
	      produce two different versions of your information from
	      the same source file.</para>
	  </listitem>

	  <listitem><para><firstterm>External file entities</firstterm>.
	      A file entity is simply a pointer to a separate document
	      file. You can use file entities to break a large document
	      into subdocuments. You can also use a file entity to
	      reference frequently repeated boilerplate information such
	      as an electrical caution.</para>
	  </listitem>
	  <listitem><para><firstterm>Graphic entities</firstterm>.
	      A graphic entity is a pointer to a separate graphic
	      file.</para>
	  </listitem>
	  <listitem><para><firstterm>Text entities</firstterm>.
	      A text entity is a short name that stands for a common
	      phrase repeated throughout a document. This allows you to
	      reference the name instead of re-keying the phrase each
	      time you need to use it.</para>
	  </listitem>
	</itemizedlist></para>
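      <para>The features listed above might look like the following
	sketch in practice. The entity names <literal>www</literal>,
	<literal>gloss</literal>, and <literal>atilogo</literal> follow
	declarations of the kind found in this book's own prolog. The
	parameter entities <literal>model-a</literal> and
	<literal>model-b</literal> and the sample text are
	hypothetical. The <sgmltag>graphic</sgmltag> line assumes a
	DTD, such as DocBook, whose graphic element has an attribute
	that takes an entity name.</para>

      <programlisting>&lt;!-- Declarations, placed in the DTD or in the document's declaration subset --&gt;
&lt;!ENTITY % model-a "INCLUDE"&gt;  &lt;!-- keep the Model A text --&gt;
&lt;!ENTITY % model-b "IGNORE"&gt;   &lt;!-- drop the Model B text --&gt;
&lt;!ENTITY www     "World Wide Web"&gt;                &lt;!-- text entity --&gt;
&lt;!ENTITY gloss   SYSTEM "gloss.sgml"&gt;             &lt;!-- external file entity --&gt;
&lt;!NOTATION GIF   SYSTEM "GIF"&gt;
&lt;!ENTITY atilogo SYSTEM "atilogo.gif" NDATA GIF&gt;  &lt;!-- graphic entity --&gt;

&lt;!-- Use in the document --&gt;
&lt;![ %model-a; [&lt;para&gt;This paragraph applies only to Model A.&lt;/para&gt;]]&gt;
&lt;![ %model-b; [&lt;para&gt;This paragraph applies only to Model B.&lt;/para&gt;]]&gt;
&lt;para&gt;Look for us on the &amp;www;.&lt;/para&gt;
&lt;graphic entityref="atilogo"&gt;
&amp;gloss;</programlisting>

      <para>Swapping the <literal>INCLUDE</literal> and
	<literal>IGNORE</literal> values produces the other version of
	the document from the same source file.</para>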
    </sect1>
  </chapter>

  <chapter>
    <title>Who Uses SGML Now?</title>

    <para>Early in SGML's history, its primary adopters were
      defense contractors. In the last two years, however, the trickle
      of commercial users has turned into a torrent. Many leading
      industrial groups recognize the benefits SGML offers and have
      adopted it for information management and exchange among their
      members, and between members and their vendors and
      customers.</para>

    <para>Several industries have developed standards for information
      exchange:</para>

    <variablelist>
      <varlistentry><term>AAP</term>
	<listitem>
	  <para>The Association of American Publishers developed The
	    American National Standard for Electronic Manuscript
	    Preparation and Markup, a general purpose book DTD for
	    publishers, authors and editors.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>ATA (airlines)</term>
	<listitem>
	  <para>The Air Transport Association, a consortium representing
	    the commercial airline industry, developed several DTDs
	    under the ATA-100 specification.  The ATA's European
	    counterpart, AECMA, is also adopting standards based on
	    SGML.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>ATA (trucking)</term>
	<listitem>
	  <para>The Maintenance Council of the American Trucking
	    Associations has initiated a task force with the mission of
	    <quote>Establishing the Standard for Electronic
	      Service Information</quote>.
	    This task force represents large truck manufacturers and
	    fleet operators interested in standardizing the interchange
	    of service information, and they are developing the T2008
	    DTD, modeled after the SAE's J2008 DTD for automobiles and
	    light trucks. The first release of the standard is expected
	    in 1996.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>DocBook</term>
	<listitem>
	  <para>Founded by ten major producers and consumers of
	    technical documentation for computer systems, the Davenport
	    Group has developed the DocBook DTD for exchanging and
	    delivering computer documentation. Founding members included
	    Novell, O'Reilly &amp; Associates, Fujitsu OSSI,
	    Hewlett-Packard, Digital Equipment Corporation, SCO, Hal
	    Computer Systems, Hitachi Computer Products, SunSoft and
	    Unisys.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>DoD</term>
	<listitem>
	  <para>The U.S. Department of Defense created the Continuous
	    Acquisition and Life-Cycle Support (CALS) initiative
	    (recently renamed from Computer-aided Acquisition and
	    Logistic Support). The next section describes CALS in more
	    detail.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>Pinnacles</term>
	<listitem>
	  <para>Led by Intel, National Semiconductor, Texas Instruments,
	    Philips, and Hitachi, the Pinnacles Group is developing the
	    Pinnacles Component Information Standard (PCIS) to allow
	    reusability of component data by semiconductor customers and
	    vendors. This data can include descriptions, specifications,
	    physical diagrams, code fragments, behavior models, and
	    other text, tables, graphics, and technical data.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>SAE</term>
	<listitem>
	  <para>The Society of Automotive Engineers is developing the
	    J2008 DTD for electronic interchange of service and
	    diagnostic information. The J2008 Task Force is part of the
	    Vehicle Electronic/Electrical Systems Committee, whose
	    mission is to increase customer satisfaction and lower
	    product life cycle costs by recommending standards that
	    promote more effective diagnosis of vehicle systems. The DTD
	    is expected to be released for approval as a Technical Draft
	    Standard in 1995. After three years, it will be voted upon
	    again to determine if it should become a Recommended
	    Practice.</para>
	</listitem>
      </varlistentry>

      <varlistentry><term>TCIF</term>
	<listitem>
	  <para>The Telecommunications Industry Forum is an
	    international association of carriers and major vendors of
	    telecommunications products and services.  The TCIF
	    initiative is focused on the re-use of technical information
	    across multiple applications and different
	    environments.</para>
	</listitem>
      </varlistentry>
    </variablelist>

    <para>Many SGML applications are in commercial use. Other
      industries moving to SGML include pharmaceuticals, publishing, and
      manufacturing.</para>
    <para>Overseas, SGML is gaining wide acceptance. Airbus, a
      European consortium of commercial aircraft manufacturers,
      adopted SGML. Telecommunications, aerospace,
      manufacturing, and other commercial and military interests
      throughout Europe are also using SGML.</para>
  </chapter>

  <chapter>
    <title>What Is CALS?</title>
    <para>CALS stands for Continuous Acquisition and Life-Cycle Support
      (recently renamed from Computer-aided Acquisition and Logistic
      Support). It is a large-scale, long-term information management
      project initiated by the U.S. Department of Defense (DoD). Since
      the DoD receives goods and services from a wide range of
      suppliers, contractors and subcontractors, it constantly handles
      massive quantities of technical information. Today's weapon
      systems are technologically complex and can have a life span of 20
      years or more. As a result, the amount of technical data needed to
      support and maintain these systems is overwhelming.
    </para>

    <para>The CALS standards that apply to maintaining technical
      information include:

      <itemizedlist>
	<listitem><para>MIL-STD-1840: Automated Interchange of
	    Technical Information, the umbrella standard
	    specifying overall guidelines for electronic data storage
	    and exchange of CALS documents on magnetic tape.</para>
	</listitem>

	<listitem><para>MIL-M-28001: SGML (Standard Generalized Markup
	    Language) for exchanging text.</para>
	</listitem>

	<listitem><para>MIL-D-28000: IGES (Initial Graphics Exchange
	    Specification), an object-oriented format for technical
	    drawings.</para>
	</listitem>

	<listitem><para>MIL-R-28002: CCITT Group 4 (International
	    Telegraph and Telephone Consultative Committee) for
	    raster images.</para>
	</listitem>

	<listitem><para>MIL-D-28003: CGM (Computer Graphics Metafile)
	    for object-oriented graphics.</para>
	</listitem>
      </itemizedlist></para>
  </chapter>

  <chapter>
    <title>Resources</title>

    <para>Here are a few resources for more information on SGML.</para>

    <sect1>
      <title>Conferences, tutorials, and training</title>

      <para>The Graphic Communications Association (GCA) was
	instrumental in the development of SGML. The GCA provides
	conferences, tutorials, newsletters, and publication sales for
	both members and non-members.<literallayout>Graphic Communications Association
	  100 Daingerfield Road
	  Alexandria, Virginia 22314-2804 USA

	  +1 703.519.8160</literallayout></para>

      <para>SGML Open is a non-profit, international consortium of
	providers of SGML products and services dedicated to
	accelerating the further adoption, application, and
	implementation of SGML.<literallayout>SGML Open
	  910 Beaver Grade Road, #3008
	  Coraopolis, Pennsylvania 15108 USA

	  +1 412.264.4258</literallayout></para>

      <para>ArborText also offers a range of SGML training courses,
	from introductory to advanced level, including DTD and FOSI
	training. For further information on ArborText's training
	services, schedules, and course descriptions, please contact
	ATI's Training Team at +1 313.996.3566.</para>

      <bridgehead>Books on SGML</bridgehead>

      <para><citation>SGML: An Author's Guide to the Standard
	  Generalized Markup Language</citation>, Martin Bryan,
	  Addison-Wesley, 1988, ISBN 0-201-17535-5
      </para>

      <para><citation>The SGML Handbook</citation>, Charles Goldfarb,
	Oxford University Press, 1990, ISBN
	0-19-853737-9</para>

      <para><citation>Practical SGML</citation>, Eric van Herwijnen,
	Kluwer Academic Publishers, 1994, ISBN
	0-7923-9434-8</para>
    </sect1>
  </chapter>

  <glossary>
    <title>Glossary</title>

    <glossentry><glossterm>ASCII</glossterm>
      <glossdef>
	<para>(American Standard Code for Information Interchange)
	  This standard character encoding scheme is used extensively in
	  data transmission.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>ANSI</glossterm>
      <glossdef>
	<para>(American National Standards Institute) This group is
	  the U.S. member organization that belongs to the ISO, the
	  International Organization for Standardization.
	</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>attribute</glossterm>
      <glossdef>
	<para>An attribute provides more information about an
	  element, such as classification level, unique reference
	  identifiers, or formatting information.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>CCITT Group 4</glossterm>
      <glossdef>
	<para>(International Telegraph and Telephone Consultative
	  Committee) This CALS standard for raster graphics incorporates
	  tiling, which divides a large image into smaller tiles. You
	  can exchange graphic files in CCITT/4 format in a compressed
	  state so they take up much less file space.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>CITIS</glossterm>
      <glossdef>
	<para>(Contractor Integrated Technical Information Service)
	  As part of CALS Phase II, CITIS is a draft functional
	  specification for services. DoD acquisition managers designed
	  CITIS as a plan to gain access to product-related digital
	  technical information.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>CGM</glossterm>
      <glossdef>
	<para>(Computer Graphics Metafile) CGM is one of the CALS
	  standard formats for representing 2-D technical
	  illustrations. CGM is an object-oriented graphic
	  format.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>DSSSL</glossterm>
      <glossdef>
	<para>(Document Style Semantics and Specification Language)
	  This draft international standard (DIS 10179) applies to the
	  specification of processing information for SGML
	  documents. DSSSL is expected to become an international
	  standard.
	</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>DTD</glossterm>
      <glossdef>
	<para>(Document Type Definition) A DTD is the formal
	  definition of the elements, structures, and rules for marking
	  up a given type of SGML document. You can store a DTD at the
	  beginning of a document or externally in a separate file.
	</para>
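	<para>As a rough sketch of the two placements, the fragment
	  below shows two separate documents. The
	  <sgmltag>memo</sgmltag> document type and the file name
	  <filename>memo.dtd</filename> are hypothetical.</para>

	<programlisting>&lt;!-- Document 1: declarations stored at the beginning of the document --&gt;
&lt;!DOCTYPE memo [
&lt;!ELEMENT memo - - (#PCDATA)&gt;
]&gt;
&lt;memo&gt;Text of the memo.&lt;/memo&gt;

&lt;!-- Document 2: declarations stored externally in a separate file --&gt;
&lt;!DOCTYPE memo SYSTEM "memo.dtd"&gt;
&lt;memo&gt;Text of the memo.&lt;/memo&gt;</programlisting>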
      </glossdef>
    </glossentry>

    <glossentry><glossterm>EDI</glossterm>
      <glossdef>
	<para>(Electronic Data Interchange) This is a set of computer
	  interchange standards for business documents such as invoices,
	  bills, and purchase orders.
	</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>element</glossterm>
      <glossdef>
	<para>An element is a piece of data within a document, such as
	  a paragraph or a chapter, that may contain text or other
	  elements (subelements).</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>element declaration</glossterm>
      <glossdef>
	<para>A statement in the DTD defining an element and declaring
	  the order in which it may appear in the document and what
	  other elements it may include.
	</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>entity</glossterm>
      <glossdef>
	<para>An entity is a self-contained piece of data that can be
	  referenced as a unit. You can refer to an entity by a symbolic
	  name in the DTD or the document.  An entity can be a string of
	  characters, a symbol character (unavailable on a standard
	  keyboard), a separate text file, or a separate graphic
	  file.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>entity declaration</glossterm>
      <glossdef>
	<para>A statement in the DTD or document that assigns an SGML
	  name to an entity so you can reference it.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>FOSI</glossterm>
      <glossdef>
	<para>(Formatting Output Specification Instance) A FOSI is used
	  for formatting SGML documents for printing and other
	  outputs. It is a separate file that contains formatting
	  information for each element in a document.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>HTML</glossterm>
      <glossdef>
	<para>(HyperText Markup Language) This is the format of files
	  published on the &www;. HTML is an application of SGML; to
	  author in HTML using SGML-based authoring software, you simply
	  need the HTML DTD.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>IGES</glossterm>
      <glossdef>
	<para>(Initial Graphics Exchange Specification) The IGES
	  standard for engineering, product design, and manufacturing
	  drawings is one of the CALS standard graphics formats.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>Internet</glossterm>
      <glossdef>
	<para>The Internet is a worldwide communications network
	  originally developed by the U.S. Department of Defense as a
	  distributed system with no single point of failure. The
	  Internet has seen an explosion in commercial use since the
	  development of easy-to-use software for accessing the
	  Internet.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>ISO</glossterm>
      <glossdef>
	<para>(International Organization for Standardization) The ISO
	  is an industry-supported organization that establishes
	  worldwide standards for everything from data interchange
	  formats to film speed specifications.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>markup</glossterm>
      <glossdef>
	<para>Markup is anything added to the content of the document
	  that describes the text.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>parser</glossterm>
      <glossdef>
	<para>A parser is a specialized software program that recognizes
	  SGML markup in a document. A parser that reads a DTD and
	  checks and reports on markup errors is a validating SGML
	  parser. A parser can be built into an SGML editor to prevent
	  incorrect tagging and to check whether a document contains all
	  the required elements.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>PDES/STEP</glossterm>
      <glossdef>
	<para>(Product Data Exchange Standard/Standard for the Exchange
	  of Product Model Data) PDES/STEP are standards under
	  development for communicating a complete product model with
	  sufficient information content that advanced CAD/CAM
	  applications can interpret. PDES is under development as a
	  national standard and STEP is under development as its
	  international counterpart.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>tag</glossterm>
      <glossdef>
	<para>In the world of SGML, a tag is a marker embedded in a
	  document that indicates the purpose or function of the
	  element. Each element has a beginning tag and an end
	  tag.</para>
      </glossdef>
    </glossentry>

    <glossentry><glossterm>&www;</glossterm>
      <glossdef>
	<para>Often referred to as WWW or the Web, this term usually
	  refers to information available on the Internet that can be
	  easily accessed with software called a
	  <quote>browser</quote>. Organizations publish their
	  information on the Web in a format known as HTML; this
	  information is usually referred to as their <quote>home
	  page</quote> or <quote>web site</quote>.</para>
      </glossdef>
    </glossentry>
  </glossary>

  <script xmlns="&xhtmlns;" type="text/javascript"
    src="http://WWW.sabi.co.UK/style/docbook.js"></script>
</book>
