In 1991, the World Wide Web (WWW) went public. It was based on a simple assumption: documents would be represented in a standard structural copymarking language (HyperText Markup Language -- HTML), employ a standard document identification scheme (Uniform Resource Locator -- URL), and use a standard retrieval protocol (HyperText Transfer Protocol -- HTTP). A mere five years later, there is one document index site, Alta Vista, with a 30,000,000,000-byte index of more than 20,000,000 pages of text. Alta Vista indexes more than 2,500,000 new or revised pages every day and handles more than 5,000,000 queries every day.
What makes the WWW work is the agreement to use a
standard form for documents and document references.
By some criteria, HTML, URL, and HTTP are too simple,
especially given the more comprehensive standards on
which they are based, i.e., SGML, DSSSL, HyTime, and CCL. Even
with the simplicity, and maybe because of it, WWW
standards for document interchange have dramatically
changed the way we do business in the electronic realm.
These developments are encouraging in what they portend
as well as what they actualize. They point to the need
for more attention to be paid to document interchange
standards specifically and to data interchange
standards more generally. Consider, for example, the
general platform dependence of audio formats, or the
issue of translating procedurally formatted
word-processor documents to HTML format. Related to the
WWW, there are two specific concerns: