Taxonomies and indexes are similar in that they both help
guide people to find desired information on a selected topic. While they can
be searched, they are designed specifically to be browsed. The obvious difference
is that taxonomies for end-users are arranged hierarchically (or by facets),
and indexes are arranged alphabetically. I have blogged previously on a comparison of index creation and taxonomy/thesaurus creation, but for those who
are not already skilled at creating one or the other, let’s step back and
further compare taxonomies and indexes themselves.
Taxonomy and Index Similarities and Differences
Taxonomies and indexes were developed for different kinds of
media. Modern taxonomies are designed to function well in online implementations
(through clicking on hyperlinks to narrower topics or plus signs to expand
hierarchical trees), although taxonomies have existed in print as well.
Indexes, specifically the back-of-the-book style, are designed to function well
in print (through scanning a large number of entries and subentries on a page),
although displayed indexes occasionally exist online as site A-Z indexes on
small, static websites. Hyperlinked indexes at the end of ebooks are also possible,
but the inadequate application of ebook standards has kept such indexes
from becoming commonplace.
Taxonomies and indexes serve different kinds of content. Taxonomies
work well for content in a subject area that is easy or logical to categorize:
products or product types, industries, geographic areas, occupational areas, media
or document types, etc. Indexes work well for content in a subject area that is
more abstract and does not lend itself to hierarchical categories: management
concepts, history, news, etc. Indexes, since they are arranged alphabetically,
are also excellent for browsing names/proper nouns. Taxonomies work well for a
defined scope, such as collections of documents of the same type (all resumes,
all marketing materials, all legal documents, etc.). Indexes, on the other
hand, tend to serve better for content with a less defined scope, such as
general encyclopedic information or detailed user manuals. Not surprisingly,
book-like content continues to be best served by indexes.
The differences in structure are not as simple as taxonomies
being hierarchical and indexes being alphabetical. Taxonomies also have alphabetical
aspects, as terms at the same level of a hierarchy are typically (or by
default) arranged alphabetically. Indexes, meanwhile, also have hierarchical
aspects, as there are main entries with subentries under them. Some large indexes
even have a third level of sub-subentries. Then there are kinds of taxonomies,
called thesauri, which are structured more around terms and relationships than
hierarchical trees, and such thesauri may be arranged alphabetically. In fact,
the same thesaurus can be arranged either hierarchically or alphabetically, with
the click of a toggle button in a thesaurus management system. But re-sorting a
thesaurus alphabetically does not change it into an index. It will still lack
the subentry features of an index.
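To make the toggle idea concrete, here is a minimal sketch in Python. It assumes a thesaurus stored as a simple mapping from each term to its broader term; the sample terms and the function names are hypothetical, not taken from any particular thesaurus management system. Both views are generated from the same term records, which is why switching between them changes only the display and adds nothing index-like.

```python
from collections import defaultdict

# Hypothetical sample thesaurus: each term maps to its broader term
# (None marks a top term). Real thesauri also carry related-term and
# synonym relationships, which are omitted here.
BROADER = {
    "Beverages": None,
    "Coffee": "Beverages",
    "Tea": "Beverages",
    "Green tea": "Tea",
    "Black tea": "Tea",
}

def alphabetical_view(broader):
    """Return a flat A-Z list of all terms, ignoring the hierarchy."""
    return sorted(broader, key=str.lower)

def hierarchical_view(broader):
    """Return an indented tree; siblings at each level are sorted A-Z."""
    children = defaultdict(list)
    for term, parent in broader.items():
        children[parent].append(term)

    lines = []

    def walk(parent, depth):
        for term in sorted(children[parent], key=str.lower):
            lines.append("  " * depth + term)
            walk(term, depth + 1)

    walk(None, 0)
    return lines

if __name__ == "__main__":
    print("\n".join(alphabetical_view(BROADER)))
    print()
    print("\n".join(hierarchical_view(BROADER)))
```

Note that neither view contains subentries or locators pointing to content. Those would have to come from indexing actual documents.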
The defining difference between a taxonomy and an index is
that an index is not an index unless it is linked to content, as the word “index”
means “to indicate” or “to point,” as in to point to content. A taxonomy is
still a taxonomy whether or not it is linked to content. (But it is not really
useful, unless it is linked to content.)
Where Taxonomies and Indexes Meet
In addition to back-of-the-book indexes, there also exist
periodical article indexes, such as the green-bound printed volumes of the
Reader’s Guide to Periodical Literature and subsequent online periodical and
reference databases accessed through libraries (InfoTrac, ProQuest, EBSCOhost,
etc.). For these, indexers index the articles with terms from a
taxonomy (or thesaurus or controlled vocabulary). The result of the indexing,
an alphabetical arrangement of taxonomy terms that were used in the indexing
with their links to content, constitutes an index. So, the index comprises
terms in the taxonomy that are linked to content and arranged alphabetically.
Displayed browsable alphabetical indexes, however, have become less common in
online services, as they have been replaced by features that search on the index terms instead.
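The way such an index falls out of the indexing work can be sketched in a few lines of Python. The article IDs, the assigned terms, and the function name below are made up for illustration. The sketch assumes that each article has already been assigned terms from the taxonomy by an indexer.

```python
from collections import defaultdict

# Hypothetical indexing records: article ID -> taxonomy terms assigned
# to that article by an indexer.
ASSIGNED_TERMS = {
    "article-001": ["Taxonomies", "Indexing"],
    "article-002": ["Thesauri", "Taxonomies"],
    "article-003": ["Indexing"],
}

def build_index(assigned_terms):
    """Invert the records into an A-Z list of (term, [article IDs])."""
    postings = defaultdict(list)
    for article_id, terms in assigned_terms.items():
        for term in terms:
            postings[term].append(article_id)
    # Only terms that were actually used in indexing appear in the index.
    return [
        (term, sorted(postings[term]))
        for term in sorted(postings, key=str.lower)
    ]

if __name__ == "__main__":
    for term, articles in build_index(ASSIGNED_TERMS):
        print(f"{term}: {', '.join(articles)}")
```

A taxonomy term that was never assigned to any article does not show up in the result, which fits the point that an index exists only where terms are linked to content.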
The trend toward “multi-channel publishing” means that the
same original content may appear in different formats and media, such as print
and online. Online, however, may mean more than just a PDF or other ebook
format of the printed version. Rather, digital text content gets chunked into
units of the size or length that could be indexed as a whole with taxonomy terms, and images
and new multimedia exist as separate assets that can also be indexed with taxonomy
terms. What this means is that a manual,
user guide, or textbook that had a back-of-the-book index in print consists, in the digital or online medium,
of multiple files, one for each section or unit and for each media asset,
which are indexed and thus retrieved by taxonomy terms instead of through the
back-of-the-book index.
Index Entries for Taxonomy Terms?
I have worked on projects where printed content (books,
manuals, etc.) was digitized and put into small chunks or files to be indexed
with a taxonomy, and the original printed volume had a back-of-the-book index.
So, the issue arose: to what extent should the legacy back-of-the-book index be
utilized when developing the new digital retrieval taxonomy? I had access to the index for candidate
taxonomy terms and was encouraged to utilize it.
My conclusion has been that the back-of-the-book index
serves a slightly different purpose for users than does an indexed taxonomy. A
back-of-the-book index serves to locate the page where something was mentioned
on a specific topic. Users of a reference work, however, may at other times consult
the table of contents to navigate and find the relevant sections and
subsections. A taxonomy serves a purpose that combines, or falls somewhere between,
those of a table of contents and a back-of-the-book index. It’s for searching
(as in an index) and also for navigating (as in a table of contents), but it
points to the subsection level (as in a detailed table of contents), not to a
page (as in an index). Also, more content is expected to be linked to a taxonomy
term (a section unit, and often multiple such units) than content indicated by
an index entry (as little as one sentence). So, it would not be right to use
all or most of the main entries of a back-of-the-book index to create a
taxonomy for the same content.