Tuesday, October 6, 2015

Taxonomies and Tables of Contents

A table of contents and a hierarchical taxonomy appear to be quite similar. In my last blog post I looked at taxonomies and indexes, and in the end concluded: “A taxonomy serves a purpose that is both, or something in-between, that of a table of contents and a back-of-the-book index. It’s for searching (like in an index) and also for navigating (like in a table of contents), but it points to the subsection level (as in a detailed table of contents), not to a page (as in an index).” Taxonomies, especially the thesaurus kind, have many similarities to indexes when it comes to looking up a topic. Taxonomies, especially the hierarchical kind, are also similar to a table of contents or the navigation aid to a set of content.

Despite the apparent similarities in hierarchical structure and the purpose of supporting browse navigation, the differences between a table of contents and a hierarchical taxonomy are far greater than the differences between a displayed index and a search-supporting thesaurus.

A table of contents provides navigation, whether for a printed book or large document or for an electronic document or collection. In fact, in an MS Word document with headings, the table of contents that is generated in the left margin pane from those headings is called “Navigation.” Labels in a table of contents or navigation system are arranged like a taxonomy but are not exactly a kind of taxonomy.

Navigation is not a taxonomy

 

Navigation or a table of contents has to perfectly reflect the content that it belongs to. It is completely customized. Two books on the same subject cannot have the same table of contents.  The same taxonomy, however, may be used for more than one content source and typically is. In a table of contents or navigation, each navigation entry, menu label, or heading matches one-to-one to a single, specific section or web page.  Terms in a taxonomy are intended to be used more than once, so each term in a taxonomy is linked to multiple documents or content items.  As such, taxonomy terms need to be somewhat generic, whereas labels or headings in a table of contents or navigation can be specific. Taxonomy terms also need to be created with the anticipation of serving not only current content but also future content, whereas navigation or table of contents entries need only reflect the current content.
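The one-to-one versus one-to-many distinction above can be sketched as two small data structures. This is only an illustration: the headings, term labels, and content identifiers (such as "book-a/ch1") are made up for the example, and real systems store these links in many different ways.

```python
from collections import defaultdict

# Hypothetical table of contents: each heading points to exactly one section.
toc = {
    "Introduction to Identity Theft": "book-a/ch1",
    "Why study statistics?": "book-b/ch1",
}

# Hypothetical taxonomy index: each term may be applied to many content
# items, including items from different sources and items added later.
taxonomy_index = defaultdict(set)


def tag(term, item):
    """Apply a taxonomy term to a content item."""
    taxonomy_index[term].add(item)


def items_for(term):
    """Return every content item tagged with the term, in sorted order."""
    return sorted(taxonomy_index.get(term, set()))


tag("Identity theft", "book-a/ch1")
tag("Identity theft", "book-a/ch7")
tag("Identity theft", "article-42")  # future content reuses the same term
tag("Statistics", "book-b/ch1")

print(items_for("Identity theft"))
```

A heading in the first structure can only ever lead to its own section, while a term in the second keeps accumulating content as more items are tagged.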

Different label wording 

In addition to being more generic, taxonomy terms differ from table of contents entries or navigation labels in other ways.

  • The names of chapters and headings may be longer descriptions (such as “Procedures to Enhance the Accuracy and Integrity of Information Furnished”), whereas taxonomy terms should be concise to aid skimming. A complex topic with a complex heading can be covered with a combination of taxonomy terms instead of a single complex term (such as the combination of terms: Information accuracy, Information integrity, and Information-gathering procedures), because taxonomy terms do not need to match all content one-to-one.
  • The names of chapters and headings might be question phrases (such as “Why study statistics?”), whereas taxonomy terms should be nouns or adjective-noun phrases and start off with a “keyword” likely to be looked up (not “Why”) to support alphabetical lookup options. Even in a hierarchical taxonomy display, a list of terms at the same hierarchical level tends to be arranged alphabetically.
  • Table of contents entries may be context-specific based on the parent/broader level (such as “Identification and General Terms” or “Special Concerns”), and, in fact, the same sub-heading could repeat under different broader headings. In a taxonomy, each term should be independently unambiguous.
  • Tables of contents often start off naming introductory information (such as “Introduction to Identity Theft”) or have sections for Conclusions, neither of which should be terms in a taxonomy. If the same topic is covered three times, in an introduction, body, and conclusions, it will be indexed with the same single taxonomy term, and the end-user will retrieve all indexed results on that topic grouped together.
  • Table of contents or navigation headings can be like titles, which may be “catchy” or enticing to the reader, especially at the top level. Taxonomy terms, by contrast, are clear, concise, and common (based on what most users would call the concept), and not especially creative.

Different structure

 

Tables of contents and taxonomies also differ in their structure. Tables of contents or navigation schemes reflect the organization of content, which may be chronological, pedagogical, from fundamental to detailed, from most important to least important, or the order of perceived user interest. In a taxonomy, the terms at each hierarchical level are arranged alphabetically by default. Navigation has no separate “related terms” relationship, so what appear as subtopics might not be true taxonomic narrower terms, but merely related terms. Taxonomies, on the other hand, must follow the ANSI/NISO Z39.19 guidelines or ISO 25964 with respect to structuring hierarchical relationships: narrower terms must be specific types, instances, or integral parts of their broader terms. By having this standard format, a taxonomy provides organizational predictability for all kinds of users and all kinds of content.
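As a rough sketch of these two structural rules, the snippet below accepts a narrower term only if it is declared as a type, instance, or part of its broader term, and displays sibling terms alphabetically. The class name, the relationship labels, and the geographic sample terms are assumptions chosen for illustration. They are not taken from the standards themselves, and the code cannot check whether a declared relationship is actually valid.

```python
from dataclasses import dataclass, field

# The three hierarchical relationship kinds named in the text above.
ALLOWED_KINDS = {"type", "instance", "part"}


@dataclass
class Term:
    label: str
    narrower: list = field(default_factory=list)  # list of (kind, Term) pairs

    def add_narrower(self, kind, term):
        """Attach a narrower term, rejecting non-hierarchical relationships."""
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"{kind!r} is not a hierarchical relationship")
        self.narrower.append((kind, term))
        return term


def outline(term, depth=0):
    """Return indented display lines, with siblings sorted alphabetically."""
    lines = ["  " * depth + term.label]
    children = sorted(term.narrower, key=lambda pair: pair[1].label.lower())
    for _kind, child in children:
        lines.extend(outline(child, depth + 1))
    return lines


root = Term("North America")
root.add_narrower("part", Term("United States"))
root.add_narrower("part", Term("Canada"))
root.add_narrower("part", Term("Mexico"))

print("\n".join(outline(root)))
```

The terms are entered in arbitrary order but always displayed alphabetically, whereas a table of contents would keep whatever order the content itself follows. A merely related topic would be rejected by add_narrower instead of being placed under a broader term.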

There are certain editorial conventions for content, such as having units of a roughly standard length, which then impact the table of contents or navigation. While there are some variations, one chapter or section is typically not twice as long as another. To achieve balance, a large topic may be spread out over two or more sections, whereas several small topics are grouped together under a heading that is a serial list (such as “Poverty, Inequality, and Mobility”), or under “Other.” Thus, the topics in a table of contents are based on the amount of material presented. Taxonomy structure, on the other hand, looks at the terms/concepts only, and does not take into consideration the amount of content per term. There is one concept per term, not a list. Rare occurrences of two concepts combined into a single term, such as “Author voice and tone,” are the consequence of two topics being very closely related with overlapping meaning and usage.

Conclusions


While a table of contents or navigation system is not a taxonomy, nor should it be used as a taxonomy, when a legacy print source is converted to units of digital content, a table of contents is still an excellent source for creating a taxonomy.