Sunday, February 19, 2017

Avoiding Mistakes in Taxonomy Hierarchical Relationships

Perhaps the most important issue in designing a hierarchical taxonomy is creating hierarchical relationships between terms correctly. This makes the taxonomy intuitively easy to understand and navigate by all kinds of users, regardless of whether they have had any training on using a taxonomy.

The basic principles of the hierarchical relationship are described in the ANSI/NISO Z39.19 and ISO 25964-1 standards for thesauri. As a quick summary, the relationship is created between terms in the following circumstances:

  • a broader term which is generic and a narrower term which is a more specific type of the generic broader term,
  • a broader term which is generic and a narrower term which is a named instance (proper noun) of the generic broader term,
  • a broader term which is a whole entity and a narrower term which is an integral part.
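
To make the three cases concrete, each one can be modeled as a typed link between a broader and a narrower term. The following is only a minimal sketch in Python: the class and function names are hypothetical, and the instance and whole-part example terms (Mountain ranges/Alps, Eyes/Retina) are invented for illustration rather than taken from either standard.

```python
from dataclasses import dataclass
from enum import Enum


class HierarchyType(Enum):
    GENERIC = "generic-specific"   # narrower term is a kind of the broader term
    INSTANCE = "instance"          # narrower term is a named instance (proper noun)
    PARTITIVE = "whole-part"       # narrower term is an integral part of the whole


@dataclass(frozen=True)
class HierarchicalLink:
    broader: str
    narrower: str
    kind: HierarchyType


# Illustrative links only; one per relationship type.
links = [
    HierarchicalLink("Eye diseases", "Glaucoma", HierarchyType.GENERIC),
    HierarchicalLink("Mountain ranges", "Alps", HierarchyType.INSTANCE),
    HierarchicalLink("Eyes", "Retina", HierarchyType.PARTITIVE),
]


def narrower_terms(broader, links):
    """Return (narrower term, relationship type) pairs for one broader term."""
    return [(link.narrower, link.kind) for link in links if link.broader == broader]


for link in links:
    print(f"{link.broader}  NT: {link.narrower}  ({link.kind.value})")
```

Recording the type on the link, rather than on the term, reflects the idea that broader and narrower describe a relationship between terms and not a kind of term.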

The first, generic-specific type is the most common, but it is also the most prone to errors by those not experienced in creating taxonomies. Typical errors include confusing refinements with narrower terms, too closely reflecting the source content hierarchy, and creating narrower terms that are applications, uses, or examples of a broader term.

Confusing Refinements with Narrower Terms

We envision users browsing a hierarchical taxonomy from top down, from broad topic to more specific topic. A more specific topic is a narrower term (NT) of a broader topic. However, instead of providing more specific topics, the creator of a taxonomy might mistakenly provide refinements of the broader topic, which are aspects of the topic, but not actually narrower terms. A term that is an aspect or refinement is not a unique stand-alone term/concept, but rather it is meant to be used in combination with its parent term.

An example of such an erroneous hierarchy would be:

  Eye diseases
     NT: Diagnosis

Diagnosis is an aspect or refinement of Eye diseases (and of other disease-type terms), and not a narrower term. A narrower term would be a specific type of eye disease:

  Eye diseases
     NT: Glaucoma

A refinement term might not be as obvious as it is in the above example. If, however, the same term appears as a narrower term under different broader terms, with a different implied or contextual meaning in each case, this should be a red flag that the duplicated narrower term is really a refinement term. An example is the duplication of the term Waiver in a legal taxonomy:

  Objections to evidence
     NT: Waiver

  Right to jury trial
     NT: Waiver

In this case, the duplicate narrower term should be changed to be specific in each case, such as: Objections to evidence waiver and Right to jury trial waiver.
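
A check for this red flag can be partly automated. The sketch below assumes the taxonomy is available as a list of (broader term, narrower term) pairs; the function names are hypothetical. It only surfaces candidates for review, since a term may legitimately sit under more than one broader term when its meaning is the same in each place. Whether the meaning actually shifts with context remains an editorial judgment.

```python
from collections import defaultdict

# (broader term, narrower term) pairs. The first two follow the Waiver
# example above; the third is a correct link included for contrast.
pairs = [
    ("Objections to evidence", "Waiver"),
    ("Right to jury trial", "Waiver"),
    ("Eye diseases", "Glaucoma"),
]


def find_repeated_narrower_terms(pairs):
    """Map each narrower term with more than one broader term to those broader terms."""
    parents = defaultdict(set)
    for broader, narrower in pairs:
        parents[narrower].add(broader)
    return {term: sorted(bts) for term, bts in parents.items() if len(bts) > 1}


def qualify(broader, narrower):
    """Build a context-specific label, e.g. 'Right to jury trial waiver'."""
    return f"{broader} {narrower.lower()}"


for term, broader_terms in find_repeated_narrower_terms(pairs).items():
    print(f"Review '{term}': appears under {broader_terms}")
    for bt in broader_terms:
        print("  suggested label:", qualify(bt, term))
```

The suggested labels are simple concatenations and would normally be reworded by an editor before being added to the taxonomy.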

Novice taxonomists might create such incorrect broader term-narrower term relationships because they have seen them formed as such elsewhere, such as Library of Congress Subject Headings plus Subdivisions or back-of-the-book index main entries plus subentries. A subheading or a subentry is not the same as a narrower term, because a subheading or a subentry only has usage and meaning in the context of the main heading it is associated with (appears under). A taxonomy narrower term, on the other hand, is not a different kind of term, but is rather a description of a relationship between terms. The meaning of a term in a taxonomy is constant and not dependent on its location in the taxonomy.

Too Closely Reflecting the Source Content Hierarchy

Some taxonomies are based heavily on certain text sources, such as the table of contents of one or a limited number of books or manuals, where the text is structured into units, chapters, main heading sections, subheading sections, etc. It is thus natural to make use of the structure of the text as a basis for the structure of the hierarchy. But the structure of a text often does not translate directly into correct hierarchical relationships.

In the following example of a chapter and its headings from a textbook, greater hierarchical structure is needed for the corresponding taxonomy terms, and one of the topics (Units of Measure) does not belong within this hierarchy.

  Microbiology Laboratory
  --Microbiology Lab Personnel
  --Introduction to the Microscope
  --Units of Measure
  --Types of Microscopes
  --Laboratory Staining Methods
  --Culture Media

These concepts may appear in a taxonomy arranged hierarchically as follows:

  Medical laboratory technology
  NT: Laboratory equipment and supplies
       NT: Culture media
       NT: Microscopes
            NT: Microscope types
  NT: Laboratory personnel
  NT: Microscope use
       NT: Microscopy stains
  NT: Serology

  NT: Measurements and calculations
       NT: Units of measure

Another issue is that, even when the hierarchy from the source is acceptable, the subheading-based terms are often short, generic, and without context. An example is as follows:

  Eye Medications
  --Anti-inflammatory Agents
  --Antiglaucoma Agents
  --Local Anesthetics

The only correct narrower term above is Antiglaucoma Agents, as the other terms are not specific to eye medications. They could be linked as related terms instead.

Applications, Uses, or Example-Type Terms

Relying too much on certain text sources for the taxonomy may also result in erroneously creating narrower terms for the applications, uses, or examples of the broader term concept, because the text presents content that way.

Following are several examples:

  Web Applications
  --Tourism and Travel
  --Higher Education
  --Financial Institutions
  --Software Distribution
  --Health Care

  Decision making issues
  --Ethical conflicts
  --Information sources
  --Intraorganizational conflicts
  --Social influences

  Globalization challenges
  --Cultural differences
  --Economic risk
  --Political risk
  --Managerial limitations

Each of these so-called narrower terms is merely an example within the context of the broader term. All of the "narrower terms" could have other uses beyond the context of the broader term. To make the hierarchy correct, either:
1) the relationship should be changed from narrower term (NT) to related term (RT). This would be the case if these terms can logically exist elsewhere in the taxonomy. Indexing the concepts may then require a pair of terms (such as Globalization challenges AND Economic risk); or
2) the narrower terms should be modified and clarified, such as Cultural challenges to globalization, Economic risk challenges to globalization, Political challenges to globalization, and Managerial challenges to globalization. This would be the case if these terms do not exist elsewhere in the taxonomy.
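
The two options can be sketched as a small repair routine. This is an illustration under stated assumptions, not a prescribed procedure: the function name is hypothetical, the replacement labels must be supplied by an editor, and treating Economic risk as a term that already exists elsewhere in the taxonomy is an assumption made only for the example.

```python
def repair_example_type_links(broader, narrower_terms, terms_used_elsewhere, new_labels):
    """Split questionable NT links into RT links and relabeled NT links.

    terms_used_elsewhere: terms that already exist on their own in the taxonomy.
    new_labels: editor-supplied replacement labels for the remaining terms.
    """
    related, narrower = [], []
    for term in narrower_terms:
        if term in terms_used_elsewhere:
            related.append(term)               # option 1: NT becomes RT
        else:
            narrower.append(new_labels[term])  # option 2: clarified label stays NT
    return {"term": broader, "RT": related, "NT": narrower}


result = repair_example_type_links(
    "Globalization challenges",
    ["Cultural differences", "Economic risk", "Political risk", "Managerial limitations"],
    terms_used_elsewhere={"Economic risk"},  # assumption for illustration only
    new_labels={
        "Cultural differences": "Cultural challenges to globalization",
        "Political risk": "Political challenges to globalization",
        "Managerial limitations": "Managerial challenges to globalization",
    },
)
print(result)
```

In practice an editor might apply a single option to the whole set of terms; the sketch mixes the two only to show both paths.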

In conclusion, hierarchical relationships need to be constructed independent of any sources for terms, and they need to be universal and not subject to certain contexts.