Saturday, June 20, 2020

When a Taxonomy Should not be Hierarchical


The traditional taxonomy is hierarchical. So once it has been determined that a taxonomy is needed, it is often assumed that the taxonomy should be designed as a hierarchy. In practical terms, however, a hierarchical taxonomy is not always the appropriate kind.

A taxonomy provides value (1) as a controlled vocabulary of concepts to support consistent tagging and comprehensive, accurate retrieval of content and (2) by having some organized structure of these concepts to guide users to desired concepts. That structure is traditionally a hierarchy, but increasingly we are seeing a slightly different structure, which is faceted. A faceted structure defines the different aspects, types, issues, dimensions, etc., by which content may be classified, and then organizes the taxonomy concepts (terms) into those facets. Examples of facets could be document type, location, function, department, audience, subject discipline, line of business, etc., as the needs of the content dictate. It is certainly possible to have a hierarchy of concepts within a facet, but with a well-designed set of facets, the addition of hierarchy may no longer be needed.
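As a rough illustration of how facets classify and retrieve content, here is a minimal Python sketch. The facet names come from the examples above, but the terms, the content items, and the helper `filter_items` are all hypothetical, invented only for this example. It assumes the common faceted-search convention of combining selections with OR within a facet and AND across facets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical facets and terms (illustrative only, not from a real taxonomy).
FACETS: Dict[str, Set[str]] = {
    "Document type": {"Policy", "Presentation", "Report"},
    "Department": {"Finance", "Marketing"},
    "Audience": {"Customers", "Employees"},
}


@dataclass
class ContentItem:
    title: str
    # Maps a facet name to the set of terms from that facet tagged on the item.
    tags: Dict[str, Set[str]] = field(default_factory=dict)


def filter_items(items: List[ContentItem],
                 selections: Dict[str, Set[str]]) -> List[ContentItem]:
    """Keep items matching every selected facet.

    Within a facet, any one of the selected terms is enough (OR);
    across facets, all selected facets must match (AND).
    """
    return [
        item for item in items
        if all(item.tags.get(facet, set()) & wanted
               for facet, wanted in selections.items())
    ]


ITEMS = [
    ContentItem("Q2 budget review",
                {"Document type": {"Report"}, "Department": {"Finance"}}),
    ContentItem("Expense policy",
                {"Document type": {"Policy"}, "Department": {"Finance"},
                 "Audience": {"Employees"}}),
    ContentItem("Brand launch deck",
                {"Document type": {"Presentation"}, "Department": {"Marketing"}}),
]

selected = {"Department": {"Finance"}, "Document type": {"Report", "Policy"}}
for item in filter_items(ITEMS, selected):
    print(item.title)
```

Note that each facet here is a flat set of terms. No parent-child relationships are needed for the filtering to work, which is the sense in which well-chosen facets can stand in for a hierarchy.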


Figure: Hierarchical Taxonomy Structure vs. Faceted Taxonomy Structure

Recently I had a small consulting project where I was asked to make recommendations and improvements on a newly created taxonomy, including putting the Topics into a hierarchy. There were only 68 Topics (besides other facets). I made changes that involved over half of the terms, including deletions, additions, name changes, and moving terms from/to the Industries facet, but in the end, there were about the same number of Topic terms. However, although I made significant improvements to the Topics taxonomy, I did not feel it was necessary or practical to put the terms into a hierarchy, even though the client had initially made that request. The small size, the type of display, and the nature of the terms were all reasons not to have a hierarchy.

Following are reasons not to have a hierarchy:
  • The term set in question is not that large and can easily be browsed (even with some scrolling) without a hierarchy to organize it.
  • The hierarchy will not display (or not display well) to the end-users, who might, for example, just have a small scroll box or a type-ahead or auto-complete search on the taxonomy terms.
  • It is not easy or possible according to hierarchical relationship standards to put most terms into a hierarchy. For example, the term set in question is a collection of common tags/keywords/topics that occur in the content but are not necessarily related to each other, so it would be difficult to include all of them in a hierarchy, and the only way to create a logical hierarchy would be to introduce additional broader/category terms which are not practical to use for tagging.
  • Putting only some terms into hierarchical relationships results in a non-intuitive top-level display comprising both specific terms and categories (of narrower terms) at the same level.
  • Your user research indicates that users (including taggers) prefer type-ahead or auto-complete search on the taxonomy terms, rather than drilling down through hierarchies.
When the taxonomy is displayed to the user through a scroll box, and only a limited number of terms, such as 5-10, can be displayed at once, it’s easier for a user to scroll and select terms from a list of 50-60 terms if the terms are in an alphabetical list than if they are in a hierarchy. In fact, hierarchies are not designed to be scrolled but rather to be expanded from the top down in their tree structure. Expanding a hierarchical taxonomy (such as by clicking on plus signs next to terms) might be a feature in the taxonomy management system or in the tagging interface, but it is less common in end-user interfaces. Expandable tree hierarchies might not even be desirable in the end-user interface, since it takes the user more time and effort to find a term that way. Most end-users want to get to the content as quickly as possible rather than spend time exploring a taxonomy.
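To show why a flat, alphabetical term list suits type-ahead or auto-complete search, here is a minimal Python sketch. The topic terms and the function `type_ahead` are hypothetical, and matching on the start of any word in a term is just one reasonable assumption about how such a search might behave, not a description of any particular system.

```python
from typing import List

# Hypothetical flat list of topic terms (illustrative only).
TOPICS = [
    "Supply chain",
    "Cybersecurity",
    "Mergers and acquisitions",
    "Sustainability",
    "Customer experience",
]


def type_ahead(terms: List[str], typed: str, limit: int = 10) -> List[str]:
    """Return up to `limit` terms, alphabetically, in which some word
    starts with the typed text (case-insensitive)."""
    needle = typed.strip().casefold()
    if not needle:
        return []
    matches = [
        term for term in terms
        if any(word.startswith(needle) for word in term.casefold().split())
    ]
    return sorted(matches, key=str.casefold)[:limit]


print(type_ahead(TOPICS, "su"))
print(type_ahead(TOPICS, "c"))
```

Nothing in this lookup depends on broader or narrower relationships, so a hierarchy would add structure to maintain without changing what the user sees.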

A number of content management systems, as well as the SharePoint Managed Metadata Term Store, support the creation of individual term sets or facets and hierarchies within those facets. So, for the less experienced taxonomist, it may seem logical to make full use of a system’s features that support hierarchical taxonomies. Just because a taxonomy can be created as a hierarchy, however, does not mean it always should be. I have seen awkwardly deep hierarchies created by non-taxonomists in content management systems.

Hierarchies should be created if they serve a purpose. Following are some likely purposes for hierarchies:
  • Making it easier for the end user to quickly identify the concept they want for retrieving content.
  • Educating users (such as students) on the hierarchical structure of a subject area.
  • Providing context to terms for manual indexers/taggers so that they apply the correct term. (Such a hierarchy need not be displayed to end-users, though.)
  • Providing the context of a broader concept to aid in auto-classification. 
  • Allowing a term to retrieve not only what was tagged to it, but also what was tagged to each of its narrower terms. (Such a hierarchy need not be displayed to end-users, though.)
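The last purpose in the list above, retrieving content tagged to a term's narrower terms as well as to the term itself, can be sketched in Python as follows. The terms, document identifiers, and the helpers `expand` and `retrieve` are hypothetical and exist only for illustration. The hierarchy is used behind the scenes for retrieval and is never displayed.

```python
from typing import Dict, List, Set

# Hypothetical broader -> narrower relationships (illustrative only).
NARROWER: Dict[str, List[str]] = {
    "Energy": ["Renewable energy", "Oil and gas"],
    "Renewable energy": ["Solar power", "Wind power"],
}

# Hypothetical tagging: term -> IDs of content items tagged with that term.
TAGGED: Dict[str, Set[str]] = {
    "Energy": {"doc1"},
    "Solar power": {"doc2"},
    "Oil and gas": {"doc3"},
    "Wind power": {"doc2", "doc4"},
}


def expand(term: str) -> Set[str]:
    """Return the term plus all of its narrower terms at every level."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against a term reachable by more than one path
        seen.add(current)
        stack.extend(NARROWER.get(current, []))
    return seen


def retrieve(term: str) -> Set[str]:
    """Return content tagged to the term or to any of its narrower terms."""
    docs: Set[str] = set()
    for t in expand(term):
        docs |= TAGGED.get(t, set())
    return docs


print(sorted(retrieve("Renewable energy")))
print(sorted(retrieve("Energy")))
```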
Even if two concepts have an inherently hierarchical relationship according to thesaurus standards (ANSI/NISO Z39.19 or ISO 25964-1), that does not mean they must be put into a hierarchy in a taxonomy if you’ve decided to avoid creating hierarchies, especially if what you are creating is a simple taxonomy and not a thesaurus.