Thursday, August 30, 2018

Taxonomy Hierarchical Relationship Issues

A common feature of taxonomies is the hierarchical relationship between terms. Terms are linked to each other in a relationship that indicates that one is the broader term (BT) of the other, and in the other direction, one is the narrower term (NT) of the other. You don’t need to be a taxonomist to understand this basic principle. However, even taxonomists can sometimes be challenged in determining whether it’s correct to put two terms in a hierarchical relationship.

Standards for Hierarchical Relationships

There are guidelines for the hierarchical relationship provided by the standards of ANSI/NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies and ISO 25964-1: Information and Documentation — Thesauri and Interoperability with other Vocabularies — Part 1: Thesauri for Information Retrieval. The standards say that in a correct hierarchical relationship the term that is narrower to the broader term may be a specific type of the generic broader term, a named instance of the generic broader term, or an integral part of the whole broader term.
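To make the three permitted relationship types concrete, here is a minimal sketch in Python of how a taxonomy tool might record them. The names HierarchyType, Term, Taxonomy, and add_broader are hypothetical and chosen only for illustration; they are not taken from either standard or from any particular taxonomy software. The example pair is borrowed from the job-title case discussed later in this post.

```python
from dataclasses import dataclass, field
from enum import Enum


class HierarchyType(Enum):
    """The three kinds of hierarchical relationship named in the standards."""
    GENERIC = "specific type of"
    INSTANCE = "named instance of"
    WHOLE_PART = "integral part of"


@dataclass
class Term:
    label: str
    # Each dict maps the other term's label to the relationship type.
    broader: dict = field(default_factory=dict)
    narrower: dict = field(default_factory=dict)


class Taxonomy:
    """Hypothetical container that keeps BT and NT links reciprocal."""

    def __init__(self):
        self.terms = {}

    def term(self, label):
        # Return the existing term, or create it on first use.
        return self.terms.setdefault(label, Term(label))

    def add_broader(self, narrower_label, broader_label, kind):
        # Refuse the link unless one of the three types is stated explicitly.
        if not isinstance(kind, HierarchyType):
            raise ValueError(
                f"{narrower_label!r} is not a type, instance, or part of "
                f"{broader_label!r}; consider a related-term link instead"
            )
        # Record both directions so BT and NT always agree.
        self.term(narrower_label).broader[broader_label] = kind
        self.term(broader_label).narrower[narrower_label] = kind


taxonomy = Taxonomy()
# Directors are a kind of manager, so this is a generic relationship.
taxonomy.add_broader("Marketing directors", "Marketing managers",
                     HierarchyType.GENERIC)
```

Requiring the relationship type at the moment the link is created is one possible design choice. It forces the question "is this a type, an instance, or a part?" to be answered for every hierarchical link, which is the same test a taxonomist applies by hand.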

These standards, however, are for thesauri, not taxonomies. Thesauri additionally have a non-hierarchical associative relationship between terms, known as “related term” (RT). In taxonomies that lack related-term relationships, the conditions under which the hierarchical relationship is permitted need not be followed quite as strictly. Nevertheless, the thesaurus standards for creating the hierarchical relationship should be the starting point and the default for hierarchical relationships in taxonomies.

Challenges in Coming up with Broader Terms

Hierarchical taxonomies may be created from the top down, the bottom up, or a combination of both approaches. The top-down approach involves creating the broadest categories first, then adding narrower terms, and then adding narrower terms to those narrower terms. This approach makes it easier to create good hierarchical relationships. In reality, though, we don’t always create terms based purely on their broader terms. Rather, analysis of content yields specific terms that are needed, so some degree of bottom-up taxonomy creation takes place. In the bottom-up approach there may be the challenge of determining and creating the appropriate broader term.

When I have been completely challenged in coming up with a broader term, I admit I have looked up the term in Wikipedia to see what is listed under “Categories” at the bottom of the page. “Categories” implies a broader term, but these are not necessarily good or correct broader terms. An example of Categories that are not exactly broader terms is for the term Stress management: Stress, Management by type, Psychotherapy, and Psychiatric treatments. Stress management is not exclusively done as (is a part of) Psychotherapy or Psychiatric treatments, so those are not suitable broader terms. “Management by type” is definitely not a good taxonomy term, and the term Management alone has a different meaning of its own. As for the term “Stress,” this is more complicated. Technically, Stress management is not a kind of Stress or a part of Stress, so Stress should not be its broader term. If this were in a thesaurus, Stress and Stress management would definitely be related terms. If your controlled vocabulary is not a thesaurus, and the related-term relationship is not supported, then you may ignore the thesaurus rule in this case, and make Stress the broader term of Stress management. This relationship is likely to be expected and accepted by users.

Challenges in Special Circumstances

Even when creating a taxonomy from the top down, taxonomists may encounter challenges or confusion with the hierarchical relationships. One challenging case is the concept of membership. Things and their members could be industries and their companies or international organizations and their member countries. It may seem logical to list the affiliate members “under” the industry or organization of which they are a part, but this is based too much on context and time. Companies can change their industries, and countries can change their international organization affiliation. More significantly, the whole-part hierarchical relationship is about integral parts, not about taking “part” in the sense of participating. Finally, it may be more practical to put each type (companies, industries, countries, organizations) in a separate facet and not establish any relationship between them in a taxonomy (in contrast to a thesaurus or ontology).

Another potentially confusing case involves occupations and job titles. The subordinate nature of narrower terms should not be confused with the subordinate role of one job title to another. Thus, while a marketing specialist reports to a marketing manager, Marketing managers is not a broader term of Marketing specialists. Furthermore, while a marketing manager reports to a marketing director, we might make the hierarchical relationship in the other direction, with Marketing directors as a narrower term of Marketing managers, because directors are a kind of manager. Managers include directors.
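The difference between a reporting line and a broader-term link can be sketched by keeping the two in separate structures. This is only an illustration: the dictionary names and the is_narrower helper are hypothetical, and the entries are just the job titles from the paragraph above.

```python
# Organizational reporting lines: who reports to whom.
reports_to = {
    "Marketing specialists": "Marketing managers",
    "Marketing managers": "Marketing directors",
}

# Taxonomy hierarchy: narrower term -> broader term ("is a kind of").
broader_term = {
    "Marketing directors": "Marketing managers",
}


def is_narrower(term, candidate_broader, hierarchy):
    """Walk up the chain from `term` and report whether
    `candidate_broader` is reached."""
    seen = set()  # guards against accidental cycles in the data
    current = hierarchy.get(term)
    while current is not None and current not in seen:
        if current == candidate_broader:
            return True
        seen.add(current)
        current = hierarchy.get(current)
    return False


# In the taxonomy, directors sit under managers.
print(is_narrower("Marketing directors", "Marketing managers", broader_term))
# In the reporting structure, the direction is reversed.
print(is_narrower("Marketing managers", "Marketing directors", reports_to))
# Specialists report to managers but are not a narrower term of managers.
print(is_narrower("Marketing specialists", "Marketing managers", broader_term))
```

Storing the two relationships separately means an organizational chart can change without the taxonomy's hierarchy having to change with it.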

Perhaps the most confusing case involves specificity which is not taxonomical specificity. For example, Syllabi (plural of syllabus), as instructional outlines, are in a certain sense more specific than Curricula (plural of curriculum), which are also a kind of instructional outline. Syllabi are for individual courses, and curricula are for a series of courses, such as an entire program of study or degree. Thus, it might seem logical that Syllabi would have the broader term of Curricula. But a syllabus is neither a specific type of curriculum, nor is it part of a curriculum. It is something different. So, it would be better not to have Curricula as a broader term of Syllabi, even in a taxonomy that is lacking related-term relationships.

Parent-Child Confusions

Sometimes the hierarchical relationship is referred to as “parent-child.” While it’s correct that a subsidiary company is a narrower term of its parent company, because it is part of the parent company, a biological child is not a narrower term of its parent, because it is not a part of the parent, but rather an offspring. To avoid confusion, it’s better to describe the relationship as broader/narrower, rather than as parent/child.

