Sunday, March 28, 2021

Industry Uses for Taxonomies

It’s always interesting to hear about new and different uses of taxonomies. For example, I recently learned that a company would like to use a taxonomy in a way I had not heard of before: to help its RFP team find content more efficiently when putting together responses to the RFPs (requests for proposals) of prospective clients. Someone inquiring about my course recently wrote: “Typically, I work with creative asset manager... However, I’m fascinated by the use and application of Taxonomies in broader industries.” So, today I will address the use of taxonomies in various industries.

In a broad sense, taxonomies are most often used to support information management and retrieval. With the features of controlled concepts, synonyms, and structure, taxonomies provide better results than full-text search, and they also provide guidance to users.

Taxonomies may be used to support both users within an organization and those who are external, sometimes with the same taxonomy and the same content, but often with different taxonomies and different content, and occasionally with different taxonomies for the same (or subset of the same) content.

The information management use of taxonomies (content lifecycle management, content reuse, data analysis, etc.) is an internal use of taxonomies by employees of an organization with respect to internal content. The information retrieval use of taxonomies pertains to both internal and external users of taxonomies. Internal users who make use of taxonomies for information retrieval do so to better find information within the organization to do their job. External users of taxonomies use them to help find published information or content, products or services, jobs, activities, etc. within a site or online service.

Taxonomies for external users of content and information

Organizations or agencies where information publishing or sharing for external audiences is core to their business or mission have long recognized the importance of thesauri or taxonomies. This began with periodical/journal articles and has since spread to include all kinds of content and data. Examples include the following:

  • News or other online report publishers, library/research database vendors, or other subscription content services.
  • Organizations where public or member information is significant, including government agencies with content-rich websites, international organizations, non-profit organizations, and professional and trade associations.
  • E-commerce and marketplaces, including B2C, B2B, and C2C (marketplaces), and product information sharing between manufacturers, distributors, wholesalers, and retailers.
  • Educational publishers or education technology companies, which provide digital learning.
  • Services selling or distributing digital media, such as movies, music, ebooks, images, animations, video clips, graphics, etc.
  • Job board websites or sites that match consultants/contractors/freelancers to projects.

In addition, with the growth of content marketing, which is the use of web-based content by companies and organizations to attract visitors to their sites, the content of websites has grown immensely, with more pages, blog posts, posted media files, documents to download, etc. To help website visitors find information and content on their sites, taxonomies have now become important for organizations in all kinds of businesses or services.

In all of these cases of publishing content for external users of taxonomies, there continues to be an internal use of taxonomies as well, for managing the content, including content reuse, rights management, retention/lifecycle management, quality management, new content integration, multilingual management, etc.

Taxonomies for internal users of content and information

Over the past decades, the amount of internal digital information in organizations, and the number of applications that contain it, has grown exponentially. Taxonomies can aid in the management and retrieval of information and content items in content management systems, document management systems, digital asset management systems, collaboration spaces, intranets, etc. This applies to all industries, although in some sectors the management of internal information or assets is especially critical.

  • Media, entertainment, advertising, and marketing, as industries or functions that deal with large volumes of digital assets or media (images, videos, audio files) that need to be managed.
  • Highly regulated industries, such as pharmaceuticals, banking and finance, energy, and telecommunications, which need to manage documents, information, and data better to help with regulatory compliance.
  • Manufacturing, technology, engineering, R&D, and related industries which have large volumes of technical documentation, manuals, policies, and procedures that have become digitized.
  • Organizations such as professional service or research-focused firms that have critical content management tasks, such as internally publishing reports, proposals, or presentations which involve a degree of content reuse.

In addition, large companies in any industry now have so much content that taxonomies have become valuable in helping their employees find the information they need quickly, whether on an intranet or other enterprise content management system. Having content in multiple systems could lead to multiple taxonomies, so a centrally managed taxonomy that is kept in sync with multiple types of content systems is a recommended strategy.
 

Saturday, February 6, 2021

Who Should Create Taxonomies?

More and more organizations of various types and sizes are recognizing the benefits of information/content taxonomies, which make it easier to find information more accurately and quickly, to be recommended information, and to formulate complex queries of data.

In many cases, however, where taxonomies are not central to the product/service of a company (such as e-commerce retail or information publishing) or function of an organization (such as research), the task of creating and maintaining a taxonomy is not big enough to justify hiring a professional taxonomist. Creating a taxonomy is a temporary project, and then updating it is often a part-time task, which could even be shared among several people.

Taxonomy creation should not be underestimated, however. It may appear easy to create a taxonomy, but it is not easy to create a good taxonomy. If a taxonomy is not well-designed, it cannot serve its purpose well. You may as well rely on a search engine alone rather than try to use a bad taxonomy.

Not creating the taxonomy yourself

Some approaches to developing a taxonomy without a dedicated taxonomist include using existing taxonomies, creating a taxonomy by term extraction, or hiring a consultant.

Reusing existing taxonomies

To serve its purpose best, a taxonomy should be custom-created to serve its content, users, and system. An existing external taxonomy is usually not adequate. It may be suitable for a limited scope, such as a geographic taxonomy, an industrial classification, a list of organization names, or a list of languages. More information about licensing taxonomies is in my blog post “Taxonomy Licensing.” Even when using an existing taxonomy, there is still work to edit and adapt the external taxonomy, which requires taxonomy expertise.

Creating a taxonomy by automatically extracting terms from content

Software, including some taxonomy management software, such as PoolParty, can extract candidate taxonomy terms from a body of content (documents or web pages) that is intended to be tagged with the taxonomy. This is an effective method to enhance a taxonomy, to add missing concepts and alternative labels (synonyms). However, this is not a practical way to start creating a taxonomy, which requires a logical structure. Taxonomy-creation expertise is still needed.
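
To make the idea of candidate-term extraction concrete, here is a minimal sketch in Python. It is not how PoolParty or any other product works; it simply counts two-word phrases that recur across documents and are not yet taxonomy labels. The function name, the stopword list, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are", "with", "on"}


def candidate_terms(documents, existing_labels, min_count=2):
    """Return (phrase, count) pairs for two-word phrases that occur at least
    min_count times across the documents and are not existing labels."""
    existing = {label.lower() for label in existing_labels}
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(words, words[1:]):
            # Skip phrases that contain a stopword.
            if first in STOPWORDS or second in STOPWORDS:
                continue
            counts[f"{first} {second}"] += 1
    return [
        (phrase, n)
        for phrase, n in counts.most_common()
        if n >= min_count and phrase not in existing
    ]


docs = [
    "Stress management programs reduce workplace stress.",
    "Our stress management guide covers workplace stress and burnout.",
    "Digital asset management supports content reuse.",
]
candidates = candidate_terms(docs, existing_labels=["Workplace stress"])
print(candidates)
```

The output is only a flat list of candidates. Someone still has to decide which ones become concepts, which become alternative labels, and where they belong in the hierarchy.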

Hiring a taxonomy consultant or temporary contractor

This is a good idea. A consultant or contractor will provide a combination of guidance and actual taxonomy building, although a consultant tends to provide more guidance, and a contractor tends to do more taxonomy building. A contractor requires a certain time commitment, such as 3-6 months full-time, whereas there is lots of flexibility in engaging a consultant. After the consultant or contractor is finished, though, someone needs to maintain and update the taxonomy to the same specifications.

When a taxonomy is not very large, it may be more efficient and cost-effective to create it from scratch oneself without reusing an existing taxonomy or relying on a consultant or contractor, although getting a consultant to at least review the taxonomy might still be a good idea.

Taxonomy management as part of a role

What is much more common for an organization than to have a taxonomist is to have one or more positions where taxonomy management is part of the job description. Searches on web job boards return hundreds of job openings with “taxonomy” in the job description, whereas only a small fraction of them have taxonomy or taxonomist in the job title. Common job titles include: Content Designer, Content Manager, Content Strategist, Data Architect, Data Catalog…, Data Strategist, Digital Asset Manager, Digital Content…, Digital Librarian, Information Architect, Information Scientist, Knowledge Engineer, Knowledge Management…, Metadata Specialist, Product Manager, SharePoint Developer, Solutions Architect, etc. There are also positions more centered in marketing and in web development.

Often, though, the need for a taxonomy emerges at a time when a new position is not created, so an existing employee must take on the task. This common scenario is behind the title of my book and this blog, The Accidental Taxonomist. Those that take on taxonomy work may come from a wide variety of roles or departments including marketing for a website taxonomy, IT or human resources for an intranet taxonomy, IT for content/document management systems administration, and technical documentation/publishing. Knowledge management and metadata/data management are also good candidate roles for taxonomy management.

In situations where the taxonomy is used to manage and retrieve content in specialized subject areas, subject matter experts may also be involved in taxonomy creation, at least for the parts of the taxonomy that correspond to their expertise. 

Not having sufficient taxonomy skills

In either case, whether taxonomy management was originally part of the job description or not, people who assume partial taxonomy responsibilities often do not have the skills. This is usually the case when a taxonomy project first arises. Even when someone is newly hired, successful applicants may not meet all job description duties, such as taxonomy experience, especially if the skill is only a minor part of the job.

Related job skills may make it easier to create taxonomies, but without experience or training, one cannot simply create a good taxonomy. Related skills tend to be in the areas of library/information science, indexing, information architecture, digital asset management, content management, records management, and possibly product management.

Librarians tend to have training in cataloging and classification, sometimes in thesaurus creation, and less likely in taxonomy creation. Taxonomies resemble classification schemes, but function differently, so it would be a mistake to model a taxonomy as a classification scheme. See my blog post "Classification Systems vs. Taxonomies." I had taught a continuing education course on taxonomies through a graduate school of library and information science for years, since MLIS graduates had not learned taxonomies as part of their degree program.

Information architects know how to organize information in a web user interface well, so they may have a good sense of how to structure a taxonomy at a high level. However, there are details and nuances of a large taxonomy, such as the development of synonyms/alternative labels, with which they may not have experience. Also, a taxonomy should not be confused with a navigation scheme, as explained in my blog post "Navigation Schemes vs. Taxonomies."

Digital asset managers, content managers, and product managers know about the metadata management for their content, and taxonomies usually fit into the larger metadata scheme. However, their experience with taxonomy creation is usually limited to a subject area and the context and constraints of the system in which they are working. So, the very basic taxonomy skills that they develop may not be transferable to another system or another subject domain.

Subject matter or domain experts, including product managers, often play an important role in taxonomy development. From my experience in working with subject matter experts, though, they often tend to design more of a classification scheme for their domain and create taxonomy concepts that are too granular to be practical for end-user search and retrieval.


Where to learn taxonomy skills

There are many continuing education options to learn taxonomy creation, some through library/information science schools, some through professional associations, and some through commercial conference and training programs. I have been providing taxonomy training since 2007, through online courses, conference workshops, and corporate workshops, both in-person and virtual. I have been impressed with the diversity of backgrounds, job roles, organization types, and global locations of the workshop participants over the years.

The current situation of all-virtual conferences means that I am teaching more virtual workshops than usual this spring, and they are accessible to more people. Following is a list of upcoming live virtual taxonomy workshops, all with interactive participation, and thus with limited enrollment. They vary slightly in their focus and scheduling. All times indicated are Eastern.

"Taxo Update: Latest in Designing & Maintaining Taxonomies"
Monday, March 22, 12:00 - 4:00 pm ET (4 hours)
A preconference workshop of Computers in Libraries with separate registration (no need to register for the entire conference)

"Taxonomy and Metadata Design"
Monday-Tuesday, March 29-30, 10:00am - 2:00pm ET each day (8 hours over two days)
Through Technology Transfer, Rome (with the availability of simultaneous interpretation and slides translated into Italian).

"Connecting Users to Content: An Introduction to Taxonomy Design & Creation"
Wednesday-Friday, April 21-23, 2:00-4:00 pm EDT each day (6 hours over three days)
A preconference workshop of the IAConference with separate registration (no need to register for the entire conference)

Wednesday, January 20, 2021

Hierarchies in Taxonomies, Thesauri, Ontologies, and Beyond

Hierarchies are a defining feature of taxonomies, and they are also characteristic of other controlled vocabularies or knowledge organization systems, such as classification schemes, thesauri, and ontologies. The problem is that the definitions and rules for hierarchies vary depending on the kind of knowledge organization system, so you cannot assume that a hierarchy in one system converts to a hierarchy in another system.

“Hierarchy” can have various types and uses. Not all kinds of hierarchies are reflected even in taxonomies, which tend to be quite flexible. The rules are stricter when it comes to thesauri. Finally, in ontologies, there is only one kind of hierarchy.

The hierarchies permitted in thesauri are specified in the ANSI/NISO Z39.19 and ISO 25964-1 standards, as a reciprocal inverse relationship pair of Broader term (BT) / Narrower term (NT). There are three kinds specified in these standards:

  • Generic-specific – which refers to “is a” or “are a kind of”
        Example:
        Basketball is a kind of sport.
        Basketball BT Sports; Sports NT Basketball
        Basketball has broader concept Sports; Sports has narrower concept Basketball
  • Generic-instance – which refers to “is a named entity instance of”
        Example:
        Michael Jordan is a named basketball player.
        Jordan, Michael BT Basketball players; Basketball players NT Jordan, Michael
        Jordan, Michael has broader concept Basketball players; Basketball players has narrower concept Jordan, Michael
  • Whole-part – which refers to “is in” or “is an integral part or component of”
        (not to be confused with “part” as a participant taking part in, or member of)
        Example:
        Locker rooms are in athletic facilities.
        Locker rooms BT Athletic facilities; Athletic facilities NT Locker rooms
        Locker rooms has broader concept Athletic facilities; Athletic facilities has narrower concept Locker rooms
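
The reciprocal nature of BT/NT can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not a data model prescribed by either standard: the class and method names are hypothetical, and each link is stored once so that the broader and narrower views can never disagree.

```python
from dataclasses import dataclass, field
from enum import Enum


class HierarchyType(Enum):
    GENERIC_SPECIFIC = "is a kind of"
    GENERIC_INSTANCE = "is a named entity instance of"
    WHOLE_PART = "is an integral part or component of"


@dataclass
class Thesaurus:
    # Maps (narrower, broader) to the hierarchy type.
    # One stored link answers both the BT and the NT question.
    links: dict = field(default_factory=dict)

    def add_bt(self, narrower, broader, kind):
        self.links[(narrower, broader)] = kind

    def broader(self, term):
        return sorted(b for (n, b) in self.links if n == term)

    def narrower(self, term):
        return sorted(n for (n, b) in self.links if b == term)


thesaurus = Thesaurus()
thesaurus.add_bt("Basketball", "Sports", HierarchyType.GENERIC_SPECIFIC)
thesaurus.add_bt("Jordan, Michael", "Basketball players", HierarchyType.GENERIC_INSTANCE)
thesaurus.add_bt("Locker rooms", "Athletic facilities", HierarchyType.WHOLE_PART)

print(thesaurus.broader("Basketball"))  # Basketball BT Sports
print(thesaurus.narrower("Sports"))     # Sports NT Basketball
```

Recording the type on each link is optional in many tools, but it becomes useful later if the vocabulary is ever converted to another kind of knowledge organization system.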

The types of hierarchies permitted in taxonomies include all of those designated for thesauri, plus a little more flexibility due to the absence of the associative relationships. In thesauri, if the relationship between a pair of concepts is better described as associative (“Related term” - RT) than hierarchical, then they cannot be hierarchically related. In a taxonomy which lacks associative relationships, in some cases a relationship that is not accepted as hierarchical in a thesaurus may be accepted as hierarchical in a taxonomy. An example is the pair of concepts Stress and Stress management. Technically, the relationship between these two concepts is associative and not hierarchical, because Stress management is not a kind of or a part of Stress. But in a taxonomy (not a thesaurus), designating Stress management as a narrower concept of Stress may be acceptable.

As for classification schemes, despite their name, they do not always conform to class-subclass (as "is a kind of") conventions. For example, in the Dewey Decimal Classification system, 910 Geography & travel comes under 900 History. But geography and travel are not kinds of/sub-categories of history. Classification schemes may have a tendency to force a hierarchy when it’s not really an accepted taxonomic hierarchy.

Despite the looser rules for hierarchies of taxonomies and classification schemes, there are also kinds of hierarchies that are not taxonomic hierarchies. These include organizational chart hierarchies, hierarchies of (military) rank, family tree hierarchies, and the ordering of social sciences concepts, such as Maslow’s hierarchy of needs or Bloom’s Taxonomy of learning objectives. The hierarchies in these cases are not broader/narrower, but rather reflect importance, influence, sequence, or some other aspect of the notion of hierarchical order. In taxonomies and thesauri, concepts in such organizational hierarchies need to be treated instead as siblings at the same level all sharing the same broader concept, such as Learning objectives as the single broader concept for all six of Bloom's learning objectives, Needs as the single broader concept for all five of Maslow's needs, Military ranks as the single broader concept for all ranks, or Job titles as the single broader concept for all job titles.
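
As a small illustration of treating ranked concepts as siblings, the following Python sketch takes an ordered list and places every item directly under one shared broader concept. The function name and the concept labels used for Maslow's needs are my own choices for the example.

```python
def flatten_to_siblings(ordered_concepts, broader_concept):
    """Return (narrower, broader) pairs that put every concept directly
    under the same broader concept, ignoring the rank order."""
    return [(concept, broader_concept) for concept in ordered_concepts]


# Maslow's five needs, listed from the base of the pyramid upward.
maslow = [
    "Physiological needs",
    "Safety needs",
    "Love and belonging needs",
    "Esteem needs",
    "Self-actualization needs",
]

pairs = flatten_to_siblings(maslow, "Needs")
for narrower, broader in pairs:
    print(f"{narrower} BT {broader}")
```

If the original order still matters for display, it could be kept as a separate sort attribute on each concept rather than being expressed through broader/narrower relationships.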

Finally, in ontologies, hierarchies may be of less significance, but they are still a feature. While relations between concepts/entities are “semantic,” with specific descriptive labels, and thus are not necessarily hierarchical, there may be hierarchical relations between classes, when designating subclasses of classes. However, the kind of hierarchical relationship that is created between ontology classes and subclasses is limited strictly to the generic-specific type, for “is a kind of.”

Conclusions

These distinctions in hierarchies have ramifications if you want to combine, import, or convert one knowledge organization system to another. When converting a thesaurus to a taxonomy, it is possible that some of the associative relationships could be accepted as hierarchical. When converting a taxonomy to a thesaurus, existing hierarchical relationships should be reviewed to see if any should be converted to associative.

Converting a taxonomy or thesaurus to an ontology would require identifying and removing whole-part hierarchical relationships (and adding new broader concept relations to the orphaned concepts) and converting generic-instance hierarchical relationships to class-individual relationships rather than class-subclass. In fact, this may involve so much effort, which cannot be automated, that the better approach to converting a taxonomy to an ontology is probably to apply a more generic ontology as a layer to the existing taxonomy/thesaurus, which some software tools, such as PoolParty, support. Extending a taxonomy into an ontology is the subject of my next conference presentation “Ontology Design by Enriching Taxonomies” at the Data-Centric Architecture Forum on February 3.
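
The conversion rules just described can be sketched in Python, with one big assumption: that someone has already reviewed every hierarchical relationship by hand and labeled it with its type. That labeling is the part that cannot be automated. The function name, the type labels, and the output keys below are hypothetical and are not taken from any tool.

```python
def to_ontology(links):
    """links: list of (narrower, broader, kind) tuples, where kind is
    'generic-specific', 'generic-instance', or 'whole-part'."""
    subclass_of = []   # class-subclass relations kept as hierarchy
    instance_of = []   # class-individual relations
    dropped = []       # whole-part links removed from the hierarchy
    for narrower, broader, kind in links:
        if kind == "generic-specific":
            subclass_of.append((narrower, broader))
        elif kind == "generic-instance":
            instance_of.append((narrower, broader))
        elif kind == "whole-part":
            dropped.append((narrower, broader))
        else:
            raise ValueError(f"Unknown hierarchy type: {kind}")
    # A concept is orphaned if its only broader links were whole-part.
    placed = {n for n, _ in subclass_of} | {n for n, _ in instance_of}
    orphans = sorted({n for n, _ in dropped} - placed)
    return {
        "subclass_of": subclass_of,
        "instance_of": instance_of,
        "orphans": orphans,
    }


links = [
    ("Basketball", "Sports", "generic-specific"),
    ("Jordan, Michael", "Basketball players", "generic-instance"),
    ("Locker rooms", "Athletic facilities", "whole-part"),
]
result = to_ontology(links)
print(result["orphans"])  # concepts that still need a new broader concept
```

The orphans list is only a to-do list for a human editor. Deciding where each orphaned concept belongs, or whether its whole-part link should become a custom semantic relation instead, remains a manual modeling decision.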