Sunday, July 19, 2020

How Many Facets Should a Taxonomy Have

I’ve given a rule-of-thumb of 3-8 facets to create in a faceted taxonomy, but it’s not that simple, and there are various factors to consider. Creating facets is an assignment in the online taxonomy course I teach, and a student recently submitted a good set of facets with sample terms, but there were 12 of them. So, why might that be too many facets?

Schematic diagram of a set of four facets.

Consider the users.

Are the users internal trained employees who deal with content, most or all of the employees of an organization, external but repeat users such as partners or researchers, or the general public? Internal employees, especially those who are content managers or digital asset managers, who receive some training to become familiar with the facets, should be able to handle any number of facets. It is their job to classify and/or retrieve content by their facets, so they should have the time and inclination to go through a long list of facets. A broader cross-section of employees or external repeat users may have access to documentation but not read it, will likely not be trained, and are often more rushed when they deal with content, so a shorter list of facets would be more suitable. Finally, the general public is likely to use only facets that are easy to understand and fit into the window display (not requiring scrolling), so a relatively short list of facets is recommended for them.

Consider the content.

In addition to considering the shared attributes of the content, as you cannot create more facets than conceptually exist for the content, you need to consider the volume of the content. A relatively small collection of content items or assets does not need as many facets to serve as filters as a larger collection does. If users select a term from each facet, they should not be getting zero results or just one or two items too often. Remember, the main use of facets is to filter and limit results down to a list that can then be easily browsed. If the user retrieves only one or two results, however, they will likely consider the search too narrow and try again to broaden it.
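As a rough illustration of the volume point above, the following Python sketch counts how many items each combination of facet terms would return and reports the share of combinations that come back with very few results. The sample items, facet names, function names, and the threshold of two are all assumptions made up for the example, not taken from any real collection or system.

```python
from itertools import product

# Hypothetical sample: each content item is tagged with one term per facet.
items = [
    {"Document Type": "Report", "Audience": "Students"},
    {"Document Type": "Report", "Audience": "Teachers"},
    {"Document Type": "Lesson plan", "Audience": "Teachers"},
    {"Document Type": "Lesson plan", "Audience": "Teachers"},
]


def combination_counts(items, facets):
    """Count the items matching each possible combination of facet terms."""
    terms = {f: sorted({item[f] for item in items}) for f in facets}
    counts = {}
    for combo in product(*(terms[f] for f in facets)):
        counts[combo] = sum(
            all(item[f] == t for f, t in zip(facets, combo)) for item in items
        )
    return counts


def sparse_share(counts, threshold=2):
    """Share of combinations that return `threshold` or fewer items."""
    sparse = [c for c in counts.values() if c <= threshold]
    return len(sparse) / len(counts)


counts = combination_counts(items, ["Document Type", "Audience"])
print(counts)
print(sparse_share(counts))
```

With only four items, every combination is sparse, which is the situation a small collection with too many facets tends to produce.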

Microbial Life Educational Resources facets are just enough to fill the length of a computer monitor display.

Consider the user interface.

Sometimes the taxonomy has influence over the user interface design, such as when it’s an internally designed research portal, but often content management systems do not offer much flexibility in how facets are displayed. The first thing to consider is how many facets will be displayed by default in the initial screen view (without scrolling) on the most commonly used devices. If facets can be collapsed to show only the facet names and not any values/terms within them, then a greater number of facets can more easily be included. Hiding the values, however, might not be desired, since the display of sample values makes it clear to the user what the facets are for.

Consider what constitutes a facet or filter.

What may be considered a “faceted taxonomy” is only a subset of all the possible metadata properties of the content. Some of the other, default non-taxonomy metadata (such as date, creator, file name or title, or file type) may also be desired as end-user filters alongside the taxonomy facets, which then further increases the number of filters or refinements displayed to the user, who sees no difference between taxonomy facets and non-taxonomy filters.

There is no strict definition of a “taxonomy facet.” I would say it is a facet whose values or terms must be created by a person, such as a taxonomist or metadata architect, rather than those that are system-generated. In addition, taxonomy terms are those that must be tagged to the content, rather than already being a part of content. For example, if File Format is based on the file extension, then it is already part of the content and need not be “tagged,” so it’s not a taxonomy facet by my definition.

A faceted taxonomy is more, though, than a single facet of topics alongside other non-taxonomy metadata. The idea behind creating a faceted taxonomy is to split up what could be a large hierarchical taxonomy into different aspects.  For example, instead of having a term Business service agreements that is in a hierarchy narrower to both Vendor contracts and Business services, you could have just the term Vendor contracts in the Document Type facet and Business services in the Business Type facet, and the combination of the terms from each facet will suffice.
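The combination of terms from separate facets described above can be sketched in a few lines of Python. This is only a minimal illustration: the sample documents, the extra terms "Retail" and "Reports", and the function name `filter_by_facets` are hypothetical, and a real system would do this filtering in its search engine rather than in a list.

```python
# Hypothetical documents, each tagged with one term from each of two facets.
documents = [
    {"title": "Cleaning services agreement",
     "Document Type": "Vendor contracts",
     "Business Type": "Business services"},
    {"title": "Office supplies contract",
     "Document Type": "Vendor contracts",
     "Business Type": "Retail"},
    {"title": "Consulting market overview",
     "Document Type": "Reports",
     "Business Type": "Business services"},
]


def filter_by_facets(documents, selected):
    """Return the documents that match every selected facet term (logical AND)."""
    return [
        doc for doc in documents
        if all(doc.get(facet) == term for facet, term in selected.items())
    ]


matches = filter_by_facets(
    documents,
    {"Document Type": "Vendor contracts", "Business Type": "Business services"},
)
titles = [doc["title"] for doc in matches]
print(titles)
```

Selecting one term in each facet narrows the results to what a pre-combined term such as Business service agreements would have retrieved, without that term needing to exist.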

Faceted taxonomies, more so than hierarchical taxonomies or thesauri, need to consider the factors of users, content, and user interface when it comes to their design.

Saturday, June 20, 2020

When a Taxonomy Should not be Hierarchical

The traditional taxonomy is hierarchical. Thus, after it is determined a taxonomy is needed, often it is thought that it should be designed as a hierarchy. However, in practical terms, a hierarchical taxonomy might not be the kind that is appropriate.

A taxonomy provides value (1) as a controlled vocabulary of concepts to support consistent tagging and comprehensive, accurate retrieval of content and (2) by having some organized structure of these concepts to guide users to desired concepts. That structure is traditionally a hierarchy, but increasingly we are seeing a slightly different structure, which is faceted. Facets define different aspects, types, issues, dimensions, etc., by which content may be classified, and the taxonomy concepts (terms) are then organized into those facets. Examples of facets could be document type, location, function, department, audience, subject discipline, line of business, etc., as the needs of the content dictate. It is certainly possible to have a hierarchy of concepts within a facet, but with a well-designed set of facets, the addition of hierarchy may no longer be needed.

Hierarchical Taxonomy Structure vs. Faceted Taxonomy Structure

Recently I had a small consulting project where I was asked to make recommendations and improvements on a newly created taxonomy, including putting the Topics into a hierarchy. There were only 68 Topics (besides other facets). I made changes that involved over half of the terms, including deletions, additions, name changes, and moving terms from/to the Industries facet, but in the end, there were about the same number of Topic terms. However, although I made significant improvements to the Topics taxonomy, I did not feel it was needed or practical to put the terms into a hierarchy, even though the client had initially made that request. The small size, the type of display, and the nature of the terms were all reasons not to have a hierarchy.

Following are reasons not to have a hierarchy:
  • The term set in question is not that large and can easily be browsed (even with some scrolling) without a hierarchy to organize it.
  • The hierarchy will not display (or not display well) to the end-users, who might, for example, just have a small scroll box or a type-ahead or auto-complete search on the taxonomy terms.
  • It is not easy or possible according to hierarchical relationship standards to put most terms into a hierarchy. For example, the term set in question is a collection of common tags/keywords/topics that occur in the content but are not necessarily related to each other, so it would be difficult to include all of them in a hierarchy, and the only way to create a logical hierarchy would be to introduce additional broader/category terms which are not practical to use for tagging.
  • Putting only some terms into hierarchical relationships results in a non-intuitive top-level display comprising both specific terms and categories (of narrower terms) at the same level.
  • Your user research indicates that users (including taggers) prefer type-ahead or auto-complete search on the taxonomy terms, rather than drilling down through hierarchies.

When the taxonomy is displayed to the user through a scroll box, and only a limited number of terms, such as 5-10, may be displayed at once in the scroll box display, it’s easier for a user to scroll and select terms from a list of 50-60 terms if the terms are in an alphabetical list rather than in a hierarchy. Actually, hierarchies are not designed to be scrolled but rather to be expanded from the top down in their tree structure. Expanding a hierarchical taxonomy (such as by clicking on plus signs next to terms) might be a feature in the taxonomy management system or in the tagging interface, but it is less common in end-user interfaces. Expandable tree hierarchies might not even be desirable in the end-user interface, since it takes the user more time and effort to find a term that way. Most end-users want to get to the content as quickly as possible rather than spend time exploring a taxonomy.
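To show why a flat alphabetical term list suits type-ahead search, here is a minimal Python sketch. It assumes simple case-insensitive prefix matching on the start of each term; actual tools may also match on later words within a term. The term list, the function name `type_ahead`, and the default limit of 10 are made up for illustration.

```python
# Hypothetical flat list of Topic terms (no hierarchy).
terms = ["Budgeting", "Analytics", "Compliance", "Advertising", "Accounting"]


def type_ahead(prefix, terms, limit=10):
    """Return up to `limit` terms starting with `prefix`, in alphabetical order."""
    p = prefix.casefold()
    matches = [
        term for term in sorted(terms, key=str.casefold)
        if term.casefold().startswith(p)
    ]
    return matches[:limit]


print(type_ahead("a", terms))
print(type_ahead("AD", terms))
```

No broader or narrower relationships are needed for this kind of lookup, which is one reason a small term set can work well without a hierarchy.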

A number of content management systems and the SharePoint Managed Metadata Term Store support the creation of individual term sets or facets and hierarchies within those facets. So, for the less experienced taxonomist, it may seem logical to make full use of a system’s features to support hierarchical taxonomies. Just because a taxonomy can be created as a hierarchy, however, does not mean it always should be created as a hierarchy. I have seen awkwardly deep hierarchies created by non-taxonomists in content management systems.

Hierarchies should be created if they serve a purpose. Following are some likely purposes for hierarchies:
  • Making it easier for the end user to quickly identify the concept they want for retrieving content.
  • Educating users (such as students) on the hierarchical structure of a subject area.
  • Providing context to terms for manual indexers/taggers so that they apply the correct term. (Such a hierarchy need not be displayed to end-users, though.)
  • Allowing a term to retrieve not only what was tagged to it, but also what was tagged to each of its narrower terms. (Such a hierarchy need not be displayed to end-users, though.)

Even if a pair of concepts has an inherently hierarchical relationship to each other, according to thesaurus standards (ANSI/NISO Z39.19 or ISO 25964-1), it does not mean that they must be put into a hierarchy in a taxonomy, if you’ve decided to avoid creating hierarchies and especially if what you are creating is a simple taxonomy and not a thesaurus.
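The last purpose in the list above, where a term retrieves what was tagged to it and to each of its narrower terms, can be sketched in Python as follows. The term relationships, content IDs, and function names are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```python
# Hypothetical broader-to-narrower term relationships.
narrower = {
    "Contracts": ["Vendor contracts", "Employment contracts"],
    "Vendor contracts": ["Business service agreements"],
}

# Hypothetical tagging: each term maps to the IDs of content tagged with it.
tagged = {
    "Contracts": {"doc1"},
    "Vendor contracts": {"doc2"},
    "Business service agreements": {"doc3"},
    "Employment contracts": {"doc4"},
}


def expand(term, narrower):
    """Return the term followed by all of its narrower terms at every level."""
    result = [term]
    for child in narrower.get(term, []):
        result.extend(expand(child, narrower))
    return result


def retrieve(term, narrower, tagged):
    """Collect content tagged with the term or with any of its narrower terms."""
    ids = set()
    for t in expand(term, narrower):
        ids |= tagged.get(t, set())
    return ids


print(sorted(retrieve("Vendor contracts", narrower, tagged)))
print(sorted(retrieve("Contracts", narrower, tagged)))
```

The expansion happens behind the scenes at search time, so the hierarchy can support retrieval without ever being displayed to end-users.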

Wednesday, May 20, 2020

Navigation Schemes vs. Taxonomies

A navigation scheme for a website/intranet and a taxonomy are similar, but they are not the same. I had taken an interest in website information architecture around 16 years ago, at about the same time I became familiar with the term “taxonomy” (although I had already been working for years as a “controlled vocabulary editor”), so I naturally related website navigation and taxonomy. In an earlier version of the online course I teach on taxonomies, I had even presented website navigation schemes as examples of taxonomies. However, I also recall hearing early on in conference presentations by the consultant Seth Earley that a navigation is not a taxonomy. After more years of experience with taxonomies, I came to recognize that as true. Considering a website navigation structure as an example of a taxonomy is an oversimplification and could lead to poor taxonomy design.

I looked more closely into the comparison of website navigation and taxonomies in preparation to present at World IA Day Boston on February 22 (presentation slides PDF, presentation video). IA stands for information architecture. So, now I will continue with that line of observations here. This topic also follows another example of what a taxonomy is not. It is not a classification scheme, which I also discussed in a recent blog post, “Classification Systems vs. Taxonomies.”

A navigation scheme is typically presented as a set of menus and submenus and possibly also a supplemental site map, although the latter has become far less common on websites than it used to be. The navigation scheme of a website, intranet, or portal reflects the structure of the content, which has been designed in a way to serve various sets of users (defined by generic “personas”) with various common kinds of tasks, such as finding people, reports, events, office locations, financial data, etc., submitting requests, providing feedback, and placing orders, among others.

The one area where navigation and taxonomy may overlap is in a website where the content is entirely publication-like articles or documents. In this case, the site navigation is just for finding articles or documents based on their subject matter, so a topical taxonomy for indexing and retrieving documents may appear as the navigation menu for the site.  This is the case for news media sites, for example.

With a background in indexing, I like to compare the index of a book with the taxonomy-enhanced search capabilities of a website, whereas the table of contents of a book is like the navigation scheme. A table of contents or navigation scheme is a higher-level, pre-defined structure of content that guides users to the general organization of content and tasks. It helps users understand the scope of the content available, provides guidance on where and what content to find, and aids in exploration. An index or search feature, including faceted search, on the other hand, enables the user to find specific information or content items of interest. A taxonomy, regardless of its display type, serves the function of an index, not the table of contents. I also compared taxonomies with tables of contents in a blog post several years ago, “Taxonomies and Tables of Contents.”

Even when a taxonomy is hierarchical, it differs from a navigation scheme or table of contents, because it is an arrangement of terms/topics/concepts. By contrast, the navigation (or table of contents) is an arrangement of named content (named pages/sections, etc.). This is key. Terms, topics, or concepts (the distinctions between which are beyond the scope of this discussion), while reflecting the content of the website or intranet, are somewhat generic, can apply to any content in the site, and have meanings that should be understood independent of their location in the taxonomy hierarchy. Think of the tagging aspect of taxonomies. Any hierarchy that the taxonomy terms are arranged in reflects the meanings of the terms and the relationships of the terms to each other. It does not reflect the arrangement of the content. Navigation menu labels, on the other hand, are short descriptions of pages or sections (with landing pages), which they match one-to-one, and the hierarchy of the menu reflects the hierarchical structure of the content.

The following table lists the various differences between navigation schemes and taxonomies.

Navigation schemes | Taxonomies
Single-site use and implementation | May be re-used in multiple implementations
Reflect the site-map structure | Reflect organic relations of the topics/concepts
Labels based on page titles | Labels based on concepts/topics
Designed to be browsed hierarchically, top-down | Designed to be browsed, searched, or may not be fully displayed to users
2-3 level hierarchy limit | Options for deeper hierarchy and/or facets
One-to-one label-to-page | One-to-many label to multiple pages
Do not include or link to all pages | Cover all pages or content
Limited in size | Can be small or large
Biased to emphasize what is important | Neutral to topic importance
Not so flexible for updating | Can grow and adapt without limits
Have paths and links, not metadata | Concepts are often metadata

What may be confusing is if we think of taxonomies purely as hierarchical structures and thus equate them with navigation schemes, which are also hierarchical. The feature of being hierarchical does not make something a taxonomy, as I explained in “Classification Systems vs. Taxonomies.” Although a taxonomy may be hierarchical, there are other kinds and displays of taxonomies. Taxonomies may be fully displayed for browsing as hierarchical or alphabetical, displayed in excerpts in facets, displayed as short lists of terms in type-ahead or search-suggest features, or not displayed at all as a search thesaurus (also called a synonym ring). The idea that taxonomies do not have to be hierarchical will be the topic of my next blog post.
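The search thesaurus (synonym ring) mentioned above is the kind of taxonomy that is never displayed, and a small Python sketch may make that concrete. It assumes that each ring is simply a set of equivalent terms with no preferred term; the sample rings and the function name `expand_query` are made up for the example.

```python
# Hypothetical synonym rings: each set holds terms treated as equivalent.
synonym_rings = [
    {"car", "automobile", "auto"},
    {"physician", "doctor"},
]


def expand_query(query, rings):
    """Return all terms equivalent to the query, or just the query if none match."""
    q = query.strip().lower()
    for ring in rings:
        if q in ring:
            return sorted(ring)
    return [q]


print(expand_query("Doctor", synonym_rings))
print(expand_query("bicycle", synonym_rings))
```

The user types a single word and never sees the ring; the search simply runs on all of the equivalent terms.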

Tuesday, April 28, 2020

Remote Taxonomy Work

Taxonomy design and development work can be done remotely. In fact, I’ve been doing taxonomy work remotely for the majority of my 24 years in the field, both as an employee and an independent consultant/contractor/freelancer. Now that information, knowledge, and content professionals are currently all working from home, this would be a good time to share observations on the needs of taxonomy work.

Depending on the nature of the taxonomy work (maintenance, revision, or a new taxonomy project), it can either easily be done completely remotely or mostly remotely with ideally some in-person time.

A staff taxonomist who is responsible for updating and maintaining a taxonomy, if experienced and requiring no training, can easily work 100% remotely. My first role in taxonomy (a controlled vocabulary editor for what was then Information Access Company, before it was merged into Gale) allowed me to go fully remote back in 1998 when I relocated across the country to be near family. I rejoined that group a decade later, and when I left again last July, the majority of the vocabulary editors were working remotely (partly due to an office closure). As experienced professionals, we could all work rather independently, and while contact in the office is nice, it was certainly not required. We edited the controlled vocabularies in a multi-user web-based taxonomy management tool, each of us with different areas of responsibility. Our team meetings were on Zoom, and we were encouraged to use video, so we stayed well connected with our remote teammates.

Developing a new taxonomy, either in-house or as a consultant, does benefit from some in-person time, but this can be for only a small part of the time, so the taxonomist can be remote and travel occasionally. I had worked full-time remotely as a taxonomy consultant for a consulting company, Project Performance Corporation, based in McLean, VA, out of my home in Massachusetts. There was not really any need to be in the office, and I only went there for a few days of my initial orientation. For a consulting company, what is more important is some time on the client sites: interviewing stakeholders, leading group interactive workshops and discussions, presenting recommendations and getting input, and perhaps leading taxonomy testing. Client visits would average 3 days/nights every 2-3 months, with perhaps a total of 3 visits for a multi-month project. It was on these client visits that I got to see my teammates on the same project. This pattern of working from home with occasional onsite client visits was the same when I contracted to other consultancies.

In another job, where I was responsible for the SharePoint taxonomy, I worked most days in the office, but my manager and teammates were in another office of the company across the country, so I was remote in another way.

In my own independent consulting, the larger projects also involve a couple of onsite visits, whether in my local city, Boston, across the country, or even in Europe, as was the case for a recent client. However, I have also taken on small projects with small budgets (i.e., no travel budget), especially for taxonomy review and revision projects, that have easily been done 100% remotely. If necessary, I have interviewed stakeholders remotely and led taxonomy testing activities remotely. Under current circumstances, even larger consulting projects can and will involve such work being done completely remotely. While not as ideal, it is certainly feasible. A good rapport with the client is important.

In between the roles of employee and consultant is that of contractors, who are full-time but temporary. The nature of much taxonomy work, as short-term intensive projects, lends itself to contracting. Contracting is traditionally done fully onsite, but for the experienced taxonomist it can also be done remotely, and I have done this, too. The option for remote work is important, because taxonomy design and development is such a specialized skill that a hiring company often cannot find a qualified and available taxonomist locally and does not want to pay for regular (i.e., weekly) travel and accommodations.

Matching qualified taxonomists to jobs in the right location has always been an issue, even for regular full-time work. As it becomes apparent that experienced taxonomists can do their jobs completely remotely, hopefully more employers will seek out taxonomist talent regardless of location and let even newly hired taxonomists work completely from home. Nevertheless, when travel is safe again, the occasional in-person office visit would be welcome.