The Accidental Taxonomist: Classification schemes

Saturday, February 24, 2024

Faceted Classification and Faceted Taxonomies

I have argued before that a taxonomy is not the same as a classification system, despite the original meaning of the word taxonomy as a system for classification. (See the blog post Classification Systems vs. Taxonomies.) Modern taxonomies that are used to support information management and findability are more similar to information retrieval thesauri and subject heading schemes than they are to classification systems. Another type of classification, the method of “faceted classification,” however, does apply to some types of taxonomies. I would not, however, consider “faceted classification” an exact synonym for “faceted taxonomy,” as not all faceted taxonomies are the same.

What is faceted classification?

Facets for jobs

Facet means face, side, dimension, or aspect. In this sense, facets are aspects of classification. A diamond, an object, or a digital content item is multi-faceted. A digital content item (text document, presentation, image, video, etc.) has multiple informational dimensions or aspects to it and thus multiple ways to be classified.

Classification is about putting an item, such as a content item (document, page, or digital asset), into a class or category. If it’s a physical object (a book), it goes on the shelf for its class. In faceted classification, an item cannot physically be in more than one place, but it can still be “assigned to” more than one class. So, while the book itself can be on only one shelf, the record about the book can be assigned to more than one class.

Faceted classification assigns classes/categories/terms/concepts from each of multiple facets to a content item, allowing users to find the item by choosing concepts from whichever facet they consider first. Different users will consider different classification facets first. Users then narrow the search results by selecting concepts from additional facets in any order they wish, until they get a targeted result set meeting the criteria of multiple facet selections. The user interface of faceted classification is sometimes referred to as faceted browsing.
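As a rough illustration of this narrowing behavior, here is a minimal Python sketch. The item IDs, facet names, concepts, and the functions `narrow` and `facet_counts` are all hypothetical, invented for this example; real faceted search is normally handled by a search engine or content management system rather than by code like this.

```python
from typing import Dict, List, Set

# Hypothetical records: each content item is assigned concepts from several facets.
ITEMS: Dict[str, Dict[str, Set[str]]] = {
    "doc-1": {"Content type": {"Report"}, "Subject": {"Taxonomies"}, "Region": {"Europe"}},
    "doc-2": {"Content type": {"Video"}, "Subject": {"Taxonomies"}, "Region": {"Asia"}},
    "doc-3": {"Content type": {"Report"}, "Subject": {"Ontologies"}, "Region": {"Europe", "Asia"}},
}


def narrow(items: Dict[str, Dict[str, Set[str]]], selections: Dict[str, str]) -> List[str]:
    """Return the IDs of items that match every selected facet/concept pair."""
    return sorted(
        item_id
        for item_id, facets in items.items()
        if all(concept in facets.get(facet, set()) for facet, concept in selections.items())
    )


def facet_counts(items: Dict[str, Dict[str, Set[str]]], item_ids: List[str], facet: str) -> Dict[str, int]:
    """Count how many of the given items carry each concept of one facet."""
    counts: Dict[str, int] = {}
    for item_id in item_ids:
        for concept in items[item_id].get(facet, set()):
            counts[concept] = counts.get(concept, 0) + 1
    return counts


# One user starts with content type, then adds a region.
by_type = narrow(ITEMS, {"Content type": "Report"})
by_type_then_region = narrow(ITEMS, {"Content type": "Report", "Region": "Asia"})

# Another user starts with region; the final result set is the same.
by_region_then_type = narrow(ITEMS, {"Region": "Asia", "Content type": "Report"})
```

The counts returned by `facet_counts` correspond to the numbers often shown next to each concept in a faceted browsing interface, indicating how many results remain if that concept is selected.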

History of faceted classification

The idea of faceted classification as a superior alternative to traditional hierarchical classification, whereby an item (such as a book or article) can be classified in multiple different ways instead of in just a single classification class/category, is not new. The first such faceted classification was developed and published by mathematician/librarian S.R. Ranganathan in 1933, as an alternative to the Dewey Decimal System for classifying books, called Colon Classification (since the colon punctuation was originally used to separate the multiple facets). In addition to subject categories, it has the following facets:

Personality – topic or orientation
Matter – things or materials
Energy – actions
Space – places or locations
Time – times or time periods

Although it was not adopted widely internationally due to its complexities in the pre-digital era, colon classification has been used by libraries in India.

In the late 20th century, digital library research systems based on databases enabled faceted classification and search, with different fields of a database record represented in different search facets. Users interacted with these facets through an “advanced search” form of multiple fields. Faceted classification and browsing gained widespread adoption with the advancement of interactive user interfaces on websites and in web applications in the late 1990s and early 2000s. Thus, facets started being displayed in more user-friendly ways that were no longer “advanced.”

Structure of facets

It’s not necessary to follow Ranganathan’s suggested five facets, but that’s a good way to get thinking about faceted classification. Another way to look at faceted classification is to consider a facet for each of various question words: What, Who, Where, and When.

What kind of thing is it – content type
What is it primarily about – subject
Who is it for or concerns – audience or user group
Where is it for/applicable, or where it depicts (media) – geographic region
When it is about – event or season (not date of creation, which is administrative metadata, instead of a taxonomy concept)

The additional question words of “why” and “how” are relevant in some cases, but less common. An individual content item typically does not address all of these questions, but usually addresses more than one. When creating facets, most of the facet types should be applicable to most of the content types.

Another good way to think about faceted classification is to put the word “by” after each facet, to suggest classification and filtering “by” the aspect type. A logical and practical number of facets tends to be in the range of three to seven.

A standard feature of facets is that they are mutually exclusive. A concept/type belongs to only one facet. This is typical practice for the design of classification systems. The difference is that in faceted classification it is merely the concept/type/term that belongs to just one facet, not the content item or thing itself that would belong to only one classification in traditional classification systems.
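One way to picture this rule is a quick check that no concept appears in more than one facet. The following Python sketch is only an illustration: the facet names, the concepts, and the function `shared_concepts` are hypothetical, and taxonomy management tools may provide their own reports for this.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical facets and concepts; "Training" has been placed in two facets.
FACETS: Dict[str, Set[str]] = {
    "Content type": {"Report", "Training"},
    "Subject": {"Budgeting", "Training"},
    "Audience": {"Managers", "Engineers"},
}


def shared_concepts(facets: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each concept found in more than one facet to the facets containing it."""
    membership: Dict[str, List[str]] = defaultdict(list)
    for facet_name, concepts in facets.items():
        for concept in concepts:
            membership[concept].append(facet_name)
    return {
        concept: sorted(names)
        for concept, names in membership.items()
        if len(names) > 1
    }


violations = shared_concepts(FACETS)
```

An empty result means the facets are mutually exclusive. Note that the check applies to concepts only: a single content item tagged with both Report and Budgeting does not violate anything.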

When a faceted taxonomy is not for classification

The design, implementation and use of facets to construct or refine searches has become so popular that it is no longer used just for classification aspects. Rather, a faceted taxonomy design may be used for any faceted grouping of concepts for search or metadata types that are relevant for the content and users.

Faceted classification is intended to classify things that share all the same facets. For example, all technical documentation content has a product, feature, issue, and content type, so these are faceted classifications. But with more heterogeneous content, facets are not universally shared. While the facets may still be a useful tool, it would be best not to call it faceted classification when facets are applicable to only some content types.

While faceted classification tends to be quite limited in the number of its facets, non-classification faceted taxonomies, whether based on subject types or separate controlled vocabularies, could result in a rather large number of facets.

Faceted taxonomies that would not be considered faceted classification include those where multiple facets are created for organizing and breaking down subjects or when multiple facets are created for reflecting multiple different controlled vocabularies. These faceted taxonomies stretch the meaning of “facet,” since the facets are not necessarily faces, dimensions, or aspects, but simply “types” suitable for filtering.

Facets for organizing subjects

In faceted classification we assign an object or content item to multiple different classes. However, for classification, these classes are relevant to the content item as a whole. This contrasts with indexing or tagging for subjects or names of relevance that occur within a text or are depicted within a media asset. These names and subjects can be grouped into facets for filtering/limiting search results, without being about the “classification” of the content item. This is common for specialized subject areas. Faceted taxonomies provide a form of guided navigation and are easier to browse and use than deep hierarchical taxonomies, so a large “subject” taxonomy could be broken down into specific subject-type facets.

Examples of specific subject-type facets include:

Organization types
Product types
Technologies
Activities
Industries
Disciplines
Job roles
Event types
Topics

The “Topics” facet is then used for the leftover generic subject concepts that do not belong in any of the other specialized facets. Unlike faceted classification, each facet is applicable to only some content items.

Any content item could be tagged with any number of concepts from any number of these facets. The facets make it easier for users to find taxonomy concepts and combine them. But the facets are not for “classifying” the content.

While the facets of a faceted taxonomy should ideally also be mutually exclusive, in contrast to the strict principle of faceted classification, the occasional exception of a concept belonging to more than one subject-type facet (question word of “What”) does not create a problem in search. For example, the same concept, Data catalogs, could be in both the Product types and Technologies facets, as long as this type of polyhierarchy is kept to a minimum to avoid confusion. This would not be considered a case of classic polyhierarchy, because it’s not simply a matter of different broader concepts, but rather different facets or concept schemes. It is an attempt to address a different focus or approach to the topic that results in the concept being in more than one facet, offering an additional starting point for searchers.

Facets for organizing controlled vocabularies

Faceted filters/refinement may be based on different controlled vocabulary types: one or more of term lists, name authorities, and subject thesauri/taxonomies. The “facets” are based on how the set of multiple controlled vocabularies is organized rather than based on “aspects” of the content.

Facets could be used for any controlled vocabulary filters that are logical, such as:

Named people (mentioned/discussed)
Organizations (mentioned/discussed)
Products/brands (mentioned/discussed)
Divisions, departments, units (mentioned/discussed)
Named works/document titles (mentioned/discussed)
Places (mentioned/discussed)
Topics (mentioned/discussed)

Because these facets reflect controlled vocabularies of concepts used to tag content for relevant occurrences of the subject/name and not for classification of the content, this kind of faceted taxonomy would not be considered faceted classification. There could, however, be additional faceted classification types, such as content type.

The Topics facet could contain a large hierarchical taxonomy or thesaurus. As such, this faceted search/browse structure may not even be considered a “faceted taxonomy,” but rather merely a faceted search interface to a set of taxonomies. Thus, there is even a nuanced difference between a faceted browse UI that utilizes a taxonomy (among other controlled vocabularies) and a “faceted taxonomy.”

Facets for heterogeneous content

Finally, whether a faceted taxonomy is considered an implementation of faceted “classification” or not may depend on the context and type of content. If the content is homogeneous and all items share the same facets, then it may be considered faceted classification, but if the content is heterogeneous, and the facets are only relevant to some content, then it would not be considered classification.

Consider the following example of specialized subject-based facets for the field of medicine:

Diseases or conditions
Body parts (anatomy)
Signs and symptoms
Treatments
Patient population types

If all the content comprised just clinical case studies, then these facets actually could be considered faceted classification, since they all apply to nearly all the content and are aspects of the content. The content is classified by these facets. On the other hand, if the content dealt with all kinds of documents that had something to do with health or medicine, then these facets would not be for classification of the content but rather just for grouping of subjects for search filters.
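The distinction can be made concrete by measuring how much of the content each facet actually applies to. The Python sketch below is a rough illustration only: the sample items, the function names `facet_coverage` and `shared_by_nearly_all`, and the 0.9 threshold standing in for “nearly all” are assumptions made for this example, and only three of the medical facets are used to keep it short.

```python
from typing import Dict, List

FACETS = ["Diseases or conditions", "Treatments", "Patient population types"]

# Hypothetical homogeneous content: clinical case studies tagged in every facet.
CASE_STUDIES: List[Dict[str, List[str]]] = [
    {"Diseases or conditions": ["Asthma"], "Treatments": ["Inhalers"],
     "Patient population types": ["Children"]},
    {"Diseases or conditions": ["Diabetes"], "Treatments": ["Insulin therapy"],
     "Patient population types": ["Older adults"]},
]

# Hypothetical heterogeneous content: assorted health-related documents.
MIXED_CONTENT: List[Dict[str, List[str]]] = [
    {"Diseases or conditions": ["Asthma"]},
    {"Treatments": ["Physical therapy"], "Patient population types": ["Athletes"]},
    {},  # for example, an administrative document with no medical subject tags
    {"Diseases or conditions": ["Influenza"], "Treatments": ["Vaccination"]},
]


def facet_coverage(items: List[Dict[str, List[str]]], facet_names: List[str]) -> Dict[str, float]:
    """Share of items that have at least one concept assigned in each facet."""
    if not items:
        raise ValueError("need at least one content item")
    return {
        facet: sum(1 for tags in items if tags.get(facet)) / len(items)
        for facet in facet_names
    }


def shared_by_nearly_all(coverage: Dict[str, float], threshold: float = 0.9) -> bool:
    """True if every facet applies to at least the threshold share of items.

    The 0.9 default is an arbitrary assumption, not a published rule.
    """
    return all(share >= threshold for share in coverage.values())


case_study_coverage = facet_coverage(CASE_STUDIES, FACETS)
mixed_coverage = facet_coverage(MIXED_CONTENT, FACETS)
```

With the case studies, every facet has full coverage, so the facets behave like classification aspects. With the mixed content, coverage is partial, so the same facets act only as groupings of subjects for search filters.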

When faceted classification is not a taxonomy

Attributes for computers

Finally, I would not consider all faceted structures to be faceted taxonomies.

Taxonomies are primarily for subjects and may include named entities. Content types/document types may also be included in the scope of taxonomy. There exists additional metadata that may be desired for filtering/refining searches that is out of scope of a definition of taxonomy. This includes date published/uploaded, file format, author/creator, document/approval status, etc. If it is important to the end users, these additional metadata properties could be included among the browsable facets and be considered classification aspects.

Attributes are a form of faceted classification, but a set of attributes is not really a faceted taxonomy. Often ecommerce taxonomies are presented as examples of faceted taxonomies. In fact, ecommerce taxonomies tend to be hierarchical, as they present categories and subcategories of types of products for the users to browse. At lower, more specific levels of the hierarchy, the user then has the additional option to narrow the results further by selecting values from various attributes that are shared among the products within the same product category. These include color, size/dimensions, price range, and product-specific features. I would not consider numeric values to be a taxonomy, but some attributes, such as for features, are more within the realm of taxonomies. Whether these should be called facets or attributes is a matter of debate. More about attributes is discussed in my past blog post “Attributes in Taxonomies.”
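The typical ecommerce pattern, hierarchical categories first and attribute refinement second, could be sketched in Python as follows. The products, category paths, attribute names, and the functions `in_category` and `refine` are all hypothetical, made up for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical product records: a hierarchical category path plus flat attributes.
PRODUCTS: List[dict] = [
    {"name": "Laptop A", "category": ("Computers", "Laptops"),
     "attributes": {"Color": "Silver", "Screen size": "14 in"}},
    {"name": "Laptop B", "category": ("Computers", "Laptops"),
     "attributes": {"Color": "Black", "Screen size": "16 in"}},
    {"name": "Desktop C", "category": ("Computers", "Desktops"),
     "attributes": {"Color": "Black"}},
]


def in_category(products: List[dict], path: Tuple[str, ...]) -> List[dict]:
    """Hierarchical browse: keep products whose category path starts with `path`."""
    return [p for p in products if p["category"][: len(path)] == tuple(path)]


def refine(products: List[dict], wanted: Dict[str, str]) -> List[dict]:
    """Attribute refinement: keep products matching every requested attribute value."""
    return [
        p for p in products
        if all(p["attributes"].get(name) == value for name, value in wanted.items())
    ]


# Step 1: browse down the category hierarchy.
laptops = in_category(PRODUCTS, ("Computers", "Laptops"))

# Step 2: narrow within the category by attributes shared by its products.
black_laptops = refine(laptops, {"Color": "Black"})
```

In this sketch, an attribute such as Screen size only makes sense within the Laptops category, which is the sense in which attributes depend on the product category rather than forming a taxonomy of their own.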

Conclusions

Not all faceted taxonomies are faceted classifications, but some are. Not all faceted classifications are taxonomies, but some are. The differences are nuanced, and end users may neither care about nor need to know these naming distinctions, as long as the taxonomist does. Having a deep understanding of facets helps taxonomists and information architects design the facets better. The goal is to provide users with the most suitable faceted design to serve their needs and accommodate the set of content.

Friday, December 30, 2022

Taxonomy Definition

I usually explain that a taxonomy is a structured kind of controlled vocabulary, which is a list of terms (or concepts) usually used to tag content to aid in its retrieval. The structure can be hierarchical, faceted, or a combination. Other people have defined taxonomies for a general audience in more simplistic ways as a kind of hierarchical classification system. So, while a taxonomy has two main features (naming and structure), my preferred definition has focused on the controlled vocabulary and naming aspect, whereas other definitions focus on the hierarchical classification aspect of taxonomies. However, a taxonomy and a classification system are not necessarily the same. While it is understandable that a definition is simplified for a general audience, it should not be simplified to the extent of being misleading.

I have blogged previously on the differences between taxonomies and classification systems, so I won’t repeat all the differences again. The main point is that a classification system is generic and rigid and is intended to be used widely, such as the Dewey Decimal Classification for libraries, whereas a taxonomy tends to be customized for a particular use case and context and is flexible and undergoes changes.

Meanwhile, there are also a few well-known classification systems that are called “taxonomies,” such as the Linnaean taxonomy of organisms and Bloom’s taxonomy of educational objectives. These seem quite different from the information-retrieval type of taxonomy. The Linnaean hierarchical levels have names (Kingdom, Phylum, Class, etc.). The relationships of the hierarchical levels to each other do not cover all of the relationship types in the thesaurus standards: generic-specific, generic-instance, or whole-part. Rather, the Linnaean taxonomic relationships are generic-specific only, or more precisely that of member of class or subclass. Bloom's taxonomy has a completely different hierarchical model that does not follow thesaurus standards at all.

How does a taxonomy of concepts for information retrieval relate to a scientific taxonomy? They are similar, and the differences are not so great that they should be considered different meanings of the word “taxonomy.” If we consider that taxonomies are systems to name and organize things hierarchically, then a taxonomy for information retrieval, comprising terms for tagging and retrieving content (documents, images, etc.), can be considered a taxonomy of a controlled vocabulary, in contrast to taxonomies of things, such as organisms. This is a slightly different perspective from considering a taxonomy as a kind of controlled vocabulary, as I previously had. The following diagram illustrates a possible way to consider how information-retrieval taxonomies relate to classification systems and controlled vocabularies.

Diagram showing that information taxonomies are at the intersection of classification systems and controlled vocabularies

Several kinds of knowledge organization systems are defined by their published standards. For thesauri, there are ANSI/NISO Z39.19 and ISO 25964. For terminologies, there is ISO/TC 37/SC 3 and other related standards. For ontologies, there is OWL (Web Ontology Language) from the W3C. There is no standard, however, specifically for “taxonomies” or even for “classification systems,” which is a reason why these remain difficult to define. The designations “classification system,” “classification scheme,” and “taxonomy” have been used interchangeably.

Wikipedia provides the definition at the entry for Taxonomy: “A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types.” But then it goes on to say, “it may refer to a categorisation of things or concepts.” Thus, an information-retrieval taxonomy is a categorization of concepts (also called terms in a controlled vocabulary). It is not a classification system, since the goal is not to classify things, not even the things tagged with the taxonomy concepts, but rather to organize the set of concepts that have been identified as appropriate for tagging and retrieving a set of content.

Sunday, February 9, 2020

Classification Systems vs. Taxonomies

Is a taxonomy the same as a classification scheme or system? Or, to put it another way, is a classification system, such as the Dewey Decimal System, a kind of taxonomy? Both of these kinds of knowledge organization systems have the feature of arranging topical terms in a hierarchy of multiple levels, without having related-term relationships or necessarily synonyms/nonpreferred terms, which are features of thesauri. So, it appears as if the only difference is that classification systems have some kind of notation or alphanumeric code associated with each term, and taxonomies do not. The differences, however, are greater than that.

Classification systems

The codes/notations in classifications are not merely shortcut conveniences. They represent a way to divide up the area of knowledge into broad classes, sub-classes, sub-sub-classes, etc. The codes/notations are not an after-thought but are planned from the beginning of the design of a classification system.

The classification is comprehensive; everything in the subject domain is covered with a classification code + label. There is often not a lot of room for expansion, except for a few unused sub-unit codes in each area for new topics. The word classification means to put into a predefined class or grouping. The approach to classification is thinking “where does this go?” (Digital documents may go into more than one classification.)

Classification systems are not just used in libraries, but in corporate settings too, such as for research literature or detailed manufacturing product catalogs. The standard for defining knowledge organization systems for interoperability on the web, the Simple Knowledge Organization System (SKOS), developed by the World Wide Web Consortium (W3C), recognizes classification systems by having a designated element for “notation.”

Taxonomies

A taxonomy is a kind of knowledge organization system that has its terms hierarchically related to each other. The starting point in creating a taxonomy might be a few top terms or facets, but then the focus of taxonomy development is on the specific terms needed, rather than the division of a domain into classes and subclasses, etc. What this means is that the terms do not have to comprehensively cover the subject domain in an abstract manner. Rather the terms have to “cover” the topics appearing in the body of content to be tagged with the taxonomy.

The taxonomy is used for tagging or indexing, not for classification or cataloging. So, rather than thinking where (into what class) does this document go, the question is, what is/are the main topic(s) of this document. The topics might not fall into neat balanced hierarchies. For example, an intranet taxonomy might have a term for Temporary employees, because there are some human resources policies dealing with this topic specifically, but have no term for Full-time employees, since that is the default, and the term would not be useful (and likely inconsistently tagged).

Taxonomies vs. Classification Systems Comparison Table

Different mindsets

Lumpers and splitters are historically two opposing viewpoints in categorization and classification: whether you "lump" items into large categories, focusing on the similarities, or "split" items into more, smaller categories, focusing on the differences. Of course, there is often a combination of both approaches, but it is my feeling that the design of modern taxonomies tends to involve more lumping, whereas the design of classification systems has involved more splitting.

One of the challenges of working with subject matter experts (SMEs) in building a taxonomy is that SMEs, as experts in their domain, may tend to think of how to classify their domain, and propose a taxonomy that resembles a classification system, even if it lacks the codes/notations. So, it’s very important to provide precise guidelines to SMEs contributing to a taxonomy, explaining that the terms are intended for tagging common topics that appear in the content and are for limiting/filtering search results, and that full classification is not necessary.

Students of library science may also tend to think of classification systems as serving for taxonomies. They learn about classification systems when they study cataloging, and subject cataloging is also about where the book or other library material belongs (often literally, on the shelf). So, even librarians need training on taxonomies and the taxonomy mindset if they want to become taxonomists. I will be giving a taxonomy workshop at the Computers in Libraries conference in March, so I will be sharing these ideas with those who attend.

Thursday, December 31, 2015

Vocabularies and Controlled Vocabularies

I have long considered a taxonomy as a particular, structured kind of controlled vocabulary. More recently, however, I have been hearing of “vocabularies” without the word “controlled” in front, although still for the purposes of information management and retrieval, which is cause to wonder: are controlled vocabularies and vocabularies the same thing or not?

Controlled Vocabularies

Definition
It’s the standards that drive the definitions and also the scope of meaning. “Controlled vocabularies” have been most authoritatively defined and scoped by ANSI/NISO Z39.19-2005 Guidelines for the construction, format, and management of monolingual controlled vocabularies. The Standard’s glossary defines it as: “A list of terms that have been enumerated explicitly.” Vocabulary control is an important part of the definition of controlled vocabularies, whereby synonyms are linked together, homographs are distinguished, and unambiguous concepts are defined or scoped.

Although not part of the standard’s name, ISO 25964 Thesauri and interoperability with other vocabularies (parts 1 and 2 published in 2011 and 2013) also defines controlled vocabularies in its glossary, where it states that a controlled vocabulary is a “prescribed list of terms, headings or codes, each representing a concept.” It is also noted: “Controlled vocabularies are designed for applications in which it is useful to identify each concept with one consistent label, for example when classifying documents, indexing them and/or searching them.”

Scope
As for what is included within the scope of controlled vocabularies, ANSI/NISO Z39.19-2005 states in its Scope section, on the first page, that controlled vocabularies include:

Lists of controlled terms
Synonym rings
Taxonomies
Thesauri

In ISO 25964, the scope of inclusion of controlled vocabularies is less clear. In the glossary definition for controlled vocabulary, it states: “Thesauri, subject heading schemes and name authority lists are examples of controlled vocabularies,” but a complete list of controlled vocabularies is not presented.

What is significant is that ISO 25964 does make a distinction between “controlled vocabulary” and just vocabulary. ISO 25964 describes more kinds of vocabularies, but then addresses the issue of vocabulary control in each. Types of vocabularies that ISO 25964 discusses as having vocabulary control are:

Thesauri
Classification schemes
Classification schemes for records management
Taxonomies
Subject heading schemes
Name authority lists

According to ISO 25964 part 2, terminologies and ontologies usually have vocabulary control, but vocabulary control is not a requirement. So, it can be inferred that most but not all terminologies (discussed in my last blog post) or ontologies are controlled vocabularies. Name authority lists are “usually controlled vocabularies” according to ISO 25964 part 2 (section 23.1.1). Synonym rings do not have vocabulary control (section 24.2.3).

Structured Vocabularies

Definition
There is another designation less commonly used of “structured vocabulary.” It appears in the name of the British Standard, BS 8723 Structured vocabularies for information retrieval – Guide. BS 8723 was published in five parts over 2005 – 2008, revising and expanding on the earlier BS and ISO standards for monolingual and multilingual thesauri, and, in turn, became the basis for the current ISO 25964 pair of standards.

ISO 25964 also includes “structured vocabulary” in its glossary, defined as an “organized set of terms, headings or codes representing concepts and their inter-relationships, which can be used to support information retrieval,” and goes on to note: “A structured vocabulary can also be used for other purposes. In the context of information retrieval, the vocabulary needs to be accompanied by rules for how to apply the terms.” Meanwhile, ANSI/NISO Z39.19-2005 does not mention “structured vocabularies.”

Scope
As for what is included within the scope of structured vocabularies, while that is not so clearly stated, it can be assumed, based on the title of BS 8723 Structured vocabularies for information retrieval – Guide, that the vocabularies included within the standard are all “structured vocabularies.” These are:

Thesauri
Classification schemes
Business classification schemes for records management
Taxonomies
Subject heading schemes
Ontologies
Authority lists

ISO 25964 seems to use “vocabularies” and “structured vocabularies” somewhat interchangeably. While the standard’s title refers to “thesauri and … other vocabularies,” its foreword states “ISO 25964-2 will cover interoperability between different thesauri and with other types of structured vocabulary, such as classification schemes, name authority lists, ontologies, etc.”

If all the types of vocabularies in part 2 are indeed considered as “structured vocabularies” then the scope of structured vocabularies would cover:

Thesauri
Classification schemes
Classification schemes for records management
Taxonomies
Subject heading schemes
Ontologies
Terminologies
Name authority lists
Synonym rings

The last two, however, might not be included as structured vocabularies. ISO 25964 part 2 says that name authority lists “may also be structured vocabularies” (23.1.1), implying that they are not always structured vocabularies, and it also explains that synonym rings are “not hierarchically structured.”

Vocabularies

The simple one-word designation of “vocabulary,” when used in the context of support for information retrieval, comprises all controlled and structured vocabularies, including those at the margin of the definitions or not always meeting their strict requirements of controlled or structured vocabularies, such as ontologies, terminologies, name authority lists, and synonym rings, along with other flat (unstructured) term lists.

Vocabularies, not necessarily controlled or structured, are also what are referred to in other frameworks or web contexts, such as SKOS (Simple Knowledge Organization System) vocabularies, Semantic Web Vocabularies, and Linked Open Vocabularies.

What is interesting to note is what other topics are being discussed when the terms “controlled vocabulary” and “vocabulary” alone are used in ISO 25964 part 2 Interoperability with other vocabularies. Controlled vocabularies are discussed in context of entry terms, pre-coordination, post-coordination, near synonyms, and indexing. Vocabularies in general are discussed in context of equivalence mapping, interoperability, resources and authorities, registries, multilingual types, and management software/systems.

Conclusions

Taxonomies, thesauri, subject heading schemes, and classification schemes are both controlled vocabularies and structured vocabularies. Most controlled vocabularies are structured vocabularies, and almost all structured vocabularies are controlled vocabularies. But there are other vocabularies that do not meet the criteria of one definition or another, and to recognize and include them, especially as resources or for the mapping of terms, we refer to them as just vocabularies.

Tuesday, April 2, 2013

Taxonomies vs. Classification

A question had come up in one of my classes on how classification differs from taxonomies/thesauri. As part of an assignment to find thesauri on the web a student sought to find “how the Federal Government classifies its publications and was expecting to find a very elaborate Thesaurus … and instead found… the Superintendent of Documents classification system,” and so the student asked how that classification system fits into the scheme of definitions for taxonomies, controlled vocabularies, and thesauri. That I will attempt to explain here.

We are familiar with classification schemes used to catalog and locate books and other materials in libraries, such as the Dewey Decimal System or, for academic libraries, the Library of Congress Classification (letter-based call “numbers”). In addition to the U.S. federal government’s “Superintendent of Documents” classification system, many other national governments and international organizations also have their own document classification schemes, and states and provinces may have modified versions. There are also classification systems for industries, such as the NAICS (North American Industry Classification System) codes. Corporations with large volumes of documents may have their own internal document classification systems.

I sum up the differences between classification schemes and taxonomies/thesauri as follows:

Classification:

used for books, monographs, documents, reports, contracts, or other media
developed for the classification of physical items for their location on shelves, drawers, or filing cabinets and physical file folders
based on alpha-numeric codes
involves assigning an item only one classification code
manually assigned to each item
classification codes may include additional information, such as date, title, author, or publishing department information within the same classification code
rarely gets changed (due to the pre-established numeric code hierarchy)
helps document managers and librarians organize documents and helps users locate pre-identified documents and materials

Taxonomy/Controlled Vocabulary/thesauri:

used for articles, images, electronic files, paragraphs or sections of text if separated out as digital content units
used primarily in online/digital space
based on descriptive words and phrases (terms). Codes, if any, are secondary.
involves assigning an item multiple taxonomy terms
manually or automatically (auto-tagging, auto-classification, etc.) assigned to content items
taxonomy terms restricted to subject information (not to include date, title, author, publishing department, etc.)
can easily be revised and updated
helps users identify which content items they want

Another way to think of the comparison:
Classification is for: where to put things/where does this document or item go.
Taxonomy is for: how to describe content/what is this text, image, or other media about.

So, while both classification and taxonomy are related and are within the realm of information science, they are really quite different. Since they serve different purposes, they can actually co-exist and both be applied to the same corpus of documents. Libraries utilize both at the same time: a classification system (the Dewey Decimal or Library of Congress Classification call numbers on books and media) and a form of a taxonomy in the catalog subject headings (usually Library of Congress Subject Headings, which are not to be confused with Library of Congress Classification).

Taxonomy and classification may each involve different people, too: catalogers for classification and taxonomists for taxonomies. While some information professionals may do both, you cannot assume that all catalogers know how to create taxonomies or that all taxonomists understand classification. There is, of course, a larger and growing need for taxonomies, in contrast to classification and cataloging systems, as more content migrates online. Furthermore, taxonomies are more adaptable to change and thus in need of continual maintenance, in comparison to the rather static classification systems. Many catalogers are taking an interest in learning about taxonomies these days.

Taxonomists who understand something about classification can also put that knowledge to use. There are many large corporations and agencies with documents organized by customized classification systems, which are now migrating over to dynamic online content/document management and taxonomies. The legacy classification systems then need to be re-formed into (or replaced by) taxonomies, and then the legacy codes need to be mapped to the new taxonomy terms to ensure the continued retrieval of legacy documents. I did this kind of work as a consulting project for a large financial institution not long ago. There were thousands of legacy alpha-numeric codes, most of which combined both a document type attribute and a subject matter attribute into a single code, a typical feature of classification codes when a document can get only one code. A taxonomy, on the other hand, may have one facet for document type and another facet for subject, and a document can be assigned multiple subject taxonomy terms in addition to the document type term.
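To give a sense of what such a mapping might look like, here is a minimal Python sketch. The legacy codes, the taxonomy terms, and the functions `tags_for` and `legacy_codes_for` are entirely hypothetical and are not taken from the actual project; a real mapping would usually live in a spreadsheet or a taxonomy management tool.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping table: each legacy code fused a document type and a
# subject, which are separated here into two taxonomy facets.
LEGACY_TO_TAXONOMY: Dict[str, dict] = {
    "PL-MTG": {"Document type": "Policies", "Subject": ["Mortgages"]},
    "FM-MTG": {"Document type": "Forms", "Subject": ["Mortgages"]},
    "PL-CRD": {"Document type": "Policies", "Subject": ["Credit cards"]},
}


def tags_for(legacy_code: str, extra_subjects: Iterable[str] = ()) -> dict:
    """Translate one legacy code into taxonomy tags, optionally adding subjects.

    Raises KeyError for a code that has not been mapped yet.
    """
    mapping = LEGACY_TO_TAXONOMY[legacy_code]
    subjects: List[str] = list(mapping["Subject"])
    for subject in extra_subjects:
        if subject not in subjects:
            subjects.append(subject)
    return {"Document type": mapping["Document type"], "Subject": subjects}


def legacy_codes_for(document_type: str, subject: str) -> List[str]:
    """Find the legacy codes that correspond to a document type and subject."""
    return sorted(
        code
        for code, mapping in LEGACY_TO_TAXONOMY.items()
        if mapping["Document type"] == document_type and subject in mapping["Subject"]
    )


# A legacy document keeps its mapped terms and can gain additional subject terms.
migrated = tags_for("PL-MTG", extra_subjects=["Interest rates"])
```

The reverse lookup, `legacy_codes_for`, is what keeps legacy documents retrievable: a search by the new taxonomy terms can be expanded to the old codes still attached to unmigrated documents.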

As long as there are physical books, documents, and media, there is a need for classification, but if the entire content repository is digital, then taxonomies are the way to go.