The Accidental Taxonomist: Taxonomy maintenance

Wednesday, June 30, 2021

Taxonomy Management

As taxonomies become more common for information management and retrieval in all kinds of organizations and in various applications, the task of creating new taxonomies from scratch is less needed than the task of managing existing taxonomies. What is required for taxonomy management, however, might not be completely clear. I’ve written several posts on this blog which I tagged with the topic “Taxonomy maintenance,” but none tagged with “Taxonomy management.” That needs to be corrected. Taxonomy maintenance is part of the larger responsibility of taxonomy management.

Taxonomy management includes the following:

Taxonomy maintenance: adding concepts, merging concepts, editing select labels, adding alternative labels, adding relationships, etc. on an individual concept basis, to keep the taxonomy up to date, as new content and new concepts are introduced and terminology changes. These changes may arise from suggestions from those doing tagging, proactive review of new content and new trends, periodic review of search logs, and periodic text analytics of content. This is an ongoing task that can be done by one or more taxonomy editors, including those who are subject matter experts. In such cases, the taxonomy-editing work of non-taxonomists should be reviewed by a taxonomist.

Taxonomy governance: developing taxonomy maintenance policies and documentation. This comprises documenting the taxonomy type, features, purpose, ownership, use, etc., and documenting how the taxonomy should be updated to keep its style consistent, including the criteria for adding new concepts to the taxonomy. Taxonomies should be documented when they are created, but sometimes they are not and need to be. Documentation may need to be updated from time to time.

Taxonomy tagging management: developing and updating tagging rules or policies, ensuring tagging quality (comprehensiveness and correctness), and updating or improving the taxonomy if tagging issues indicate it. Tagging can be manual, automated, or automated with human review. Periodic review of the tagging is a necessary task. Even when managing tagging is another individual’s responsibility, managing taxonomies is not completely separate from managing tagging, and this is an ongoing responsibility of the taxonomist who manages the taxonomy.

Taxonomy integration with end-user applications: including websites and web content management systems (CMSs), enterprise content management systems, digital asset management systems, search software, and other custom applications such as recommendation, personalization, and question answering. A taxonomy may be managed within an application, such as a specific CMS or SharePoint, but then it is usable only for that single application. As organizations increase the number of their information management systems, it eventually becomes clear that separate siloed taxonomies are not a good idea, and a single taxonomy should be centrally managed and ported or synced with the taxonomy management components of each tool. Taxonomy application integration involves both technical aspects, such as integrations with APIs, and nontechnical aspects related to user experience, such as considering how the taxonomy displays to the end-users and how they interact with it. Often, an existing taxonomy needs to be adapted to a new application.

Taxonomy review and revision: reviewing a taxonomy for quality standards and against best practices guidelines and checklists, and making general widespread improvements, such as: ensuring that concepts and their labels are clear and unambiguous and that concepts are sufficiently distinct in their meaning, adding sufficient alternative labels (synonyms), ensuring that hierarchical relationships always follow the standards, adding polyhierarchy and associative relationships, changing the capitalization and plural style, and ensuring that the hierarchy is not too detailed and deep in some areas. This task is undertaken by a taxonomist or taxonomy consultant only occasionally, especially if the taxonomy will undergo an extension or will be migrated to a new system.

Taxonomy extension: merging redundant taxonomies, integrating complementary taxonomies, mapping/linking taxonomies or other vocabularies in the same domain to extend their use, or translating taxonomies to add languages. This could include merging or linking a taxonomy and a glossary or terminology or linking the custom taxonomy to an industry standard classification scheme that is familiar to users. Taxonomy extension could also involve adding semantics of an ontology model with custom relationships and attributes. This task is also undertaken by a taxonomist or taxonomy consultant only occasionally.

The inclusion of all of these tasks of taxonomy management requires a dedicated taxonomy/thesaurus management tool, as spreadsheets are insufficient, and the taxonomy editing module of a single application not only tends to lack certain taxonomy management features but will not serve the needs of enterprise-wide taxonomy management.

I will discuss all of this in more detail in an upcoming PoolParty webinar, “Taxonomy Management 101,” on August 4.

Saturday, February 6, 2021

Who Should Create Taxonomies?

More and more organizations of various types and sizes are recognizing the benefits of information/content taxonomies, which make it easier to find information accurately and quickly, to be recommended information, and to formulate complex queries of data.

In many cases, however, where taxonomies are not central to the product/service of a company (such as e-commerce retail or information publishing) or function of an organization (such as research), the task of creating and maintaining a taxonomy is not big enough to justify hiring a professional taxonomist. Creating a taxonomy is a temporary project, and then updating it is often a part-time task, which could even be shared among several people.

Taxonomy creation should not be underestimated, however. It may appear easy to create a taxonomy, but it is not easy to create a good taxonomy. If a taxonomy is not well designed, it cannot serve its purpose well. You might as well rely on a search engine alone rather than try to use a bad taxonomy.

Not creating the taxonomy yourself

Some approaches to developing a taxonomy without a dedicated taxonomist include using existing taxonomies, creating a taxonomy by term extraction, or hiring a consultant.

Reusing existing taxonomies

To serve its purpose best, a taxonomy should be custom-created to serve its content, users, and system. An existing external taxonomy is usually not adequate. It may be suitable for the limited scope of a geographic taxonomy, an industrial classification, a list of organization names, or a list of languages. More information about licensing taxonomies is in my blog post “Taxonomy Licensing.” Even when using an existing taxonomy, there is still work to edit and adapt the external taxonomy, which requires taxonomy expertise.

Creating a taxonomy by automatically extracting terms from content

Software, including some taxonomy management software, such as PoolParty, can extract candidate taxonomy terms from a body of content (documents or web pages) that is intended to be tagged with the taxonomy. This is an effective method to enhance a taxonomy, to add missing concepts and alternative labels (synonyms). However, this is not a practical way to start creating a taxonomy, which requires a logical structure. Taxonomy-creation expertise is still needed.

Hiring a taxonomy consultant or temporary contractor

This is a good idea. A consultant or contractor will provide a combination of guidance and actual taxonomy building, although a consultant tends to provide more guidance, and a contractor tends to do more taxonomy building. A contractor requires a certain time commitment, such as 3-6 months full-time, whereas there is lots of flexibility in engaging a consultant. After the consultant or contractor is finished, though, someone needs to maintain and update the taxonomy to the same specifications.

When a taxonomy is not very large, it may be more efficient and cost-effective to create it from scratch oneself without reusing an existing taxonomy or relying on a consultant or contractor, although getting a consultant to at least review the taxonomy might still be a good idea.

Taxonomy management as part of a role

What is much more common for an organization than having a taxonomist is having one or more positions where taxonomy management is part of the job description. Searches on web job boards return hundreds of job openings with “taxonomy” in the job description, whereas only a small fraction of them have taxonomy or taxonomist in the job title. Common job titles include: Content Designer, Content Manager, Content Strategist, Data Architect, Data Catalog…, Data Strategist, Digital Asset Manager, Digital Content…, Digital Librarian, Information Architect, Information Scientist, Knowledge Engineer, Knowledge Management…, Metadata Specialist, Product Manager, SharePoint Developer, Solutions Architect, etc. There are also positions more centered in marketing and in web development.

Often, though, the need for a taxonomy emerges at a time when a new position is not created, so an existing employee must take on the task. This common scenario is behind the title of my book and this blog, The Accidental Taxonomist. Those that take on taxonomy work may come from a wide variety of roles or departments including marketing for a website taxonomy, IT or human resources for an intranet taxonomy, IT for content/document management systems administration, and technical documentation/publishing. Knowledge management and metadata/data management are also good candidate roles for taxonomy management.

In situations where the taxonomy is used to manage and retrieve content in specialized subject areas, subject matter experts may also be involved in taxonomy creation, at least for the parts of the taxonomy that correspond to their expertise.

Not having sufficient taxonomy skills

In either case, whether taxonomy management was originally part of the job description or not, people who assume partial taxonomy responsibilities often do not have the skills. This is usually the case when a taxonomy project first arises. Even when someone is newly hired, successful applicants may not meet all job description requirements, such as taxonomy experience, especially if the skill is only a minor part of the job.

Related job skills may make it easier to create taxonomies, but without experience or training, one cannot simply create a good taxonomy. Related skills tend to be in the areas of library/information science, indexing, information architecture, digital asset management, content management, records management, and possibly product management.

Librarians tend to have training in cataloging and classification, sometimes in thesaurus creation, and less likely in taxonomy creation. Taxonomies resemble classification schemes, but function differently, so it would be a mistake to model a taxonomy as a classification scheme. See my blog post "Classification Systems vs. Taxonomies." I had taught a continuing education course on taxonomies through a graduate school of library and information science for years, since MLIS graduates had not learned taxonomies as part of their degree program.

Information architects know how to organize information in a web user interface well, so they may have a good sense of how to structure a taxonomy at a high level. However, there are details and nuances of a large taxonomy, such as the development of synonyms/alternative labels, with which they may not have experience. Also, a taxonomy should not be confused with a navigation scheme, as explained in my blog post "Navigation Schemes vs. Taxonomies."

Digital asset managers, content managers, and product managers know about the metadata management for their content, and taxonomies usually fit into the larger metadata scheme. However, their experience with taxonomy creation is usually limited to a subject area and the context and constraints of the system in which they are working. So, the very basic taxonomy skills that they develop may not be transferable to another system or another subject domain.

Subject matter or domain experts, including product managers, often play an important role in taxonomy development. From my experience in working with subject matter experts, though, they often tend to design more of a classification scheme for their domain and create taxonomy concepts that are too granular to be practical for end-user search and retrieval.

Where to learn taxonomy skills

There are many continuing education options to learn taxonomy creation, some through library/information science schools, some through professional associations, and some through commercial conference and training programs. I have been providing taxonomy training since 2007, through online courses, conference workshops, and corporate workshops, both in-person and virtual. I have been impressed with the diversity of backgrounds, job roles, organization types, and global locations of the workshop participants over the years.

I teach various online and in-person workshops. The latest information is on my website page Courses and Workshops.

Upcoming in fall 2021:

"Knowledge Engineering of Taxonomies, Thesauri, and Ontologies"
SEMANTiCS conference, Amsterdam, Monday September 6, 9:00 am - 12:30 Central European Time. Available both online and in-person.

"Taxonomy and Metadata Design"
Monday-Tuesday, October 11-12, 8:00am - 12:00pm EDT (14:00 - 18:00 CEST) each day (8 hours over two days)
Through Technology Transfer, Rome (with the availability of simultaneous interpretation into Italian).

Saturday, September 30, 2017

Vocabulary Management Issues

“Issues in Vocabulary Management” is the latest Technical Report (TR-06-2017) published by the National Information Standards Organization (NISO), approved on September 25, 2017. I had the honor of serving on its working group, specifically on its subgroup for Vocabulary Use/Reuse.

The most significant NISO publication for controlled vocabularies is ANSI/NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies, which is referenced several times in TR-06. ANSI/NISO Z39.19 focuses on how to design and create controlled vocabularies (especially thesauri and taxonomies), whereas TR-06 addresses issues in the use of controlled vocabularies. Furthermore, as a Technical Report, rather than a Standard, this 49-page document does not contain requirements, but rather serves an informative purpose. It does have a page of recommendations, though, which are for a vocabulary’s definition and attribute types, its best practices for documentation, and its licensing or provisions for use and reuse.

Over time, the need to create new controlled vocabularies from scratch diminishes, as more vocabularies come into existence, especially those that are made available for sharing or licensing (see my blog post Directories and Databases of Published Controlled Vocabularies), but the need to maintain, revise, and reuse them grows, so this Technical Report serves a valuable role.

What are the “issues” in vocabulary management? They could vary, based on the organization and implementation, but this document considers three areas:

Vocabulary use and reuse, dealing with permissions, licenses, maintenance, versioning, extending and mapping vocabularies.
Vocabulary documentation, dealing with governance issues and how to document vocabulary properties.
Vocabulary preservation, dealing with issues of abandoned or “orphaned” vocabularies, which is especially the case of vocabularies developed by nonprofit organizations which have lost their funding to maintain them.

These issues are relevant to both proprietary controlled vocabularies, which may be reused through licensing agreements, and publicly available vocabularies, which are shared and reused increasingly through linked data on the web, or more specifically the Semantic Web and the Linked Open Data environment. For publicly available or open vocabularies there are also the issues of simply finding or discovering suitable and sustainable vocabularies and evaluating them and then the communication between the vocabulary owner and user.

TR-06 takes a somewhat broader view of “vocabularies,” not just “controlled vocabularies,” but also including ontologies, unstructured term lists, terminologies, synonym rings, etc. I explored these differences and definitions in detail in my blog post Vocabularies and Controlled Vocabularies, which I wrote shortly after starting work on the NISO working group. The vocabularies of concern of TR-06 also include element sets, which comprise metadata properties/fields and not merely the controlled vocabulary terms/values within those properties.

TR-06 does not read so much like a “technical report.” It also includes several real-life examples and use cases. To a certain extent, it explains by example. Appendices include a glossary of terms with extensive definitions; a descriptive list of vocabulary directories, repositories or collections (something that I worked on); a list of free and open vocabulary tools (far more extensive than those I described in a previous blog post Free Taxonomy Management Software); and a list of additional resources with links, besides its bibliography, making this quite a valuable resource.

TR-06 “Issues in Vocabulary Management” will now be added to my list of recommended resources for controlled vocabulary and taxonomy management, and I hope that many of those who manage taxonomies will take a look at it.

Sunday, April 23, 2017

Taxonomy Term Specificity

One of the challenges in creating or editing taxonomies is determining how specific the terms should be. This is a key issue in making a taxonomy customized for a certain implementation, which involves a unique set of content to be tagged/indexed and a certain set of users. Highly specific terms tend to be the consequence of deeper hierarchies. So, the decision of how specific the terms should be is also related to the decision of how many hierarchical levels of depth the taxonomy should be. Taxonomies that are organized into multiple facets, on the other hand, tend to have more limited hierarchy, if any, and terms that are not so specific.

Having taxonomy terms that are more specific than necessary inevitably means that there are more taxonomy terms than necessary. The larger taxonomy is more difficult to maintain both in currency and consistency. Terms that are more specific than necessary are also likely to be more specific than expected by the users and might get overlooked and not even used. If the taxonomy is searched, the users will not likely search for such highly specific terms. If the taxonomy is browsed, the users might stop at a higher-level broader term and be satisfied with that. Furthermore, users like to retrieve multiple results (content items or references) for a single search term, so that they can browse the list and evaluate what they want. Highly specific terms will match fewer content items, so retrieved results could comprise only one or two items per taxonomy term, which may not satisfy most users. Having a greater number of more specific terms can also lead to more inconsistency in the indexing/tagging, whether manual or automated.

Having taxonomy terms that are not specific enough means that each taxonomy term is indexed to a relatively large number of content items, and the users may have to scroll through multiple screens of returned results and look at multiple items to find what they really want. The availability of additional filters or facets can help limit the results, though. Having terms that are not specific enough also makes it more difficult for users to “discover” potential related topics of interest, whether the terms have “related-term”/“see also” relationships between them or whether “related” terms are suggested by shared tagged occurrence among content items.

Taxonomists sometimes refer to term specificity as “granularity” or a taxonomy being “granular.” There is the irony that, although the scope and meaning of specific terms is granular/narrow/small, the terms themselves are not small. The “granular” terms tend to be longer, more complex, multi-word terms. If combining multiple concepts into a single term, such terms might also be called "pre-coordinated" terms. Following are examples of specific, granular taxonomy terms from different specialized taxonomies:

Possessed object access systems (in an information technology taxonomy)
Fingerstick blood sugar testing (in a health care taxonomy)
Standard manufacturing overhead cost (in a business taxonomy)

The taxonomist typically creates specific/granular terms, based on the concepts of sample content to be tagged. There may be a document with the phrase in the title, an image with the phrase in its caption, a product with this description as its type, a department with the phrase in its name, etc. Obviously, source phrases would need to be edited to become well-formed taxonomy terms, but they may still be multi-word, complex terms. Creating a taxonomy from scratch usually involves a combination of a top-down and bottom-up approach in the development of terms and hierarchical relationships. The specific/granular terms are the result of the bottom-up component of taxonomy development.

Taxonomies available for license might be appropriate in their subject area and scope, but chances are that their terms are either too specific or not specific enough for a given implementation. Thus, if you choose to license a taxonomy, make sure your license allows you to customize the taxonomy so that you can either delete terms that are too specific or add more specific terms, as narrower terms to existing terms, to suit your content.

Creating or deleting specific terms is also part of periodic taxonomy maintenance. If a term that has no narrower terms is heavily used in indexing, it might be time to “break it up” by creating a few more specific, narrower terms so that the large content set is indexed and retrieved with more specific terms for more manageable result numbers. If, over a period of time, a specific term has been applied in indexing very few times, or not at all, it should probably be deleted. The deleted term can be changed to a variant/nonpreferred term/alternative label for an existing broader concept. The specificity of a taxonomy should match the specificity of the content being tagged with it, and this can change over time.
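
As a rough illustration of this kind of periodic review, the following Python sketch sorts terms into candidates for splitting or retiring based on how often they have been used in indexing. It is only a sketch under stated assumptions: the function name, the input structures, the sample counts, and both thresholds are hypothetical, and suitable thresholds depend entirely on the size and nature of the content.

```python
def review_term_usage(usage_counts, leaf_terms, split_above=500, retire_below=3):
    """Sort leaf terms into candidates for splitting or retiring.

    usage_counts: mapping of term label -> number of content items tagged with it.
    leaf_terms: labels of terms that have no narrower terms.
    The thresholds are arbitrary placeholders, not recommended values.
    """
    split_candidates, retire_candidates = [], []
    for label, count in sorted(usage_counts.items()):
        if label not in leaf_terms:
            continue  # only terms without narrower terms are reviewed here
        if count > split_above:
            split_candidates.append(label)   # heavily used: consider narrower terms
        elif count < retire_below:
            retire_candidates.append(label)  # rarely used: consider an alternative label
    return split_candidates, retire_candidates


# Hypothetical tagging counts for a small health care taxonomy.
usage_counts = {
    "Blood tests": 1200,
    "Fingerstick blood sugar testing": 1,
    "Diagnostics": 0,   # has narrower terms, so it is skipped
    "Imaging": 80,
}
leaf_terms = {"Blood tests", "Fingerstick blood sugar testing", "Imaging"}

to_split, to_retire = review_term_usage(usage_counts, leaf_terms)
```

The output is only a list of candidates. Whether a term is actually split, or deleted and turned into an alternative label of a broader concept, remains an editorial decision.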

Friday, January 20, 2017

Orphan Terms in a Taxonomy

A taxonomy has hierarchical relationships between all of its terms, so one of the quality control checks on a taxonomy is to ensure that there are no “orphan” terms, which are terms that lack hierarchical relationships. One of the purposes of a taxonomy is for users to be able to navigate it (whether it is fully displayed or whether the links between only the selected terms are displayed), in order to find terms of interest. An orphan term, thus, cannot be found by browsing, only by searching.

Taxonomy/thesaurus management software can generate orphan term reports. However, as there are different kinds or definitions of taxonomies or thesauri, there are also different kinds or definitions of orphan terms. Certain definitions of orphans may be permitted, other kinds of orphans may be permitted in only certain kinds of controlled vocabularies, and some kinds of orphans are never permitted in any taxonomy or thesaurus.

Differences between taxonomies and thesauri

There are two main differences between strictly defined taxonomies and thesauri that have an impact on orphan terms.

A taxonomy has only hierarchical (broader-narrower) relationships between its terms, whereas a thesaurus has both hierarchical and associative (related-term) relationships between terms.
In a taxonomy, all terms belong to a single or limited number of hierarchies, each with a designated, broad-meaning “top term,” whereas in a thesaurus hierarchical relationships are created between terms merely as appropriate, without regard to any larger hierarchies or top terms. A taxonomy thus has a top-down inverted tree structure, whereas a thesaurus does not necessarily have an over-arching hierarchical structure.

Different kinds of orphan terms

The loosest and easiest to remember definition of an orphan term is a term which lacks a “parent”. In other words, the term has no broader term, but it may have other kinds of relationships to terms. A “top term” report of taxonomy/thesaurus management software will get this result, since all top terms are, by this definition, orphans.

An orphan term could also be defined as a term that has no hierarchical relationships, whether broader or narrower. In a thesaurus, such terms could have associative relationships only. In a taxonomy (lacking associative relationships), these terms then would have no relationships to other terms in the taxonomy.

By the strictest definition, an orphan term is a term which lacks any relationships to any other term. This would be the same in a taxonomy or a thesaurus.

Finally, taxonomy/thesaurus management software may have a feature that allows you to define your own orphans, that is, to designate a relationship type and then generate a list of terms that lack that relationship type to any other terms.
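
The differences between these definitions can be made concrete with a short Python sketch. This is a minimal illustration, not how any particular taxonomy management tool generates its reports; the `Term` class, its field names, and the sample terms are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical term record; the field names are assumptions for illustration."""
    label: str
    broader: set[str] = field(default_factory=set)
    narrower: set[str] = field(default_factory=set)
    related: set[str] = field(default_factory=set)


def lacks_parent(term: Term) -> bool:
    # Loosest definition: no broader term (every top term qualifies).
    return not term.broader


def lacks_hierarchy(term: Term) -> bool:
    # No broader and no narrower term; associative relationships may remain.
    return not term.broader and not term.narrower


def lacks_all_relationships(term: Term) -> bool:
    # Strictest definition: no relationship of any kind.
    return not (term.broader or term.narrower or term.related)


def orphan_report(terms, is_orphan):
    """Return the sorted labels of terms matching the chosen orphan definition."""
    return sorted(t.label for t in terms if is_orphan(t))


# Hypothetical sample thesaurus.
terms = [
    Term("Education", narrower={"College admissions"}),
    Term("College admissions", broader={"Education"}, related={"Tuition"}),
    Term("Tuition", related={"College admissions"}),
    Term("Scholarships"),
]

no_parent = orphan_report(terms, lacks_parent)
no_hierarchy = orphan_report(terms, lacks_hierarchy)
no_relationships = orphan_report(terms, lacks_all_relationships)
```

Because `orphan_report` accepts any test function, a custom orphan definition can be passed in the same way, for example `lambda t: not t.related` to list terms that lack associative relationships.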

Which kind of orphans to avoid

Orphans defined merely as those lacking broader terms are not necessarily a problem, since every taxonomy or thesaurus has top terms. For quality control, you would want to ensure that these parent-less “orphans” are indeed the top terms that you want. For a taxonomy, there are strict criteria for top terms. They must be broad-meaning categories under which are extensive hierarchical trees, perhaps even of a similar depth and breadth for each top term. For thesauri, the requirements for top terms are usually not strict, but it is still a good idea to review the top terms to ensure that there really is no appropriate broader term to move them under.
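
One way to picture this check is to compare the parent-less terms against the list of top terms that the taxonomy is supposed to have. The Python sketch below assumes a hypothetical mapping of each term to its broader terms and a hypothetical approved list of top terms; neither reflects the data model of any specific tool.

```python
def unexpected_top_terms(broader_by_term, intended_top_terms):
    """List parent-less terms that are not on the approved list of top terms.

    broader_by_term: mapping of term label -> set of broader term labels.
    intended_top_terms: labels that the taxonomy owner has designated as top terms.
    """
    return sorted(
        label
        for label, broader in broader_by_term.items()
        if not broader and label not in intended_top_terms
    )


# Hypothetical sample data.
broader_by_term = {
    "Education": set(),
    "College admissions": {"Education"},
    "Behavior problems": set(),  # never given a broader term
}
to_review = unexpected_top_terms(broader_by_term, {"Education"})
```

Any term in the result either needs a broader term or should be deliberately added to the list of top terms.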

An orphan report of the kind that indicates terms that lack any hierarchical relationship (narrower or broader) but may have associative (related-term) relationships is quite helpful when editing thesauri. It will depend on the thesaurus owner whether the policy should permit such “hierarchical orphans.” Generally, such orphans should at least be avoided and perhaps permitted in only exceptional circumstances.

Orphans defined as terms that lack any relationships to other terms in the taxonomy should not be permitted in any circumstance. They don’t serve the navigation feature of a taxonomy, as there is no way to find them without search. If a suitable broader term within the taxonomy cannot be found, then they may be out of scope of the taxonomy/thesaurus. Usually, though, such orphan terms are the results of taxonomist error. If the taxonomy management software permits duplicate terms, these orphans could be duplicates of synonyms/nonpreferred terms/alternative labels.
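
Where duplicates are a possibility, a simple comparison of orphan labels against existing alternative labels can flag them. The following Python sketch assumes hypothetical inputs (a list of orphan labels and a mapping of each concept to its alternative labels) and matches labels while ignoring case and surrounding whitespace, which is only one of several reasonable matching rules.

```python
def orphans_duplicating_alt_labels(orphan_labels, alt_labels_by_concept):
    """Map each orphan label to the concepts that already carry it as an alternative label.

    Comparison ignores case and surrounding whitespace.
    """
    # Build a lookup from normalized alternative label to the concepts using it.
    index = {}
    for concept, alt_labels in alt_labels_by_concept.items():
        for alt in alt_labels:
            index.setdefault(alt.strip().casefold(), []).append(concept)
    return {
        orphan: sorted(index[orphan.strip().casefold()])
        for orphan in orphan_labels
        if orphan.strip().casefold() in index
    }


# Hypothetical sample data.
orphans = ["Glucose testing", "Scholarships"]
alt_labels_by_concept = {
    "Blood sugar testing": ["glucose testing", "Blood glucose tests"],
}
duplicates = orphans_duplicating_alt_labels(orphans, alt_labels_by_concept)
```

An orphan that appears in the result can usually be deleted, since the concept it duplicates already covers it as an alternative label.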

Resolving orphan terms

In the case of orphan terms that lack broader terms but are not obviously top terms, the taxonomist should search the taxonomy/thesaurus for a suitable broader term. If one cannot be found, careful consideration should be given to whether a new term should be added that would both serve as a broader term for the orphan term and have a suitable broader term of its own already in the taxonomy/thesaurus. If dealing with a thesaurus rather than a taxonomy, then it may be OK to leave the term without a broader term, but then the related-term relationships should be checked and possibly enhanced so that there are multiple related-term relationships.

Sometimes stretching the thesaurus rules for hierarchical relationships may be desired to provide a broader term to an orphan. This is generally acceptable in a taxonomy but not in a thesaurus. Following are examples of former orphan terms whose candidate broader terms are not 100% correct broader terms (the narrower term is not a kind of or a part of its broader term), but they are close, so these relationships could be made, even in a thesaurus. What follows in parentheses are theoretical broader terms which are not practical terms to create.

College applications BT College admissions (and not a BT of Applications)
Behavior problems BT Behavior (and not a BT of Problems)
Atmospheric composition BT Atmosphere (and not a BT of Composition)
Conflict termination (Military science) BT Wars (and not a BT of Termination)

Orphans that lack any relationships are usually the result of taxonomist error. Perhaps the taxonomist got interrupted and did not complete the process of relating a term and then forgot. In many cases these orphans should have been made as synonyms/nonpreferred terms/alternative labels. The taxonomist should run orphan reports frequently enough to remember whether the orphan term was intended to be a preferred or a nonpreferred name.

More examples of how to resolve orphan terms are in a PDF of a PowerPoint presentation “Managing Mature Taxonomies: Resolving Orphan Terms” I gave as an SLA Taxonomy Division webinar in December 2016.

Monday, December 9, 2013

Taxonomy Governance

Recently I was asked to speak on a panel on taxonomy governance, so this gave me an opportunity to reflect more on the subject. "Metadata Enhancement for Improved Content Management - Taxonomies and Governance" was the title of a panel I spoke on at the Gilbane Conference 2013: Content and the Digital Experience in Boston on December 3.

When I first heard of "governance" with respect to knowledge management and taxonomies, in 2005, it did not sound like a subject of interest to me. Perhaps I was thinking of it in terms of business process management in general, which is not my field. Over the years I have come to realize that governance is a very important part of any taxonomy, and while governance can be limited to governing the taxonomy itself, it can extend to other areas that are related to the taxonomy, such as indexing and content management. Most significantly, though, there is a synergy or dualism of taxonomies and governance: to be effective, taxonomies must be governed, yet the existence of a taxonomy itself is a form of governance. A taxonomy, after all, is a kind of controlled vocabulary, and “controlled” means governed. It's better to describe what taxonomy governance entails than to try to define it. Taxonomy governance comprises the policies, procedures, and documentation for the ongoing management and use of a taxonomy.

My main points in my brief presentation were:

Governance process begins when taxonomy development begins.
Each taxonomy is unique and has its own governance policy.
Governance includes both:
- Documented editorial policies
- Taxonomy management procedures and responsibilities
There are minimal guidelines to a taxonomy when it is started.
Decisions reached to questions as they come up in the process are documented and eventually become policy.
Taxonomy policy/guidelines includes both:
- Taxonomy specifications, style and maintenance
- Taxonomy usage and indexing/tagging/categorization policy (manual or automated)

Reflecting on the different taxonomy jobs I have had and projects I have worked on, taxonomy governance has taken many forms beyond the obvious one of documenting the taxonomy editorial policies. Even though I did not hear of taxonomy governance until I had been working for years with taxonomies, I actually had been involved with governance for many years prior, just not by that name. My first job working with taxonomies (then called controlled vocabularies) was with the title of Vocabulary and Quality Management Specialist. In addition to maintaining the controlled vocabularies according to prescribed procedures, my duties included writing guidelines for the indexers using the vocabularies, especially for new topics and current events, and checking the published content for possible vocabulary-related quality issues. At my next employer, a developer of search software with built-in taxonomies, documenting how to create the taxonomies in a consistent style was simply a part of documenting how to use the software. Later, on an assignment with a consulting firm, an ongoing contract involved making regular updates to an ecommerce client's product taxonomy, following a certain procedure and workflow that was tracked in SharePoint. Finally, in more recent years as an independent taxonomy consultant, I have made sure that taxonomy editorial policies and maintenance guidelines are always a part of my project plans.

When a taxonomy project is short on time or budget, there may be a temptation to skip the governance documentation and planning. But in the long term, that will cost more. Time will be wasted by the taxonomy editors going back through old emails to try to find out what was decided when individual questions came up. Taxonomy editors will also waste time having to redo some of their work, after realizing that they were not following a consistent style or policy. Finally, and most crucially, lack of governance will likely result in an inconsistently developed taxonomy, which in turn leads to inconsistent indexing/tagging, no matter the method used. Then the main purpose of the taxonomy is defeated.

Taxonomy governance might not be as hot a topic as it was a few years ago, but that's only because it has become standard, accepted practice. Yet there is still a lot that an organization owning a taxonomy can learn about governance in the form of best practices and case studies. While organizations may not want to share their taxonomies, as intellectual property, hopefully they will share their experiences and tips on taxonomy governance.

Tuesday, January 22, 2013

Taxonomy Management Consulting

I recently wrote an article on taxonomy management for the online magazine FreePint. By “taxonomy management” I mean taxonomy maintenance, governance, and long-term planning. I’m not going to repeat that article here, because you can look it up. The short version is available without a subscription: “The Care and Feeding of Taxonomies: Taxonomy Management.” In summary, in the long version I discussed:

The reasons for managing a taxonomy
The distinction between taxonomy development and taxonomy management
The parts of an organization responsible for taxonomy management
Factors in selecting a taxonomy management software system
Components of taxonomy editorial policies
Components of taxonomy maintenance procedures and a governance plan

Writing this article got me thinking about the role of taxonomy consulting in taxonomy management. As more and more taxonomies get created, over time, the need for new taxonomy creation may diminish, while the need for better taxonomy management increases. This should be good news for those in the taxonomist profession, especially for those who serve as in-house staff taxonomists. As for those of us who are taxonomy consultants, there is still a role, just a slightly different one.

The design and creation of a new customized taxonomy is an appropriate task for an external consultant because it:

is a limited-term project that needs extra assistance while existing staff probably lacks the time
requires a specialized skill that perhaps no one on staff has
can benefit from an external point of view that is not biased, but can appreciate the perspectives of various users.

The ongoing maintenance of a taxonomy, on the other hand, is best suited for an internal staff taxonomist or information specialist, who:

can be immediately responsive to changing needs or circumstances
is familiar with the subject matter of the organization when it comes to additions or changes of highly specific taxonomy terms
can devote at least a little time each day or week as needed, but the time can be flexible.

Consultants still have a role in maintenance. They can study the issues and write the taxonomy editorial policies, indexing policies, maintenance plans, governance plans, etc. In fact, this is where taxonomy consultants really serve as consultants, and not merely as taxonomy designers and initial developers. Sometimes I think the designation of “consultant” is a bit of a stretch for someone spending most of their time actually building taxonomies. On all of my taxonomy projects, however, I also provide advice and suggestions, so I do some consulting all along. Taxonomy management, though, relies more heavily on actual consulting services.

Even though the needs of many organizations are shifting from taxonomy design and creation to taxonomy maintenance and revision, there exists a lot more information (books, articles, workshops, presentations, etc.) on taxonomy design than on taxonomy management. The relative lack of written sources on taxonomy management is another reason why a taxonomy consultant can be especially helpful. Finally, like taxonomy design, taxonomy management plans and procedures also need to be tailored to the circumstances of a specific organization.

When I create a taxonomy, I take a personal interest in it and hope that it will have a long, useful life, so I want to create taxonomy guidelines and maintenance plans, as part of taxonomy management, to ensure that my good work is kept up to date. If I don’t create an organization’s taxonomy, I am still just as interested in providing guidance for improving and maintaining whatever taxonomy exists, because my ultimate interest in taxonomies is seeing them get used and be useful.