Showing posts with label Taxonomy governance. Show all posts
Showing posts with label Taxonomy governance. Show all posts

Thursday, October 31, 2019

Managing Tagging with a Taxonomy



A lot of work can be put into designing and creating a taxonomy, but if it’s not implemented or used properly for tagging or indexing, then that work can be wasted. As the volume of content has grown, many organizations have invested in auto-tagging/auto-categorization solutions utilizing text analytics technologies. However, there remain many situations where manual tagging is still more practical. So, support for correct and efficient manual tagging needs to be considered. This is the topic of my upcoming presentation at the Taxonomy Boot Camp conference, in Washington, DC, on November 4.



A taxonomy can be designed to support manual tagging by including alternative labels (synonyms), hierarchical and associative relationships between terms, and term notes, to guide those doing the tagging to the most appropriate terms, even if these taxonomy features are not fully available to end-users in their user interface. It may be easier to have these features available in a customized manual tagging/indexing tool than it is to make them available in the end-user application. A taxonomy has more than one set of users, and the tagging-users need the full benefits a taxonomy can offer.
It’s very important to develop a customized policy for tagging with a taxonomy, so that it is used correctly and consistently. Any policy for tagging or indexing should include both rules and recommended guidelines. Examples of policy topics include:

  • Criteria for determining topic or name relevancy for tagging
  • Depth and level of detail of tagging
  • Comprehensiveness of aspects (what, who, where, when, how, why, etc.)
  • Required term types/facets (and any dependencies)
  • Number of terms (of each type) to tag
  • Tagging of certain terms in combination (e.g.: a parent/broader term in addition to its narrower/child term)
  • Other types of metadata that must be entered

It’s often not enough to just provide people with a policy document. Some degree of training on proper tagging can be very beneficial. In a current SharePoint taxonomy project, one of the users who tags uploaded documents said to me, “The problem is that we have not been trained. We are guessing.” Policy and guidelines should initially be delivered as a presentation (live or web meeting) to allow for questions and answers.

With large volume tagging, the initial tagging should be reviewed and feedback should be provided. This is the case for both new and experienced indexers. Even experienced indexers need to become familiar with the content and learn the policies and guidelines that are particular to the organization and project. In a recent taxonomy project that involved indexing hundreds of articles by a professional indexer, even the professional indexer’s initial indexing was reviewed to make sure it was as thorough and accurate as required.

Finally, there needs to me a method of communication and feedback between those doing the tagging and the person (taxonomist) who is managing the taxonomy, which is a controlled vocabulary, after all. The taxonomist should inform those tagging of new terms and changed terms, especially if they are high-profile terms, and may also provide tips for tagging new and trending topics. Meanwhile those doing tagging need a method to contact the taxonomist to request clarifications or the addition of new terms. This could be by email, but collaboration workspaces may also work well.  While I, as a consultant, do not stay on as tagging continues, I like to be available at the start of tagging with a new taxonomy, to answer indexing questions, something I did just this past month on my most recent consulting project.



Saturday, September 30, 2017

Vocabulary Management Issues



Issues in Vocabulary Management” is the latest Technical Report (TR-06-2017) published by the National InformationStandards Organization (NISO), approved on September 25, 2017. I had the honor of serving on its working group, specifically on its subgroup for Vocabulary Use/Reuse.

The most significant NISO publication for controlled vocabularies is ANSI/NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies, which is referenced several times in TR-06. ANSI/NISO Z39.19 focuses on how to design and create controlled vocabularies (especially thesauri and taxonomies), whereas TR -06 addresses issues in the use of controlled vocabularies. Furthermore, as a Technical Report, rather than a Standard, this 49-page document does not contain requirements, but rather serves an informative purpose. It does have a page of recommendations, though, which are for a vocabulary’s definition and attribute types, its best practices for documentation, and its licensing or provisions for use and reuse.

Over time, the need to create new controlled vocabularies from scratch diminishes, as more vocabularies come into existence, especially those that are made available for sharing or licensing (see my blog post Directories and Databases of Published Controlled Vocabularies) but the need to maintain, revise, and reuse them grows, so this Technical Report serves a valuable role.

What are the “issues” in vocabulary management? They could vary, based on the organization and implementation, but this document considers three areas of

  • Vocabulary use and reuse, dealing with permissions, licenses, maintenance, versioning, extending and mapping vocabularies.
  • Vocabulary documentation, dealing with governance issues and how to document vocabulary properties.
  • Vocabulary preservation, dealing with issues of abandoned or “orphaned” vocabularies, which is especially the case of vocabularies developed by nonprofit organizations which have lost their funding to maintain them.

These issues are relevant to both proprietary controlled vocabularies, which may be reused through licensing agreements, and publicly available vocabularies, which are shared and reused increasingly through linked data on the web, or more specifically the Semantic Web and the Linked Open Data environment.  For publicly available or open vocabularies there are also the issues of simply finding or discovering suitable and sustainable vocabularies and evaluating them and then the communication between the vocabulary owner and user.

TR-06 takes a somewhat broader view of “vocabularies,” not just “controlled vocabularies,” but also including ontologies, unstructured term lists, terminologies, synonym rings, etc. I explored these differences and definitions in detail in my blog post Vocabularies and Controlled Vocabularies, which I wrote shortly after starting work on the NISO working group. The vocabularies of concern of TR-06 also include element sets, which comprise metadata properties/fields and not merely the controlled vocabulary terms/values within those properties.

TR-06 does not seem so much as a “technical report.” It also includes several real-life examples and use cases. To a certain extent, it explains by example.  Appendices include a glossary of terms with extensive definitions; a descriptive list of vocabulary directories, repositories or collections (something that I worked on); a list of free and open vocabulary tools (far more extensive than those I described in a previous blog post Free Taxonomy Management Software); and a list of additional resources with links, besides its bibliography, making this quite a valuable resource.

TR-06 “Issues in Vocabulary Management” will now be added to my list of recommended resources for controlled vocabulary and taxonomy management, and I hope that many of those who manage taxonomies will take a look at it.

Monday, December 9, 2013

Taxonomy Governance

Recently I was asked to speak on a panel on taxonomy governance, so this gave me an opportunity to reflect more on the subject. "Metadata Enhancement for Improved Content Management - Taxonomies and Governance" was the title of a panel I spoke on at the Gilbane Conference 2013: Content and the Digital Experience in Boston on December 3.

When I had first heard of "governance" with respect to knowledge management and taxonomies, in 2005, it did not sound like a subject of interest to me. Perhaps I was thinking of it in terms business process management in general, which is not my field. Over the years I have come to realize that governance is a very important part of any taxonomy, and while governance can be limited to the governing the taxonomy itself it can extend to other areas that are related to the taxonomy, such as indexing and content management. Most significantly, though, there is a synergy or dualism of taxonomies and governance: to be effective taxonomies must be governed, yet the existence of a taxonomy itself is a form of governance.  A taxonomy, after all, is a kind of controlled vocabulary, and “controlled” means governed. It's better to describe what taxonomy governance entails than to try to define it. Taxonomy governance comprises the policies, procedures, and documentation for the ongoing management and use of taxonomy. 

My main points in my brief presentation were:
  • Governance process begins when taxonomy development begins.
  • Each taxonomy is unique and has its own governance policy.
  • Governance includes both:
    • Documented editorial policies
    • Taxonomy management procedures and responsibilities
  • There are minimal guidelines to a taxonomy when it is started.
  • Decisions reached to questions as they come up in the process are documented and eventually become policy.
  • Taxonomy policy/guidelines includes both:
    • Taxonomy specifications, style and maintenance
    • Taxonomy usage and indexing/tagging/categorization policy (manual or automated)
Reflecting on the different taxonomy jobs I have had and projects I have worked on, taxonomy governance has taken many forms beyond the obvious of documenting the taxonomy editorial policies. Even though I did not hear of taxonomy governance until I had been working for years with taxonomies, I actually had been involved with governance for many years prior, just not by that name. My first job working with taxonomies (called then controlled vocabularies) was with the title of Vocabulary and Quality Management Specialist. In addition to maintaining the controlled vocabularies according to prescribed procedures, my duties included writing guidelines for the indexers using the vocabularies, especially for new topics and current events, and checking the published content for possible vocabulary-related quality issues. At my next employer, a developer of search software with built-in taxonomies, documenting how to create the taxonomies in a consistent style was simply a part of the documenting how to use the software. Later, on an assignment with a consulting firm, on ongoing contract involved making regular updates to ecommerce client's product taxonomy, following a certain procedure and workflow that was tracked in SharePoint. Finally, in more recent years as an independent taxonomy consultant, I have made sure that taxonomy editorial policies and maintenance guidelines are always a part of my project plans.

When a taxonomy project is short on time or budget, there may be a temptation to skip the governance documentation and planning. But in the long term, that will cost more. Time will be wasted by the taxonomy editors going back through old emails to try to find out what was decided when individual questions came up. Taxonomy editors will also waste time having to redo some of their work, after realizing that they were not following a consistent style or policy. Finally, and most crucially, lack of governance will likely result in an inconsistently developed taxonomy, which in turn leads to inconsistent indexing/tagging, no matter the method used. Then the main purpose of the taxonomy is defeated.

Taxonomy governance might not be as hot a topic as it was a few years ago, but that's only because it has become standard, accepted practice. Yet there is still a lot that an organization owning a taxonomy can learn about governance in the form of best practices and case studies. While organizations may not want to share their taxonomies, as intellectual property, hopefully they will share their experiences and tips on taxonomy governance.



Tuesday, January 22, 2013

Taxonomy Management Consulting

I recently wrote an article on taxonomy management for the online magazine FreePint. By “taxonomy management” I mean taxonomy maintenance, governance, and long-term planning.  I’m not going to repeat that article here, because you can look it up. The short version is available without a subscription: “The Care and Feeding of Taxonomies: Taxonomy Management.” In summary, in the long version I discussed:
  • The reasons for managing a taxonomy
  • The distinction between taxonomy development and taxonomy management
  • The parts of an organization responsible for taxonomy management
  • Factors in selecting a taxonomy management software system
  • Components of taxonomy editorial policies
  • Components of taxonomy maintenance procedures and a governance plan

Writing this article got me thinking about the role of taxonomy consulting in taxonomy management. As more and more taxonomies get created, over time, the need for new taxonomy creation may diminish, while the need for better taxonomy management increases. This should be good news for those in the taxonomist profession, especially for those who serve as in-house staff taxonomists. As for those of us who are taxonomy consultants, there is still a role, just a slightly different one.

The design and creation of a new customized taxonomy is an appropriate task for an external consultant because it:
  • is a limited-term project that needs extra assistance while existing staff probably lacks the time
  • requires a specialized skill that perhaps no one on staff has
  • can benefit from an external point of view that is not biased, but can appreciate the perspectives of various users.
The ongoing maintenance of a taxonomy, on the other hand, is best suited for an internal staff taxonomist or information specialist, who:
  • can be immediately responsive to changing needs or circumstances
  • is familiar with the subject matter of the organization when it comes to additions or changes of highly specific taxonomy terms
  • can devote at least a little time each day or week as needed, but the time can be flexible.
Consultants still have a role in maintenance. They can study the issues and write the taxonomy editorial policies, indexing policies, maintenance plans, governance plans etc. In fact, this is where taxonomy consultants really serve as consultants, and not merely as taxonomy designers and initial developers. Sometimes I think the designation of “consultant” is a bit of a stretch for someone spending most of their time actually building taxonomies. On all of my taxonomy projects, however, I also do provide advice and suggestions, so do some consulting all along. Taxonomy management, though, relies more heavily on actual consulting services.

Even though the needs of many organizations are shifting from taxonomy design and creation to taxonomy maintenance and revision, there exists a lot more information (books, articles, workshops, presentations, etc.) on taxonomy design than on taxonomy management. The relative lack of written sources on taxonomy management is another reason why a taxonomy consultant can be especially helpful. Finally, like taxonomy design, taxonomy management plans and procedures also need to be tailored to the circumstances of a specific organization.

When I create a taxonomy, I take a personal interest in it and hope that it will have a long useful life, so I want to create taxonomy guidelines and maintenance plans that are part of taxonomy management to ensure that my good work is kept up to date. If I don’t create an organization’s taxonomy, I am still just as interested in providing guidance for improving and maintaining what taxonomy exists, because my ultimate interest in taxonomies is seeing them get used and being useful.