Friday, March 6, 2020

Taxonomies and the Digital Employee Experience (DEX)

Helping employees find information within their organizations is one of the uses of taxonomies. Implemented in an ECM, SharePoint, or other intranet platform, taxonomy terms can link users to desired content more precisely and comprehensively than search alone. I wrote about intranet taxonomies in a recent blog post, “Intranet and ECM Taxonomies.” In the meantime, I have gained a better appreciation for the efforts to improve the digital workplace after attending a conference on the subject, the IntraTeam Event, held March 2-4 in Copenhagen, Denmark (it also meets in Stockholm, Sweden, in the fall), put on by the company IntraTeam. Now in its 15th year, the conference is in the process of rebranding itself as the European DEX conference. DEX stands for digital employee experience. I’ve participated in a digital experience conference before, but that conference focused more on the customer digital experience than the employee digital experience.

I learned several things about digital employee experience, especially through the excellent keynote presented by James Robertson, consultant of Step Two Designs. James explained that DEX involves content, internal communications, support for corporate culture, collaboration and social tools, and a place for online tasks. He also emphasized that a digital workplace is the sum total of digital interactions within the workplace environment and not merely that between the staff and the organization. I also learned that good digital customer experience depends on good digital employee experience.

IntraTeam Event Copenhagen keynote presentation by James Robertson, March 3, 2020
I’m not going to summarize the conference, because others have already done that, including Steve Bynghall, who wrote “Six takeaways from the IntraTeam 2020 Conference on DEX,” and Fredric Landqvist, who wrote “The emerging digital Work Experience,” where he mentioned "One challenge still remains, Findability."

As for the role of taxonomies, they serve findability and can help employees find not only content but also online spaces where they can perform activities and collaborate. Another way to look at taxonomies in support of DEX is that taxonomies can and should be designed with the users' needs and experience in mind. This is what gives taxonomy design for internal users an advantage over designing taxonomies for external users: we have access to the users and can talk to them about what they need and desire, and thus the taxonomy can be suited to the employee user experience. It is typical in an internal taxonomy project to interview numerous users of the intranet, not merely about the content they create but also about what information they seek and what online tasks they perform. By contrast, externally facing taxonomy creation does not usually involve gathering any information directly from customers or other external visitors of the website. So, when creating an internal taxonomy, I ask employees what topics, document types, and intranet pages they often look for and what they most often use the intranet for.

I recently completed a taxonomy project for an organization’s SharePoint intranet and thus presented at this conference (in addition to a pre-conference workshop on taxonomies) on the subject of taxonomies for SharePoint. Questions from the audience afterwards focused on the issues of tagging with the taxonomy. Since a positive digital employee experience is important, I would advise not making tagging mandatory for everyone, but rather delegating the responsibility to a couple of people within each business unit who have the interest and (with training) the aptitude for tagging. They may also take a more active role in making suggestions for new terms or other improvements to the taxonomy.

While I enjoyed the opportunity to travel to Copenhagen, I also hope to see DEX conferences in the United States. For now, there tend to be conferences on digital experience, but not focused on employees, and conferences on the digital workplace, but not focused on the experience of the employees.

Sunday, February 9, 2020

Classification Systems vs. Taxonomies

Is a taxonomy the same as a classification scheme or system? Or, to put it another way, is a classification system, such as the Dewey Decimal System, a kind of taxonomy? Both of these kinds of knowledge organization systems have the feature of arranging topical terms in a hierarchy of multiple levels, without having related-term relationships or necessarily synonyms/nonpreferred terms, which are features of thesauri. So, it appears as if the only difference is that classification systems have some kind of notation or alphanumeric code associated with each term, and taxonomies do not. The differences, however, are greater than that.

Classification systems

The codes/notations in classifications are not merely shortcut conveniences. They represent a way to divide up the area of knowledge into broad classes, sub-classes, sub-sub-classes, etc. The codes/notations are not an after-thought but are planned from the beginning of the design of a classification system.

The classification is comprehensive; everything in the subject domain is covered with a classification code + label. There is often not a lot of room for expansion, except for a few unused sub-unit codes in each area for new topics. The word classification means to put into a predefined class or grouping. The approach to classification is thinking “where does this go?” (Digital documents may go into more than one classification.)

Classification systems are not just used in libraries, but in corporate settings too, such as for research literature or detailed manufacturing product catalogs. The standard for defining knowledge organization systems for interoperability on the web, the Simple Knowledge Organization System (SKOS), developed by the World Wide Web Consortium (W3C), recognizes classification systems by having a designated element for “notation.”
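To illustrate how a notation carries the class structure, here is a minimal Python sketch. It assumes simplified three-digit, Dewey-style codes; the small `CLASSES` table and the `broader_notations` helper are hypothetical names made up for this example, not part of any classification standard or software.

```python
# A few Dewey-style notations with their labels (a tiny illustrative subset).
CLASSES = {
    "500": "Natural sciences and mathematics",
    "510": "Mathematics",
    "516": "Geometry",
}


def broader_notations(notation: str) -> list[str]:
    """Derive the broader classes from a three-digit notation alone.

    Assumes the simplified convention that the first digit names the main
    class and the second digit names the division.
    """
    chain = []
    if notation[2] != "0":
        chain.append(notation[:2] + "0")   # section -> division
    if notation[1] != "0":
        chain.append(notation[0] + "00")   # division -> main class
    return chain


# The code "516" already tells us where the class sits in the hierarchy.
for code in ["516"] + broader_notations("516"):
    print(code, CLASSES[code])
```

The point of the sketch is that the position in the hierarchy can be read from the code itself, which is why the notation has to be planned from the start rather than added later.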


Taxonomies

A taxonomy is a kind of knowledge organization system that has its terms hierarchically related to each other. The starting point in creating a taxonomy might be a few top terms or facets, but then the focus of taxonomy development is on the specific terms needed, rather than the division of a domain into classes and subclasses, etc. What this means is that the terms do not have to comprehensively cover the subject domain in an abstract manner. Rather the terms have to “cover” the topics appearing in the body of content to be tagged with the taxonomy.

The taxonomy is used for tagging or indexing, not for classification or cataloging. So, rather than thinking where (into what class) does this document go, the question is, what is/are the main topic(s) of this document. The topics might not fall into neat balanced hierarchies. For example, an intranet taxonomy might have a term for Temporary employees, because there are some human resources policies dealing with this topic specifically, but have no term for Full-time employees, since that is the default, and the term would not be useful (and likely inconsistently tagged).

Taxonomies vs. Classification Systems Comparison Table

Different mindsets

Lumpers and splitters historically represent two opposing viewpoints in categorization and classification: whether you "lump" items into large categories, focusing on the similarities, or "split" items into more, smaller categories, focusing on the differences. Of course, there is often a combination of both approaches, but it is my feeling that the design of modern taxonomies tends to involve more lumping, whereas the design of classification systems has involved more splitting.

One of the challenges of working with subject matter experts (SMEs) in building a taxonomy is that SMEs, as experts in their domain, may tend to think of how to classify their domain, and propose a taxonomy that resembles a classification system, even if it lacks the codes/notations. So, it’s very important to provide precise guidelines to SMEs contributing to a taxonomy, explaining that the terms are intended for tagging common topics that appear in the content and are for limiting/filtering search results, and that full classification is not necessary.

Students of library science may also tend to think of classification systems as serving as taxonomies. They learn about classification systems when they study cataloging, and subject cataloging is also about where the book or other library material belongs (often literally, on the shelf). So, even librarians need training on taxonomies and the taxonomy mindset if they want to become taxonomists. I will be giving a taxonomy workshop at the Computers in Libraries conference in March, so I will be sharing these ideas with those who attend.

Monday, January 13, 2020

Intranet and ECM Taxonomies

In designing a taxonomy for tagging and retrieving content in intranets or in an enterprise content management (ECM) system, there is a fundamental question of whether to strive for creating a single comprehensive taxonomy to be applied throughout the enterprise or to have multiple specific taxonomies for different sets of content and different groups of users within the enterprise, or both. This question involves not only issues of information usability and user experience but also a mindset, which could involve a goal of “breaking down silos” by having a single enterprise taxonomy or one of encouraging “democracy” among organizational units and letting them create their own local taxonomies or term sets (with training).

The main advantage of a single, global taxonomy is to enable users to effectively search and refine/filter results across all the content within an enterprise system using the same parameters. Users then don’t need to know in which intranet site or sub-site the desired content is to be found. Users need only become familiar with a single taxonomy, not several, so it is easier to use. Content can be better shared and discovered.

On the other hand, multiple, more specific taxonomies can also be of value, providing more precise retrieval results for users who know where and how to search with them. In many organizations, there are very specific sets of documents, for which a specific taxonomy would aid in retrieval, yet they can be of value to any employee. For example, in an organization that conducts research, these could be research reports or profiles of experts. In an organization that provides services, these could be documents of service descriptions, procedures, and policies. In an organization with a large sales operation, these could be all the documents that support salespeople. The design of a taxonomy should reflect the nature and the scope of the content and the needs of all users. Content in specialized repositories (research reports, experts, service documents, sales support documents, etc.) ought to have customized taxonomies to more fully support the best options in retrieval. For example, a taxonomy for research reports needs to be detailed in research subject areas. A taxonomy for experts would include areas of expertise, departments, locations, and job titles. A taxonomy for service support documents needs to be detailed in types of services and document types and should also include a set of terms for market segment. A set of taxonomies in support of sales should likely include product categories, sales function or process stage, market, and customer type. Meanwhile, a “generic” taxonomy, to be used across the organization, might be based on departments and types of functions/activities, along with general document types and topics.

It may be unclear who should decide and how the decision should be made regarding global, enterprise vs. specific, departmental taxonomies. The decision should probably be left to those in the organization who lead knowledge management or content strategy. The IT department, which sets up the intranet, ECM, or SharePoint system, may have influence in this matter, based on how they choose to configure the system. There can also be uncertainty and ambivalence over which taxonomy approach to take. During my interview with stakeholders for a recent SharePoint taxonomy consulting project, a lead IT stakeholder said that there was no policy, but that they “encourage” departments to use the same topical taxonomy. Yet at the same time, they also “create a local classification, but don’t encourage a local classification.”

Approaches to intranet taxonomy design

Let’s look more closely at the various options for intranet taxonomy design.

1. Create a general enterprise-wide taxonomy and various specific taxonomies
Benefits: Taxonomies are suited to the content
Drawbacks: Has silos and less sharing. Users outside of a department may not be familiar with the departmental taxonomy.

2. Create a single comprehensive taxonomy (or set of taxonomies/facets) to cover all the internal information needs of the organization.
Benefits: There is more sharing and ease of having a single taxonomy of terms for users to refine searches by.
Drawbacks: Tagging is more difficult with a large and potentially confusing taxonomy, where sections of the taxonomy are irrelevant to some sets of content, and some terms may have been intended for one purpose but get used for another.

Other options are more creative, and hopefully IT can customize the content management and search software accordingly to support them.

1. Create an enterprise-wide taxonomy, as a master taxonomy, which is both general and specific, and various specific taxonomies, and map the specific taxonomies, term-by-term, to the master taxonomy which includes all terms. Those who tag only need to use their appropriate specific taxonomies, but those who search, making use of the master taxonomy, can have a “federated search” experience allowing discovery and retrieval across the enterprise.

2. Create a single comprehensive taxonomy with branches that can be hidden from display to those tagging content which does not require the terms from those branches of the taxonomy. This makes it easier for those who tag, not being overwhelmed with a very large taxonomy, much of which is not relevant to their content, and contains terms which could potentially be confusing and misused.
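The first of these two options, mapping specific taxonomies term-by-term to a master taxonomy, can be sketched in a few lines of Python. This is only an illustration under assumptions: the taxonomy names, terms, documents, and the `search_by_master_term` function are all hypothetical, and a real intranet or ECM platform would handle the mapping in its own search configuration.

```python
# Hypothetical term-by-term mapping:
# (specific taxonomy, local term) -> master taxonomy term.
LOCAL_TO_MASTER = {
    ("research", "Market studies"): "Market research",
    ("sales", "Market intelligence"): "Market research",
    ("sales", "Customer types"): "Customer types",
}

# Each document is tagged only with terms from its own specific taxonomy.
DOCUMENTS = [
    {"title": "Q3 market study", "taxonomy": "research", "tags": ["Market studies"]},
    {"title": "Competitor briefing", "taxonomy": "sales", "tags": ["Market intelligence"]},
    {"title": "Segment overview", "taxonomy": "sales", "tags": ["Customer types"]},
]


def search_by_master_term(master_term, documents, mapping):
    """Return titles of documents whose local tags map to the master term."""
    hits = []
    for doc in documents:
        mapped = {mapping.get((doc["taxonomy"], tag)) for tag in doc["tags"]}
        if master_term in mapped:
            hits.append(doc["title"])
    return hits


# One master term retrieves content tagged in two different local taxonomies.
print(search_by_master_term("Market research", DOCUMENTS, LOCAL_TO_MASTER))
```

Those who tag never see the master taxonomy in this sketch; only the search side consults the mapping, which is what gives the "federated search" effect described above.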

As I was struggling with my current client over whether to make a large taxonomy (500-600 terms) available for tagging in all SharePoint sites, even though it was relevant to only a minority of the sites, the IT stakeholder informed me that for designated sites he could limit the display of the taxonomy for tagging to just one top-level branch and hide the rest. Although no more than one branch could be displayed with this method, which would affect the hierarchical design of the taxonomy, this was the best compromise solution in this case.
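The idea of showing only one top-level branch to those who tag can be pictured with a small Python sketch. It is not SharePoint's API or configuration; the nested dictionary, the term labels, and the `terms_for_site` function are hypothetical and only model the behavior described above.

```python
# Hypothetical taxonomy as nested dictionaries: term -> narrower terms.
TAXONOMY = {
    "Human resources": {
        "Benefits": {},
        "Temporary employees": {},
    },
    "Research": {
        "Research reports": {},
        "Expert profiles": {},
    },
}


def terms_for_site(taxonomy, visible_branch):
    """List the terms offered for tagging when only one branch is displayed."""
    def walk(label, narrower):
        yield label
        for child, grandchildren in narrower.items():
            yield from walk(child, grandchildren)

    return list(walk(visible_branch, taxonomy[visible_branch]))


# A research site shows only the "Research" branch; the rest stays hidden.
print(terms_for_site(TAXONOMY, "Research"))
```

Because only a single branch can be displayed per site, the terms a site needs have to sit together under one top-level term, which is the impact on the hierarchical design mentioned above.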

I look forward to sharing and learning more about taxonomies for intranets at the upcoming IntraTeam Event Copenhagen: The European DEX Conference, where I will be giving a pre-conference workshop "Taxonomy Design & Creation."

Saturday, December 28, 2019

Taxonomy Licensing Interest

Just over a year ago I blogged on the topic of Taxonomy Licensing. I explained that usually a customized taxonomy is best, but occasionally licensing a taxonomy is an option worth considering in certain circumstances: as a starting point to then modify, to serve as a single facet in a faceted taxonomy, or to index content from various external sources on a defined topic area for which a good taxonomy exists. There are issues, though, such as whether the right kind of taxonomy exists and whether the license permits modification of the taxonomy.

Various organizations, companies, and even individuals have created taxonomies or other controlled vocabularies, which they have made available for license.  Whether it’s worthwhile for them to promote taxonomies that are for license is uncertain. So, a year ago I created an online survey of taxonomy (or more broadly, any controlled vocabulary) licensing interest, which I announced not only on this blog, but also the blogs of taxonomy software vendors and at various conferences. The survey stayed open for about 6 months, and there were over 60 responses to most questions.  Now it is time to share those results. Although the responses are in the context of licensing controlled vocabularies, some of the questions and responses--about the taxonomy purpose, type or subject area of interest--might reflect general interest in taxonomies. (Percentages have been rounded.)

The first question asked about interest in licensing taxonomies or other controlled vocabularies. Slightly more than half of the respondents (61%) have considered licensing taxonomies, but most have not gone any further in identifying appropriate taxonomies to license. The leading reasons given by those respondents who said they would not be likely to license a taxonomy (22 respondents out of 66) were:

  1. Custom-created taxonomies would best serve my purposes: 59%
  2. Licensed taxonomies that are modifiable and permit commercial reuse are too expensive: 14%

The leading concerns regarding licensing a taxonomy, ranked in order, were the following:

  1. Difficulty finding or lack of a suitable taxonomy
  2. Difficulty integrating a licensed taxonomy into an existing taxonomy or taxonomy set
  3. Effort to modify, adapt, and/or expand a licensed taxonomy
  4. Licensing fee cost
  5. Features of the licensed taxonomy missing
  6. File format and implementation issues

The types of controlled vocabularies that respondents are most interested in licensing (allowing multiple responses) were:

  1. Hierarchical taxonomy: 56%
  2.  Controlled vocabulary for part of a faceted taxonomy: 55%
  3.  Ontology: 40%
  4.  Thesaurus: 35%
  5.  Name authority file (companies, places, organizations, person names, etc.): 17%
  6.  Classification scheme (such as with alpha-numeric codes): 10%

The subject areas of controlled vocabularies that respondents are most interested in licensing (allowing multiple responses) were:

  1. Business/management/enterprise functions: 36%
  2. Information technology/computing: 30%
  3. Industries: 28%
  4. Company or organization names: 26%
  5. Products/services: 23%
  6. Health/medicine: 21%
  7. Geographic places: 21%
  8. Engineering & design: 20%
  9. Law & policy: 20%
  10. Science & math: 18%
  11. Humanities & social sciences: 13%
  12. Occupations or job titles: 13%

Finance was a popular write-in option under “Other.”

The purposes that respondents said a licensed controlled vocabulary would serve (allowing multiple responses) were:

  1. Internal content management and search & retrieval: 82%
  2. Business intelligence/market research/competitive intelligence/data analysis: 32%
  3. Expertise identification: 24%
  4. Public/website content findability – commercial: 21%
  5. Education/research: 19%
  6. Ecommerce or B2B: 18%
  7. Public/website content findability – nonprofit: 15%
  8. Public/website content findability – government: 8%

The size ranges of a controlled vocabulary that respondents said they would be interested in licensing (allowing multiple responses) were:

  1. 1,000 - 5,000 concepts: 33%
  2. More than 10,000 concepts: 26%
  3. 500 - 1,000 concepts: 21%
  4. 5,000 - 10,000 concepts: 21%
  5. Fewer than 100 concepts: 17%
  6. 100 - 500 concepts: 14%

The formats of a controlled vocabulary that respondents said they would be interested in licensing (allowing multiple responses, especially since some of these formats are not mutually exclusive) were:

  1. XML: 44%
  2. Unsure: 39%
  3. Excel (xls or xlsx): 34%
  4. SKOS: 32%
  5. CSV: 26%
  6. RDF: 24%
  7. JSON: 24%
  8. OWL: 16%
  9. Turtle: 11%
  10. Zthes: 8%

The leading industries of respondents were:

  1. Consulting/professional services: 18% (Perhaps taxonomy consultants, like me?)
  2. Nongovernmental/nonprofit: 18% (Perhaps because licensing restrictions for commercial re-use are not an issue.)
  3. Software/Hardware/IT: 13%
  4. Manufacturing/Construction/Engineering: 10%

Additionally, 10 other individual industries were indicated with only 2-3 individual responses each.

Conclusions from the survey include:

  • Concerns around licensing are shared, and there is no dominant single concern.
  • Hierarchical taxonomies and vocabularies for facets of faceted taxonomies are the types of most interest.
  • The subject area of greatest interest is business/management/enterprise functions.
  • Internal content management is the leading purpose for controlled vocabulary licensing.
  • Sizes of vocabularies of interest include all ranges, but the mid-range dominates.
  • Industries interested in vocabulary licensing vary, and none dominates.
  • XML and CSV/Excel are the formats of greatest interest, but a significant number of respondents are unsure of the format desired.