The Accidental Taxonomist

Monday, March 11, 2013

Testing Taxonomies

As mentioned in my previous blogpost, “Evaluating Taxonomies,” taxonomy evaluation and taxonomy testing differ. Evaluation by a taxonomist is needed when a taxonomy has been created by non-taxonomists (such as subject-matter experts), whereas testing is recommended in all cases, no matter who created the taxonomy. Following is an overview of the different kinds of testing that can or should be performed on a taxonomy prior to its implementation.

Card-Sorting

Card-sorting is probably the best known kind of testing, especially now that the prevalence of online card-sorting tools facilitates set-up and enables remote participation. It is not necessarily the best kind of testing for all situations, though. Card-sorting serves to test categorization schemes, so while it is suited for hierarchical taxonomies, it is not so appropriate for faceted taxonomies, especially with regard to how the facets are to interact with each other. It is possible, though, to card-sort test an individual facet, if that facet comprises an internal hierarchy of terms.

There are two kinds of card-sort tests, open and closed. In open card-sorts, the testers group concepts/topics together and then assign a broader category of their own; whereas in closed card sorts, the broad categories are already designated, and the testers merely categorize the specific concepts/topics within those pre-determined categories. Open card-sorting, if chosen, is therefore done earlier in the taxonomy design process, when broad categories are uncertain. A single taxonomy project may have either or both kinds of card-sorting depending on where the greatest need is for this additional input of information. Testers could be test end-users or they could be stakeholders, depending on the needs of the test.
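
As a rough illustration of how closed card-sort results might be summarized, here is a minimal Python sketch. The testers, cards, and categories are all made up, and the "share of testers who agree" measure is just one simple way to read the results, not a standard method. A low share for a card suggests that its category is unclear or that the card belongs in more than one place.

```python
from collections import Counter, defaultdict

# Hypothetical closed card-sort results: each tester assigns every card
# (concept) to one of the pre-determined categories. All names are made up.
results = {
    "tester_1": {"Invoices": "Finance", "Payroll": "Finance", "Onboarding": "HR"},
    "tester_2": {"Invoices": "Finance", "Payroll": "HR", "Onboarding": "HR"},
    "tester_3": {"Invoices": "Finance", "Payroll": "HR", "Onboarding": "HR"},
}


def placement_agreement(results):
    """For each card, return its most-chosen category and the share of testers who chose it."""
    votes = defaultdict(Counter)
    for placements in results.values():
        for card, category in placements.items():
            votes[card][category] += 1

    summary = {}
    for card, counter in votes.items():
        category, count = counter.most_common(1)[0]
        summary[card] = (category, count / sum(counter.values()))
    return summary


agreement = placement_agreement(results)
for card, (category, share) in sorted(agreement.items()):
    print(f"{card}: {category} ({share:.0%} of testers)")
```

In this invented data set, all testers agree on "Invoices" and "Onboarding," while "Payroll" is split between two categories, which is the kind of card worth discussing before the hierarchy is finalized.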

Strictly speaking, card-sorting is not a kind of taxonomy testing but rather a form of taxonomy idea testing. Card-sorting is not performed on a completed taxonomy to test it but rather to test ideas of categories/hierarchies which later will be combined to create the taxonomy. Therefore, card-sorting is not an alternative to the other kinds of testing described below, which may subsequently be done.

Use Testing

Use-testing or use-case-testing is a necessary step after a draft taxonomy is built or nearly completed but before it is finally implemented, allowing for revisions to be made based on the test results. It is at this point that the taxonomy is put to the test to see if it will perform as hoped in search/retrieval and (if applicable) for manual tagging. This type of testing might also be called taxonomy validation.

A cross-section of different kinds of test users should be recruited to prepare several typical use cases and perhaps one especially challenging use case of content search scenarios. The user is then presented with the taxonomy (which can be in any format at this stage, whether on paper, as an Excel file, or as a test web page) and asked to browse the taxonomy to look for terms under which the content for the search scenario might be found. The user performs the test, browsing either in the test administrator’s physical presence or via screensharing, with verbal narration of what the user is doing and why. The test administrator takes notes regarding any problems in finding taxonomy terms for the use case. These findability problems should be considered as requirements for additional terms, additional nonpreferred (variant) terms to point to existing terms, or perhaps more polyhierarchy or associative relationships to help guide the user to find the desired concepts.

If the taxonomy is to be used for manual tagging or indexing, then a second, different set of use testing is needed, whereby users who perform this function should test the taxonomy for indexing of typical and challenging documents that they tend to deal with. Rather than coming up with use “cases”, the test-user-indexers merely need to come up with actual documents. The documents should represent a good cross-section of the various document types indexed. This exercise is even more straightforward than the user testing for finding content, so it could even be performed offline without the test administrator present, as long as the test-user-indexer takes good notes.

A-B Testing

In A-B Testing, the test-users are presented with two different possible scenarios and asked which they prefer. When comparing two different taxonomies or parts of taxonomies, only one or two variations should exist between the two that are compared to make the test clear-cut. You may set up a series of A-B test pairs to compare multiple variations. This kind of test is comparable to what an optometrist does for vision: “Which is better, A or B?” Since only one or two differences should be compared and tested at a time, A-B testing is most suitable to compare proposed top-level categories, rather than getting into the depths of a taxonomy, where it is not practical to conduct a detailed term-by-term comparison. Thus, A-B testing focuses on high-level structural design, navigation and browsing, and not the effectiveness of finding and retrieving content.

A-B Testing can be done at any time in the taxonomy design and build process. It is also very useful when considering a taxonomy redesign for comparing the existing taxonomy (A) to a proposed change (B). A-B Testing is usually done by presenting the test users with graphical or interactive web page mock-ups. I’ve created the B image from an existing online A image by taking a screenshot of A and then editing it in Microsoft’s Paint accessory. Although each individual A-B test is simple, you need to decide what to compare and how many comparison tests to run, since each test takes time and resources.

Conclusions

Taxonomies should be tested, but it’s not true that any test is good. Different tests are for different purposes and fit into different stages of the taxonomy process. An inappropriate test or inappropriately timed test can be a waste of time and money.

Monday, February 25, 2013

Evaluating Taxonomies

In my last blog post, “Taxonomy Management Consulting,” I mentioned that more organizations now have taxonomies, so the need is shifting somewhat from designing and building new taxonomies to managing existing taxonomies. It might not be that simple, however, if the existing taxonomy was created and never used, created for a slightly different purpose or different content, or created by those not sufficiently knowledgeable in taxonomy design best practices. I often find that an organization that has taxonomy consulting needs typically has some pre-existing taxonomies, but they are not adequate for one reason or another.

Any pre-existing taxonomies are important as part of a taxonomy development or redesign process and should be carefully considered. Whether pre-existing taxonomies will be only a source of terms for a new taxonomy or actually the basis of a new taxonomy with some editing depends on how structured, comprehensive, and sound these pre-existing taxonomies are.

Structure: Pre-existing taxonomies may be of the type that is a simple flat list of terms with no hierarchy. These are good sources of taxonomy terms but are rarely the basis for the taxonomy.

Comprehensiveness: Often existing taxonomies cover only part of the scope of a desired full or enterprise-wide taxonomy, in which case they will serve as part of the new taxonomy.

Soundness: This concerns the extent to which the taxonomy conforms to standards (such as ANSI/NISO Z39.19) and general best practices, so that it ought to work well with the content it is intended to reference. This is where taxonomy experts can come in and make such determinations.

Evaluation Criteria

Evaluating a taxonomy for soundness typically involves checking off or rating the taxonomy against a set of pre-defined criteria regarding terms, inter-term relationships, and overall structure and design. Some of the most important criteria include the following:

Terms should be unambiguous and clear, yet not too wordy and long. If the taxonomy will be displayed for browsing, then terms should begin with key words and those that come under the same broader term should be in a somewhat consistent grammatical format.
Hierarchical relationships should conform to the ANSI/NISO Z39.19 standard, with each relationship being one of only three types: generic-specific, instance, or whole-part, with perhaps limited exceptions in a corporate taxonomy that are intuitively logical and justified. (See my blog post “Deviating from Taxonomy Standards”).
Overall structure and design involves issues including the number of narrower terms for a broader term being neither too few nor too many (such as 3-20), and the depth of the taxonomy being somewhat balanced and not too deep. For example, three levels deep in some places and four levels deep in others is OK, but two levels in some areas and five levels deep in others is not a well-balanced design.
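
The structural criteria in the last item lend themselves to a quick automated check before the expert review. The following Python sketch assumes a hypothetical taxonomy stored as nested dictionaries. The 3-20 range comes from the criterion above, while the allowed depth difference of one level is my own assumption based on the "three vs. four levels" example. It flags candidates for a closer look and does not replace a taxonomist's judgment.

```python
# Hypothetical taxonomy as nested dicts: each key is a term and each value
# holds that term's narrower terms (an empty dict means no narrower terms).
taxonomy = {
    "Products": {
        "Hardware": {"Laptops": {}, "Printers": {}, "Monitors": {}},
        "Software": {"Operating systems": {}, "Databases": {}},
        "Services": {},
    }
}

MIN_NARROWER, MAX_NARROWER = 3, 20  # range suggested in the criteria above
MAX_DEPTH_SPREAD = 1                # assumption: leaf depths may differ by one level


def check_structure(tree):
    """Return a list of structural issues found in the nested-dict taxonomy."""
    issues = []
    leaf_depths = []

    def walk(term, children, depth):
        if not children:
            leaf_depths.append(depth)
            return
        count = len(children)
        if count < MIN_NARROWER or count > MAX_NARROWER:
            issues.append(f"'{term}' has {count} narrower terms")
        for child, grandchildren in children.items():
            walk(child, grandchildren, depth + 1)

    for top_term, children in tree.items():
        walk(top_term, children, 1)

    if leaf_depths and max(leaf_depths) - min(leaf_depths) > MAX_DEPTH_SPREAD:
        issues.append(
            f"leaf depth ranges from {min(leaf_depths)} to {max(leaf_depths)} levels"
        )
    return issues


issues = check_structure(taxonomy)
for issue in issues:
    print(issue)
```

In this made-up example, only "Software" is flagged, because it has two narrower terms. The depth check passes because the leaf terms sit at levels two and three.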

Evaluation vs. Testing

Evaluating a taxonomy is not the same as testing a taxonomy. Testing a taxonomy involves using sample content and sample users in a controlled manner and can take considerable time and effort, so it should not be done until after a taxonomy is determined to be generally sound. Evaluating a taxonomy, on the other hand, determines whether it is well constructed regardless of the content or users. Testing focuses on the specific application and use of the taxonomy and will be the topic of a future blogpost.

Taxonomy vs. Web Usability Heuristic Evaluation

Even if a numeric rating scale is used, the process is still more judgmental than scientific, and as such may be referred to as a “heuristic” analysis or evaluation. A “heuristic method” generally means evaluation, experimentation, or a trial-and-error method to find something out. The designation of heuristic evaluation has been used in website usability evaluation and from there has been carried over into taxonomy evaluation. User experience expert Jakob Nielsen first introduced the idea of heuristic evaluation to usability design back in 1990, described in his blogpost of 1995: “How to Conduct a Heuristic Evaluation.”

There are several differences, though, between taxonomy evaluation and web user interface evaluation. Although user testing of websites is not that much different from the testing of taxonomies, evaluation of taxonomies requires a more critical and analytical understanding and approach. Website usability evaluation does not require usability design experts, but taxonomy evaluation does require a level of expertise. Nielsen refers to “evaluators”, not experts, who are not much different from user testers. (Rather, the procedures in usability evaluating and testing differ.)

Another difference between website evaluation and taxonomy evaluation is that a website, even if a test dummy site, will have content, even if just mock-up pages with partial filler text, because navigation and content are integrally combined on websites. When a taxonomy, on the other hand, is at the evaluation stage, it is not implemented/linked to content, which makes it more difficult for the non-expert to evaluate. It might appear to look good on paper but not function well when implemented.

Nielsen wrote: “Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles.” If the evaluators are not experts, then it’s easier and more affordable to have multiple evaluators. When a taxonomy requires evaluation, typically just one taxonomy expert is hired, but if you can afford two separate independent expert evaluations of your taxonomy, that’s all the better.

Tuesday, January 22, 2013

Taxonomy Management Consulting

I recently wrote an article on taxonomy management for the online magazine FreePint. By “taxonomy management” I mean taxonomy maintenance, governance, and long-term planning. I’m not going to repeat that article here, because you can look it up. The short version is available without a subscription: “The Care and Feeding of Taxonomies: Taxonomy Management.” In summary, in the long version I discussed:

The reasons for managing a taxonomy
The distinction between taxonomy development and taxonomy management
The parts of an organization responsible for taxonomy management
Factors in selecting a taxonomy management software system
Components of taxonomy editorial policies
Components of taxonomy maintenance procedures and a governance plan

Writing this article got me thinking about the role of taxonomy consulting in taxonomy management. As more and more taxonomies get created, over time, the need for new taxonomy creation may diminish, while the need for better taxonomy management increases. This should be good news for those in the taxonomist profession, especially for those who serve as in-house staff taxonomists. As for those of us who are taxonomy consultants, there is still a role, just a slightly different one.

The design and creation of a new customized taxonomy is an appropriate task for an external consultant because it:

is a limited-term project that needs extra assistance while existing staff probably lacks the time
requires a specialized skill that perhaps no one on staff has
can benefit from an external point of view that is not biased, but can appreciate the perspectives of various users.

The ongoing maintenance of a taxonomy, on the other hand, is best suited for an internal staff taxonomist or information specialist, who:

can be immediately responsive to changing needs or circumstances
is familiar with the subject matter of the organization when it comes to additions or changes of highly specific taxonomy terms
can devote at least a little time each day or week as needed, but the time can be flexible.

Consultants still have a role in maintenance. They can study the issues and write the taxonomy editorial policies, indexing policies, maintenance plans, governance plans etc. In fact, this is where taxonomy consultants really serve as consultants, and not merely as taxonomy designers and initial developers. Sometimes I think the designation of “consultant” is a bit of a stretch for someone spending most of their time actually building taxonomies. On all of my taxonomy projects, however, I also do provide advice and suggestions, so do some consulting all along. Taxonomy management, though, relies more heavily on actual consulting services.

Even though the needs of many organizations are shifting from taxonomy design and creation to taxonomy maintenance and revision, there exists a lot more information (books, articles, workshops, presentations, etc.) on taxonomy design than on taxonomy management. The relative lack of written sources on taxonomy management is another reason why a taxonomy consultant can be especially helpful. Finally, like taxonomy design, taxonomy management plans and procedures also need to be tailored to the circumstances of a specific organization.

When I create a taxonomy, I take a personal interest in it and hope that it will have a long useful life, so I want to create taxonomy guidelines and maintenance plans that are part of taxonomy management to ensure that my good work is kept up to date. If I don’t create an organization’s taxonomy, I am still just as interested in providing guidance for improving and maintaining what taxonomy exists, because my ultimate interest in taxonomies is seeing them get used and being useful.

Sunday, December 30, 2012

The Remote Taxonomist

One of the characteristics of taxonomy work is that taxonomists can work remotely from their managers, colleagues, or clients, and many do. It’s not because those attracted to taxonomy work specifically want to work from home. Rather, taxonomy work is a narrow specialty, in which relatively few people are sufficiently skilled. So, when a taxonomist is needed to fill a position or serve as a consultant or contractor, often the ideal candidate is not to be found locally, and someone qualified, interested, and available lives far away.

Taxonomists are also accustomed to working independently. As an employee, a taxonomist is typically in the role of an “individual contributor” without supervisory reports yet not in a junior position that requires close supervision. In many organizations the taxonomist knows more about taxonomy than his or her supervisor.

Furthermore, taxonomy work lends itself to consulting and contracting work. Taxonomy design and development is of a project nature that requires intense work only temporarily (after which maintenance work can be part-time). Consultants make a number of visits to their client (to conduct interviews or lead workshops), but the bulk of their working time is spent remotely at their own office. Contract or freelance taxonomy editors are needed onsite even less than taxonomy consultants and, like other editorial freelancers, indexers, translators, etc., typically never meet a client face-to-face.

Taxonomy work requires the involvement or input of many different people: project sponsors, managers, user interface designers, software engineers, product managers, customer service representatives, indexers, content creators or editors, and sample end-users. In most cases these stakeholders are not located in the same office anyway, so there will inevitably be some degree of remote contact as a part of taxonomy work. Organizations that require taxonomies tend to be large, and if they are large they tend to have multiple locations. So, the taxonomist will always be remote to some of the taxonomy stakeholders, even if the taxonomist works in the headquarters office. What this means is that even in-house taxonomists develop experience and techniques in working with remote colleagues. If a taxonomist is going to be remote to many stakeholders, the taxonomist could almost as easily be remote to them all.

When I have been in a job-search mode, I have identified suitable positions in other cities and have applied to them with a query about telecommuting. More than once, the hiring manager of a position that did not mention telecommuting as an option was open to the idea of me working remotely from home when I proposed it. It can depend on the position level, though. Junior taxonomists who may require more mentoring are less suitable as remote employees than those who are experienced. On the other end, upper-level positions might also be better served in-house. Recently I noticed a position for a Director of Semantic Services in another city. A director is a somewhat senior position, and while the director could be remote from those reporting to that manager, it would probably be better if the director was in the same office as that person’s manager and other senior managers to collaborate on ideas of taxonomy strategy and new opportunities.

If you are trying to decide whether to hire a remote taxonomist, it is important to consider whether that individual has had prior experience in working remotely from home, especially if the position is full-time. The remote worker needs the technology setup, organizational space, and self-discipline to separate work from personal activities. Fortunately, experienced taxonomists tend to have such remote-work experience. The further along a taxonomist is in his or her career, the more likely that person will have had stints of working from home. Thus, it is easier to count on telecommuting experience among senior taxonomists.

I have now worked as a taxonomist from home in various capacities: a full-time job entirely from home, a part-time (30 hours/week) job entirely from home, a full-time job one day at home but for a supervisor and team in an office across the country (the position’s originally posted location), a full-time job originally 4 days in the office but later 4 days at home, and several years of consulting and contracting from home. I wasn’t specifically seeking to work from home, but that’s how it worked out to get and keep the jobs I wanted.

Monday, December 3, 2012

Taxonomies and Content Management

Taxonomies are relevant to various applications, implementations, software products, disciplines, and industries, whereas taxonomy itself is not really a discipline or industry. This is apparent in how taxonomy shows up as a topic in presentation sessions at many different conferences. These include conferences in the fields of knowledge management, enterprise search, content management, digital asset management, semantic technologies, text analytics, document management, records management, indexing, information architecture, and user experience.

Content management and content technology were the subject of the most recent conference I attended, the Gilbane Conference in Boston, November 28-29. The Gilbane Conference, now in its 9th year, takes place annually the week after Thanksgiving (end of November or beginning of December) in Boston and often also in San Francisco in May or June. The conference, named after its founder and chair, Frank Gilbane, has the tag-line “Content, Collaboration & Customers – Managing & Enhancing Experience.” Sessions are divided into four tracks: (1) Customers & Engagement, (2) Colleagues & Collaboration, (3) Content Technologies & Infrastructure, and (4) Web & Mobile Publishing.

Taxonomies at this year’s Gilbane conference were the focus of two presentations, and were mentioned in many others. Just as content management strategies and systems may be specialized for either internal/enterprise content or for external/public web content, so may taxonomies be applied either internally or externally (and sometimes both). So, it was appropriate that one presentation on taxonomies, “Value of Taxonomy Management: Research Results” by Joseph Busch, focused on enterprise content taxonomies, and the other, “Taxonomies for E-Commerce,” which I presented, focused on public website taxonomies.

The connection between taxonomies and content management is a very important one. A taxonomy does not do much good when it stands alone. Its purpose is typically to facilitate findability and retrieval of specific content, whether by browsing or searching. On the other side, content is not of much use if it cannot be found. Content management refers to managing the workflow and lifecycle of content from the planning stage and creation/collection stage through the disposition/archiving stage, with an analysis/evaluation stage bringing it full-circle. There is typically a sub-phase for content organizing, categorizing, metadata-assigning, or indexing. This is where taxonomy comes in: to provide structured categories and/or to provide a consistent vocabulary for metadata and indexing.

The field of content management is often defined in terms of its products: content management systems (CMS) and their variations, which include enterprise content management (ECM)/document management systems and Web Content Management (WCM) systems. The software vendors are an important part of conferences, such as Gilbane, and are also the subject of analysis and comparison by industry analysis firms such as The Real Story Group, CMS Watch, IDC, Forrester Research, and the Digital Clarity Group. Content management tools do include capabilities for managing taxonomies, vocabularies, or metadata, but the capabilities vary. For anything but a simple or small taxonomy, it might be preferable to create the taxonomy externally in a dedicated taxonomy management tool and then import it into the content management system. The limitations of a content management system in the area of taxonomy management, therefore, should not necessarily limit the taxonomy.

Content management and content management systems focus on processes, and that is a good way to look at taxonomies, too. Taxonomies are not static but need to follow a life cycle, as does content: planned and designed, developed and edited, possibly translated, published or implemented, used in tagging, then used in browsing and searching, and finally reviewed and analyzed for further revision. Governance is also important for both content management and taxonomy management.

The biggest challenge to integrating taxonomies with content management strategy and systems is not technical but rather in human resources. A lot of time, energy, and money is put into selecting and implementing a content management system and planning a content strategy around it. Taxonomy is only one piece of the puzzle, and may not always get the investment of time and money it deserves for a full and proper design and development. However, the better a taxonomy is designed, the better it works.