The Accidental Taxonomist

Monday, February 25, 2013

Evaluating Taxonomies

In my last blog post, “Taxonomy Management Consulting,” I mentioned that more organizations now have taxonomies, so the need is shifting somewhat from designing and building new taxonomies to managing existing taxonomies. It might not be that simple, however, if the existing taxonomy was created and never used, created for a slightly different purpose or different content, or created by those not sufficiently knowledgeable in taxonomy design best practices. I often find that an organization that has taxonomy consulting needs typically has some pre-existing taxonomies, but they are not adequate for one reason or another.

Any pre-existing taxonomies are important as part of a taxonomy development or redesign process and should be carefully considered. Whether pre-existing taxonomies will be only a source of terms for a new taxonomy or actually the basis of a new taxonomy with some editing depends on how structured, comprehensive, and sound these pre-existing taxonomies are.

Structure: Pre-existing taxonomies may be of the type that is a simple flat list of terms with no hierarchy. These are good sources of taxonomy terms but are rarely the basis for the taxonomy.

Comprehensiveness: Often existing taxonomies cover only part of the scope of a desired full or enterprise-wide taxonomy, in which case they will serve as part of the new taxonomy.

Soundness: This concerns the extent to which the taxonomy conforms to standards (such as ANSI/NISO Z39.19) and general best practices, so that it ought to work well with the content it is intended to reference. This is where taxonomy experts can come in and make such determinations.

Evaluation Criteria

Evaluating a taxonomy for soundness typically involves checking off or rating the taxonomy against a set of pre-defined criteria regarding terms, inter-term relationships, and overall structure and design. Some of the most important criteria include the following:

Terms should be unambiguous and clear, yet not too wordy and long. If the taxonomy will be displayed for browsing, then terms should begin with key words and those that come under the same broader term should be in a somewhat consistent grammatical format.
Hierarchical relationships should conform to the ANSI/NISO Z39.19 standard by being one of only three types: generic-specific, instance, or whole-part. Limited exceptions may be acceptable in a corporate taxonomy if they are intuitively logical and justified. (See my blog post “Deviating from Taxonomy Standards.”)
Overall structure and design issues include the number of narrower terms for a broader term being neither too few nor too many (such as 3-20), and the depth of the taxonomy being somewhat balanced and not too deep. For example, three levels deep in some places and four levels deep in others is OK, but two levels in some areas and five levels deep in others is not a well-balanced design.
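
The structural criteria in the last item lend themselves to a rough automated check. Below is a minimal Python sketch, not a standard tool: the dictionary layout, the function names, the sample terms, and the allowed depth gap of one level are all assumptions for illustration, and the 3-20 range is simply the example range given above. It also assumes the hierarchy contains no cycles.

```python
from typing import Dict, List

# Hypothetical representation: each broader term maps to its narrower terms.
Taxonomy = Dict[str, List[str]]


def leaf_depths(taxonomy: Taxonomy, term: str, depth: int = 0) -> List[int]:
    """Return the level of every leaf below `term` (top-level terms are level 1)."""
    narrower = taxonomy.get(term, [])
    if not narrower:
        return [depth]
    depths: List[int] = []
    for child in narrower:
        depths.extend(leaf_depths(taxonomy, child, depth + 1))
    return depths


def evaluate_structure(
    taxonomy: Taxonomy,
    root: str,
    min_narrower: int = 3,    # assumed lower bound, from the "3-20" example
    max_narrower: int = 20,   # assumed upper bound, from the "3-20" example
    max_depth_gap: int = 1,   # assumed: 3 vs. 4 levels is OK, 2 vs. 5 is not
) -> List[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings: List[str] = []
    for term, narrower in taxonomy.items():
        if narrower and not (min_narrower <= len(narrower) <= max_narrower):
            findings.append(
                f"{term}: {len(narrower)} narrower term(s), "
                f"outside {min_narrower}-{max_narrower}"
            )
    depths = leaf_depths(taxonomy, root)
    if max(depths) - min(depths) > max_depth_gap:
        findings.append(
            f"Depth ranges from {min(depths)} to {max(depths)} levels, "
            "which looks unbalanced"
        )
    return findings


# Made-up sample taxonomy; "ROOT" is an artificial container for the top terms.
sample: Taxonomy = {
    "ROOT": ["Products", "Services", "Locations"],
    "Products": ["Hardware", "Software", "Accessories"],
    "Services": ["Repair"],
    "Hardware": ["Laptops", "Desktops", "Tablets"],
    "Laptops": ["Business laptops", "Gaming laptops", "Ultraportables"],
    "Gaming laptops": ["15-inch", "17-inch", "18-inch"],
}

for finding in evaluate_structure(sample, "ROOT"):
    print(finding)
```

A check like this only flags candidates for review. Whether the terms are clear and the relationships are of a valid type still takes a person's judgment.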

Evaluation vs. Testing

Evaluating a taxonomy is not the same as testing a taxonomy. Testing a taxonomy involves using sample content and sample users in a controlled manner and can take considerable time and effort, so it should not be done until after a taxonomy is determined to be generally sound. Evaluating a taxonomy, on the other hand, determines whether it is well constructed regardless of the content or users. Testing focuses on the specific application and use of the taxonomy and will be the topic of a future blog post.

Taxonomy vs. Web Usability Heuristic Evaluation

Even if a numeric rating scale is used, the process is still more judgmental than scientific, and as such may be referred to as a “heuristic” analysis or evaluation. A “heuristic method” generally means evaluation, experimentation, or a trial-and-error method to find something out. The designation of heuristic evaluation has been used in website usability evaluation and from there has been carried over into taxonomy evaluation. User experience expert Jakob Nielsen first introduced the idea of heuristic evaluation to usability design back in 1990, described in his blogpost of 1995: “How to Conduct a Heuristic Evaluation.”

There are several differences, though, between taxonomy evaluation and web user interface evaluation. Although user testing of websites is not that much different from the testing of taxonomies, evaluation of taxonomies requires a more critical and analytical understanding and approach. Website usability evaluation does not require usability design experts, but taxonomy evaluation does require a level of expertise. Nielsen refers to “evaluators”, not experts, who are not much different from user testers. (Rather, the procedures in usability evaluating and testing differ.)

Another difference between website evaluation and taxonomy evaluation is that a website, even if a test dummy site, will have content, even if just mock-up pages with partial filler text, because navigation and content are integrally combined on websites. When a taxonomy, on the other hand, is at the evaluation stage, it is not implemented/linked to content, which makes it more difficult for the non-expert to evaluate. It might appear to look good on paper but not function well when implemented.

Nielsen wrote: “Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles.” If the evaluators are not experts, then it’s easier and more affordable to have multiple evaluators. When a taxonomy requires evaluation, typically just one taxonomy expert is hired, but if you can afford two separate independent expert evaluations of your taxonomy, that’s all the better.

Tuesday, January 22, 2013

Taxonomy Management Consulting

I recently wrote an article on taxonomy management for the online magazine FreePint. By “taxonomy management” I mean taxonomy maintenance, governance, and long-term planning. I’m not going to repeat that article here, because you can look it up. The short version is available without a subscription: “The Care and Feeding of Taxonomies: Taxonomy Management.” In summary, in the long version I discussed:

The reasons for managing a taxonomy
The distinction between taxonomy development and taxonomy management
The parts of an organization responsible for taxonomy management
Factors in selecting a taxonomy management software system
Components of taxonomy editorial policies
Components of taxonomy maintenance procedures and a governance plan

Writing this article got me thinking about the role of taxonomy consulting in taxonomy management. As more and more taxonomies get created, over time, the need for new taxonomy creation may diminish, while the need for better taxonomy management increases. This should be good news for those in the taxonomist profession, especially for those who serve as in-house staff taxonomists. As for those of us who are taxonomy consultants, there is still a role, just a slightly different one.

The design and creation of a new customized taxonomy is an appropriate task for an external consultant because it:

is a limited-term project that needs extra assistance while existing staff probably lacks the time
requires a specialized skill that perhaps no one on staff has
can benefit from an external point of view that is not biased, but can appreciate the perspectives of various users.

The ongoing maintenance of a taxonomy, on the other hand, is best suited for an internal staff taxonomist or information specialist, who:

can be immediately responsive to changing needs or circumstances
is familiar with the subject matter of the organization when it comes to additions or changes of highly specific taxonomy terms
can devote at least a little time each day or week as needed, but the time can be flexible.

Consultants still have a role in maintenance. They can study the issues and write the taxonomy editorial policies, indexing policies, maintenance plans, governance plans etc. In fact, this is where taxonomy consultants really serve as consultants, and not merely as taxonomy designers and initial developers. Sometimes I think the designation of “consultant” is a bit of a stretch for someone spending most of their time actually building taxonomies. On all of my taxonomy projects, however, I also do provide advice and suggestions, so do some consulting all along. Taxonomy management, though, relies more heavily on actual consulting services.

Even though the needs of many organizations are shifting from taxonomy design and creation to taxonomy maintenance and revision, there exists a lot more information (books, articles, workshops, presentations, etc.) on taxonomy design than on taxonomy management. The relative lack of written sources on taxonomy management is another reason why a taxonomy consultant can be especially helpful. Finally, like taxonomy design, taxonomy management plans and procedures also need to be tailored to the circumstances of a specific organization.

When I create a taxonomy, I take a personal interest in it and hope that it will have a long useful life, so I want to create taxonomy guidelines and maintenance plans that are part of taxonomy management to ensure that my good work is kept up to date. If I don’t create an organization’s taxonomy, I am still just as interested in providing guidance for improving and maintaining what taxonomy exists, because my ultimate interest in taxonomies is seeing them get used and being useful.

Sunday, December 30, 2012

The Remote Taxonomist

One of the characteristics of taxonomy work is that taxonomists can work remotely from their managers, colleagues, or clients, and many do. It’s not because those attracted to taxonomy work specifically want to work from home. Rather, taxonomy work is a narrow specialty, in which relatively few people are sufficiently skilled. So, when a taxonomist is needed to fill a position or serve as a consultant or contractor, often the ideal candidate is not to be found locally, and someone qualified, interested, and available lives far away.

Taxonomists are also accustomed to working independently. As an employee, a taxonomist is typically in the role of an “individual contributor” without supervisory reports yet not in a junior position that requires close supervision. In many organizations the taxonomist knows more about taxonomy than his or her supervisor.

Furthermore, taxonomy work lends itself to consulting and contracting work. Taxonomy design and development is of a project nature that requires intense work only temporarily (after which maintenance work can be part-time). Consultants make a number of visits to their client (to conduct interviews or lead workshops), but the bulk of their working time is spent remotely at their own office. Contract or freelance taxonomy editors are needed onsite even less than taxonomy consultants and like other editorial freelancers, indexers, translators, etc., typically never meet a client face-to-face.

Taxonomy work requires the involvement or input of many different people: project sponsors, managers, user interface designers, software engineers, product managers, customer service representatives, indexers, content creators or editors, and sample end-users. In most cases these stakeholders are not located in the same office anyway, so there will inevitably be some degree of remote contacts as a part of taxonomy work. Organizations that require taxonomies tend to be large, and if they are large they tend to have multiple locations. So, the taxonomist will always be remote to some of the taxonomy stakeholders, even if the taxonomist works in the headquarters office. What this means is that even in-house taxonomists develop experience and techniques in working with remote colleagues. If a taxonomist is going to be remote to many stakeholders, the taxonomist could almost as easily be remote to them all.

When I have been in a job-search mode, I have identified suitable positions in other cities and have applied to them with a query about telecommuting. More than once, the hiring manager of a position that did not mention telecommuting as an option was open to the idea of me working remotely from home when I proposed it. It can depend on the position level, though. Junior taxonomists who may require more mentoring are less suitable as remote employees than those who are experienced. On the other end, upper-level positions might also be better served in-house. Recently I noticed a position for a Director of Semantic Services in another city. A director is a somewhat senior position, and while the director could be remote from his or her direct reports, it would probably be better if the director was in the same office as his or her own manager and other senior managers to collaborate on ideas of taxonomy strategy and new opportunities.

If you are trying to decide whether to hire a remote taxonomist, it is important to consider whether that individual has had prior experience in working remotely from home, especially if the position is full-time. The remote worker needs the technology setup, organizational space, and self-discipline to separate work from personal activities. Fortunately, experienced taxonomists tend to have such remote-work experience. The further along a taxonomist is in his or her career, the more likely that person will have had stints of working from home. Thus, it is easier to count on telecommuting experience among senior taxonomists.

I have now worked as a taxonomist from home in various capacities: a full-time job entirely from home, a part-time (30 hours/week) job entirely from home, a full-time job one day at home but for a supervisor and team in an office across the country (the position’s originally posted location), a full-time job originally 4 days in the office but later 4 days at home, and several years of consulting and contracting from home. I wasn’t specifically seeking to work from home, but that’s how it worked out to get and keep the jobs I wanted.

Monday, December 3, 2012

Taxonomies and Content Management

Taxonomies are relevant to various applications, implementations, software products, disciplines, and industries, whereas taxonomy itself is not really a discipline or industry. This is apparent in how taxonomy shows up as a topic in presentation sessions at many different conferences. These include conferences in the fields of knowledge management, enterprise search, content management, digital asset management, semantic technologies, text analytics, document management, records management, indexing, information architecture, and user experience.

Content management and content technology were the subject of the most recent conference I attended, the Gilbane Conference in Boston, November 28-29. The Gilbane Conference, now in its 9th year, takes place annually the week after Thanksgiving (end of November or beginning of December) in Boston and often also in San Francisco in May or June. The conference, named after its founder and chair, Frank Gilbane, has the tag-line “Content, Collaboration & Customers – Managing & Enhancing Experience.” Sessions are divided into four tracks: (1) Customers & Engagement, (2) Colleagues & Collaboration, (3) Content Technologies & Infrastructure, and (4) Web & Mobile Publishing.

Taxonomies at this year’s Gilbane conference were the focus of two presentations, and were mentioned in many others. Just as content management strategies and systems may be specialized for either internal/enterprise content or for external/public web content, so may taxonomies be applied either internally or externally (and sometimes both). So, it was appropriate that one presentation on taxonomies, “Value of Taxonomy Management: Research Results” by Joseph Busch, focused on enterprise content taxonomies, and the other, “Taxonomies for E-Commerce,” which I presented, focused on public website taxonomies.

The connection between taxonomies and content management is a very important one. A taxonomy does not do much good when it stands alone. Its purpose is typically to facilitate findability and retrieval of specific content, whether by browsing or searching. On the other side, content is not of much use if it cannot be found. Content management refers to managing the workflow and lifecycle of content from the planning stage and creation/collection stage through the disposition/archiving stage, with an analysis/evaluation stage bringing it full-circle. There is typically a sub-phase for content organizing, categorizing, metadata-assigning, or indexing. This is where taxonomy comes in: to provide structured categories and/or to provide a consistent vocabulary for metadata and indexing.

The field of content management is often defined in terms of its products: content management systems (CMS) and their variations, which include enterprise content management (ECM)/document management systems and Web Content Management (WCM) systems. The software vendors are an important part of conferences, such as Gilbane, and are also the subject of analysis and comparison by industry analysis firms such as The Real Story Group, CMS Watch, IDC, Forrester Research, and the Digital Clarity Group. Content management tools do include capabilities for managing taxonomies, vocabularies, or metadata, but the capabilities vary. For anything but a simple or small taxonomy, it might be preferable to create the taxonomy externally in a dedicated taxonomy management tool and then import it into the content management system. The limitations of a content management system in the area of taxonomy management, therefore, should not necessarily limit the taxonomy.

Content management and content management systems focus on processes, and that is a good way to look at taxonomies, too. Taxonomies are not static, but need to follow a life cycle, as does content: planned and designed, developed and edited, possibly translated, published or implemented, used in tagging, then used in browsing and searching, and finally reviewed and analyzed for further revision. Governance is also important for both content management and taxonomy management.

The biggest challenge to integrating taxonomies with content management strategy and systems is not technical but rather in human resources. A lot of time, energy, and money is put into selecting and implementing a content management system and planning a content strategy around it. Taxonomy is only one piece of the puzzle, and may not always get the investment of time and money it deserves for a full and proper design and development. However, the better a taxonomy is designed, the better it works.

Monday, November 26, 2012

E-Commerce Taxonomies

Happy Cyber-Monday! Coincidentally, this week, which is cyber-week for some retailers, I am giving a conference presentation, at Gilbane in Boston on November 29, on “Taxonomies for E-Commerce.”

As online shopping grows, the organization of products for sale on e-commerce websites becomes increasingly important, and there is also more standardization. Websites present the option to either search (used by customers who know what they want and what to call it) or browse (used by customers who are not sure about what they want or what to call it). For holiday gift shopping, browsing tends to be more common than usual, so displayed taxonomies take on a particularly high visibility at this time.

For browsing, e-commerce websites typically organize their products into hierarchical categories, which are then narrowed by the use of facets. Top-level categories correspond to “departments” and could be as few as 2-3 for a specialty retailer or as many as 12-17 for a general/mass merchandise retailer. Usually the hierarchy extends one or two more levels deeper, although a very large retailer may find the need for an occasional fourth level.

At the lower levels of the hierarchy, the customer may then refine the set of products by use of facets (also known as attributes, filters, refinements, dimensions, “limit by,” or “narrow by”). The facets are for characteristics that cut across multiple categories. Facets may be for size, color, price range, material, brand, style, special features, and perhaps even customer rating. These facets will vary depending on the department or broader category type. The terms within a facet, known as “facet values” or “attribute values,” are usually in a flat list. The user selects a value from each of multiple facets in combination. In some cases, if check boxes are provided, the user is permitted to select more than one value from within the same facet.
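
The way facet selections combine can be sketched in a few lines of Python. This is only an illustration under assumptions: the product records, facet names, and the `refine` function are made up, and real e-commerce platforms implement faceting in their own ways. The sketch treats multiple values checked within one facet as alternatives and requires a match in every facet that has a selection.

```python
from typing import Dict, List, Set

# Hypothetical product record: facet name -> facet value, plus a "name" field.
Product = Dict[str, str]


def refine(products: List[Product], selections: Dict[str, Set[str]]) -> List[Product]:
    """Keep the products that match every facet with a selection.

    Values within one facet are alternatives (check-box style, OR);
    different facets are combined (AND). An empty set means no restriction.
    """
    return [
        product
        for product in products
        if all(
            product.get(facet) in values
            for facet, values in selections.items()
            if values
        )
    ]


# Made-up products within a single "Shirts" category.
shirts: List[Product] = [
    {"name": "Oxford shirt", "color": "blue", "size": "M", "material": "cotton"},
    {"name": "Flannel shirt", "color": "red", "size": "L", "material": "cotton"},
    {"name": "Dress shirt", "color": "white", "size": "M", "material": "linen"},
    {"name": "Polo shirt", "color": "blue", "size": "L", "material": "polyester"},
]

# Two colors checked in the color facet, one value chosen in the size facet.
selected = refine(shirts, {"color": {"blue", "white"}, "size": {"M"}})
print([product["name"] for product in selected])
```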

Typically retailers are more concerned about the selection and implementation of technology than about the design of the taxonomy. After all, a hierarchical taxonomy of products would appear simple to design, and even the facets are not too challenging to develop, especially with lots of competitor e-commerce websites to analyze and compare. However, my experience working as a taxonomy consultant on e-commerce taxonomies has led me to realize that creating and editing e-commerce taxonomies is not as easy as it seems.

My conference presentation discusses seven challenges:

1. Distinguishing a subcategory from a facet value
At the higher levels, categories are obvious. Standard facets (size, color, price range, etc.) are also obvious. But the distinction between the most specific subcategories and specialized facets can get blurred. Can “type” be a facet? Is a “plaid shirt” a subcategory of shirts, or is plaid a value in a “pattern/type” facet? Are gas and electric stoves subcategories of stoves, or is “energy source” a facet of stoves? Factors to consider in making these decisions include user perceptions and the number of existing levels of subcategories and numbers of facets.

2. Different categorization options
There are often product categories that are difficult to classify. For example, do video games belong in “Toys and Games” or in “Electronics”? Does Home Theater belong in the “Television/Video” or the “Audio/Stereo” department? Having the category in both locations, using the polyhierarchy feature of a taxonomy, is possible. But a breadcrumb trail might follow only a single path, not both, and too many polyhierarchies can be confusing to users.
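
To make the breadcrumb issue concrete, here is a small Python sketch of one possible approach, offered as an assumption rather than a description of any particular product: each category lists its broader categories, and the first one listed is treated as the primary location used for the breadcrumb. The data, the function names, and the "Console Games" subcategory are hypothetical.

```python
from typing import Dict, List

# Hypothetical data: each category lists its broader categories.
# By assumption, the first one listed is the primary path for breadcrumbs.
broader: Dict[str, List[str]] = {
    "Electronics": [],
    "Toys and Games": [],
    "Video Games": ["Electronics", "Toys and Games"],  # polyhierarchy
    "Console Games": ["Video Games"],
}


def breadcrumb(category: str, broader: Dict[str, List[str]]) -> List[str]:
    """Follow only the primary broader category at each step."""
    trail = [category]
    while broader.get(trail[0]):
        trail.insert(0, broader[trail[0]][0])
    return trail


def all_locations(category: str, broader: Dict[str, List[str]]) -> List[List[str]]:
    """List every path under which the category can be browsed."""
    parents = broader.get(category, [])
    if not parents:
        return [[category]]
    return [
        path + [category]
        for parent in parents
        for path in all_locations(parent, broader)
    ]


print(" > ".join(breadcrumb("Console Games", broader)))
for path in all_locations("Console Games", broader):
    print(" > ".join(path))
```

The customer can reach the category from either department, but the breadcrumb always shows the primary path, which is one reason a polyhierarchy can be disorienting.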

3. Related items
E-commerce taxonomies are hierarchical and generally do not have associative/non-hierarchical relationships between categories. Such relationships are not needed in most cases, but accessories to products and related services (installation, repair, etc.) are clearly related to specific product categories. Taxonomic standards might have to be ignored if making such categories narrower to their main product is the only option. But other, creative display options might be possible.

4. Sort order options
Generally a long list of terms, over a dozen, is easier to scan if alphabetized, whereas a short list of under a dozen terms is better suited to some other prescribed “logical” order. Sort order inconsistency will result, however, if the number of subcategories fluctuates. Determining the “logical” order is also a challenge and often centers around what is most important or popular.
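
The rule of thumb above, and the inconsistency it can cause, can be shown with a short Python sketch. The function name, the rank table, and the choice to alphabetize only when there are more than 12 terms are assumptions for illustration; a list of exactly a dozen is a judgment call.

```python
from typing import Dict, List


def display_order(
    terms: List[str],
    logical_rank: Dict[str, int],
    threshold: int = 12,  # assumed cutoff: "over a dozen" gets alphabetized
) -> List[str]:
    """Alphabetize long lists; use the prescribed "logical" order for short ones."""
    if len(terms) > threshold:
        return sorted(terms, key=str.casefold)
    # Terms without a prescribed rank go last, alphabetically (an assumption).
    unranked = max(logical_rank.values(), default=0) + 1
    return sorted(
        terms,
        key=lambda term: (logical_rank.get(term, unranked), term.casefold()),
    )


# Made-up prescribed order, for example by importance or popularity.
rank = {"Women": 0, "Men": 1, "Kids": 2, "Shoes": 3}

print(display_order(["Sale", "Kids", "Men", "Shoes", "Women"], rank))
```

If the same category later grows past the threshold, the function switches to alphabetical order, which is exactly the kind of inconsistency that appears when the number of subcategories fluctuates.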

5. Competitor website comparisons
For e-commerce taxonomies (unlike enterprise taxonomies), it’s great to be able to compare with competitors. However, often a retailer is somewhat unique, and no single competitor has exactly the same product categories. Furthermore, it’s important to distinguish category and content comparison from design comparison. Design may be an extension of a retailer’s overall unique brand graphic design.

6. Web site vs. physical store organization
Physical (“brick and mortar”) stores have their own organization for products that might not work online, but there may be pressure to mimic physical store organization to provide a consistent user experience. While it may make sense to have the biggest sellers up front or at the top of the list, product size (a factor in physical store organization) should not necessarily be a factor in online organization.

7. Business needs vs. taxonomy best practices
Online merchants might want to make certain product categories more prominent, by changing the sort order, adding polyhierarchy locations, or even moving a subcategory up a level. It’s important to keep the integrity of the taxonomy intact, though, so that it remains intuitive for the customers to use.

In sum, product taxonomies are not as simple to create as might be expected. Taxonomy design may be under constraints, and business needs can challenge taxonomy standards. Creative solutions may be needed, and customer perspectives need to be considered through creating personas and/or through user testing.