The Accidental Taxonomist: 2012

Sunday, December 30, 2012

The Remote Taxonomist

One of the characteristics of taxonomy work is that taxonomists can work remotely from their managers, colleagues, or clients, and many do. It’s not because those attracted to taxonomy work specifically want to work from home. Rather, taxonomy work is a narrow specialty, in which relatively few people are sufficiently skilled. So, when a taxonomist is needed to fill a position or serve as a consultant or contractor, often the ideal candidate is not to be found locally, and someone qualified, interested, and available lives far away.

Taxonomists are also accustomed to working independently. As an employee, a taxonomist is typically in the role of an “individual contributor” without supervisory reports yet not in a junior position that requires close supervision. In many organizations the taxonomist knows more about taxonomy than his or her supervisor.

Furthermore, taxonomy work lends itself to consulting and contracting work. Taxonomy design and development is of a project nature that requires intense work only temporarily (after which maintenance work can be part-time). Consultants make a number of visits to their client (to conduct interviews or lead workshops), but the bulk of their working time is spent remotely at their own office. Contract or freelance taxonomy editors are needed onsite even less than taxonomy consultants and, like other editorial freelancers, indexers, translators, etc., typically never meet a client face-to-face.

Taxonomy work requires the involvement or input of many different people: project sponsors, managers, user interface designers, software engineers, product managers, customer service representatives, indexers, content creators or editors, and sample end-users. In most cases these stakeholders are not located in the same office anyway, so there will inevitably be some degree of remote contact as a part of taxonomy work. Organizations that require taxonomies tend to be large, and if they are large they tend to have multiple locations. So, the taxonomist will always be remote to some of the taxonomy stakeholders, even if the taxonomist works in the headquarters office. What this means is that even in-house taxonomists develop experience and techniques in working with remote colleagues. If a taxonomist is going to be remote to many stakeholders, the taxonomist could almost as easily be remote to them all.

When I have been in a job-search mode, I have identified suitable positions in other cities and have applied to them with a query about telecommuting. More than once, the hiring manager of a position that did not mention telecommuting as an option was open to the idea of me working remotely from home when I proposed it. It can depend on the position level, though. Junior taxonomists who may require more mentoring are less suitable as remote employees than those who are experienced. On the other end, upper level positions might also be better served in-house. Recently I noticed a position for a Director of Semantic Services in another city. A director is a somewhat senior position, and while the director could be remote from those reporting to that manager, it would probably be better if the director was in the same office as that person’s manager and other senior managers to collaborate on ideas of taxonomy strategy and new opportunities.

If you are trying to decide whether to hire a remote taxonomist, it is important to consider whether that individual has had prior experience in working remotely from home, especially if the individual is to be employed full-time. The remote worker needs the technology setup, organizational space, and self-discipline to separate work from personal activities. Fortunately, experienced taxonomists tend to have such remote-work experience. The further along a taxonomist is in his or her career, the more likely that person will have had stints of working from home. Thus, it is easier to count on telecommuting experience among senior taxonomists.

I have now worked as a taxonomist from home in various capacities: a full-time job entirely from home, a part-time (30 hours/week) job entirely from home, a full-time job one day at home but for a supervisor and team in an office across the country (the position’s originally posted location), a full-time job originally 4 days in the office but later 4 days at home, and several years of consulting and contracting from home. I wasn’t specifically seeking to work from home, but that’s how it worked out to get and keep the jobs I wanted.

Monday, December 3, 2012

Taxonomies and Content Management

Taxonomies are relevant to various applications, implementations, software products, disciplines, and industries, whereas taxonomy itself is not really a discipline or industry. This is apparent in how taxonomy shows up as a topic in presentation sessions at many different conferences. These include conferences and fields of: knowledge management, enterprise search, content management, digital asset management, semantic technologies, text analytics, document management, records management, indexing, information architecture, and user experience.

Content management and content technology were the subject of the most recent conference I attended, the Gilbane Conference in Boston, November 28-29. The Gilbane Conference, now in its 9th year, takes place annually the week after Thanksgiving (end of November or beginning of December) in Boston and often also in San Francisco in May or June. The conference, named after its founder and chair, Frank Gilbane, has the tag-line “Content, Collaboration & Customers – Managing & Enhancing Experience.” Sessions are divided into four tracks: (1) Customers & Engagement, (2) Colleagues & Collaboration, (3) Content Technologies & Infrastructure, and (4) Web & Mobile Publishing.

Taxonomies at this year’s Gilbane conference were the focus of two presentations, and were mentioned in many others. Just as content management strategies and systems may be specialized for either internal/enterprise content or for external/public web content, so may taxonomies be applied either internally or externally (and sometimes both). So, it was appropriate that one presentation on taxonomies, “Value of Taxonomy Management: Research Results” by Joseph Busch, focused on enterprise content taxonomies, and the other, “Taxonomies for E-Commerce,” which I presented, focused on public website taxonomies.

The connection between taxonomies and content management is a very important one. A taxonomy does not do much good when it stands alone. Its purpose is typically to facilitate findability and retrieval of specific content, whether by browsing or searching. On the other side, content is not of much use if it cannot be found. Content management refers to managing the workflow and lifecycle of content from the planning stage and creation/collection stage through the disposition/archiving stage, with an analysis/evaluation stage bringing it full-circle. There is typically a sub-phase for content organizing, categorizing, metadata-assigning, or indexing. This is where taxonomy comes in: to provide structured categories and/or to provide a consistent vocabulary for metadata and indexing.

The field of content management is often defined in terms of its products: content management systems (CMS) and their variations, which include enterprise content management (ECM)/document management systems and Web Content Management (WCM) systems. The software vendors are an important part of conferences, such as Gilbane, and are also the subject of analysis and comparison by industry analysis firms such as The Real Story Group, CMS Watch, IDC, Forrester Research, and the Digital Clarity Group. Content management tools do include capabilities for managing taxonomies, vocabularies, or metadata, but the capabilities vary. For anything but a simple or small taxonomy, it might be preferable to create the taxonomy externally in a dedicated taxonomy management tool and then import it into the content management system. The limitations of a content management system in the area of taxonomy management, therefore, should not necessarily limit the taxonomy.

Content management and content management systems focus on processes, and that is a good way to look at taxonomies, too. Taxonomies are not static, but need to follow a life cycle, as does content: planned and designed, developed and edited, possibly translated, published or implemented, used in tagging, then used in browsing and searching, and finally reviewed and analyzed for further revision. Governance is also important for both content management and taxonomy management.

The biggest challenge to integrating taxonomies with content management strategy and systems is not technical but rather in human resources. A lot of time, energy, and money is put into selecting and implementing a content management system and planning a content strategy around it. Taxonomy is only one piece of the puzzle, and may not always get the investment of time and money it deserves for a full and proper design and development. However, the better a taxonomy is designed, the better it works.

Monday, November 26, 2012

E-Commerce Taxonomies

Happy Cyber-Monday! Coincidentally, this week, which is cyber-week for some retailers, I am giving a conference presentation, at Gilbane in Boston on November 29, on “Taxonomies for E-Commerce.”

As online shopping grows, the organization of products for sale on e-commerce websites becomes increasingly important, and there is also more standardization. Websites present the option to either search (used by customers who know what they want and what to call it) or browse (used by customers who are not sure about what they want or what to call it). For holiday gift shopping, browsing tends to be more common than usual, so displayed taxonomies take on a particularly high visibility at this time.

For browsing, e-commerce websites typically organize their products into hierarchical categories, which are then narrowed by the use of facets. Top level categories correspond to “departments” and could be as few as 2-3 for a specialty retailer or as many as 12-17 for a general/mass merchandise retailer. Usually the hierarchy extends one or two more levels deeper, although a very large retailer may find the need for an occasional fourth level.

At the lower levels of the hierarchy, the customer may then refine the set of products by use of facets (also known as attributes, filters, refinements, dimensions, “limit by,” or “narrow by”). The facets are for characteristics that cut across multiple categories. Facets may be for size, color, price range, material, brand, style, special features, and perhaps even customer rating. These facets will vary depending on the department or broader category type. The terms within a facet, known as “facet values” or “attribute values,” are usually in a flat list. The user selects a value from each of multiple facets in combination. In some cases, if check boxes are provided, the user is permitted to select more than one value from within the same facet.
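As a rough illustration of how category browsing and facet refinement combine, here is a minimal Python sketch. The `Product` class, the `refine` function, and the sample catalog are all hypothetical, invented for this example rather than taken from any particular e-commerce platform. It assumes that values selected within one facet are OR-ed (the check-box case) while selections across different facets are AND-ed.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Product:
    name: str
    category: Tuple[str, ...]  # hierarchical path, e.g. ("Clothing", "Shirts")
    facets: Dict[str, str] = field(default_factory=dict)


def refine(products, category, selections):
    """Return products under `category` that match every selected facet.

    `selections` maps a facet name to a set of accepted values. Values within
    one facet are OR-ed; different facets are AND-ed (assumptions of this sketch).
    """
    category = tuple(category)
    results = []
    for p in products:
        # Keep only products whose category path starts with the browsed category.
        if p.category[:len(category)] != category:
            continue
        if all(p.facets.get(facet) in values for facet, values in selections.items()):
            results.append(p)
    return results


catalog = [
    Product("Oxford shirt", ("Clothing", "Shirts"), {"color": "blue", "size": "M"}),
    Product("Flannel shirt", ("Clothing", "Shirts"), {"color": "red", "size": "M"}),
    Product("Polo shirt", ("Clothing", "Shirts"), {"color": "blue", "size": "L"}),
    Product("Chinos", ("Clothing", "Pants"), {"color": "blue", "size": "M"}),
]

# Browse to Clothing > Shirts, then tick two colors and one size.
matches = refine(catalog, ("Clothing", "Shirts"), {"color": {"blue", "red"}, "size": {"M"}})
print([p.name for p in matches])
```

Keeping facet values in a flat dictionary per product mirrors the flat list of facet values described above, while the category path carries the hierarchy.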

Typically retailers are more concerned about the selection and implementation of technology than about the design of the taxonomy. After all, a hierarchical taxonomy of products would appear simple to design, and even the facets are not too challenging to develop, especially with lots of competitor e-commerce websites to analyze and compare. However, my experience working as a taxonomy consultant on e-commerce taxonomies has led me to realize that creating and editing e-commerce taxonomies is not as easy as it seems.

My conference presentation discusses seven challenges:

1. Distinguishing a subcategory from a facet value
At the higher levels, categories are obvious. Standard facets (size, color, price range, etc.) are also obvious. But the distinction between the most specific subcategories and specialized facets can get blurred. Can “type” be a facet? Is a “plaid shirt” a subcategory of shirts, or is plaid a value in a “pattern/type” facet? Are gas and electric stoves subcategories of stoves, or is “energy source” a facet of stoves? Factors to consider in making these decisions include user perceptions and the number of existing levels of subcategories and numbers of facets.

2. Different categorization options
There are often product categories that are difficult to classify. For example, do video games belong in “Toys and Games” or in “Electronics”? Does Home Theater belong in the “Television/Video” or the “Audio/Stereo” department? Having the category in both locations, as the polyhierarchy feature of a taxonomy, is possible. But a breadcrumb trail might follow only a single path, not both, and too many polyhierarchies can be confusing to users.

3. Related items
E-commerce taxonomies are hierarchical and generally do not have associative/non-hierarchical relationships between categories. Such relationships are not needed in most cases, but accessories to products and related services (installation, repair, etc.) are clearly related to specific product categories. Taxonomic standards might have to be ignored if making such categories narrower to their main product is the only option. But other, creative display options might be possible.

4. Sort order options
Generally a long list of terms, over a dozen, is easier to scan if alphabetized, whereas a short list of under a dozen terms is better suited to some other prescribed “logical” order. Sort order inconsistency will result, however, if the number of subcategories fluctuates. Determining the “logical” order is also a challenge and often centers around what is most important or popular.

5. Competitor website comparisons
For e-commerce taxonomies (unlike enterprise taxonomies), it’s great to be able to compare with competitors. However, often a retailer is somewhat unique, and no single competitor has exactly the same product categories. Furthermore, it’s important to distinguish category and content comparison from design comparison. Design may be an extension of a retailer’s overall unique brand graphic design.

6. Web site vs. physical store organization
Physical (“brick and mortar”) stores have their own organization for products that might not work online, but there may be pressure to mimic physical store organization to provide a consistent user experience. While it may make sense to have the biggest sellers up front or at the top of the list, product size (a factor in physical store organization) should not necessarily be a factor in online organization.

7. Business needs vs. taxonomy best practices
Online merchants might want to make certain product categories more prominent, by changing the sort order, adding polyhierarchy locations, or even moving a subcategory up a level. It’s important to keep the integrity of the taxonomy intact, though, so that it remains intuitive for the customers to use.

In sum, product taxonomies are not as simple to create as might be expected. Taxonomy design may be under constraints, and business needs can challenge taxonomy standards. Creative solutions may be needed, and customer perspectives need to be considered through creating personas and/or through user testing.
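Two of the challenges above, polyhierarchy with a single breadcrumb trail (challenge 2) and list-length-dependent sort order (challenge 4), can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `PARENTS` table and the function names are hypothetical, the first listed parent is treated as the "primary" one for breadcrumbs, and the cutoff of 12 is just one reading of "over a dozen."

```python
# Each category lists its parents. The FIRST parent is treated as the primary
# one and is used for the single breadcrumb trail (an assumption of this sketch).
PARENTS = {
    "Toys and Games": [],
    "Electronics": [],
    "Video Games": ["Electronics", "Toys and Games"],  # polyhierarchy: two parents
}


def children(parent):
    """All categories that list `parent` among their parents."""
    return [c for c, ps in PARENTS.items() if parent in ps]


def breadcrumb(category):
    """Follow primary parents only, so the trail is a single path."""
    trail = [category]
    while PARENTS[trail[0]]:
        trail.insert(0, PARENTS[trail[0]][0])
    return " > ".join(trail)


def display_order(terms, logical_order, threshold=12):
    """Alphabetize lists longer than `threshold`; otherwise use the prescribed order."""
    if len(terms) > threshold:
        return sorted(terms, key=str.lower)
    rank = {t: i for i, t in enumerate(logical_order)}
    # Terms missing from the prescribed order fall to the end.
    return sorted(terms, key=lambda t: rank.get(t, len(rank)))


print(children("Toys and Games"))   # the category is browsable from both departments
print(children("Electronics"))
print(breadcrumb("Video Games"))    # but the breadcrumb shows only one path
print(display_order(["Large", "Small", "Medium"], ["Small", "Medium", "Large"]))
```

The sort-order inconsistency mentioned in challenge 4 shows up here directly: if a list grows from 12 to 13 subcategories, `display_order` silently switches from the prescribed order to alphabetical.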

Thursday, November 1, 2012

From Taxonomies to Ontologies: Customized and Semantic Relationships

At this year’s Taxonomy Boot Camp conference, I was invited to present on the panel giving 5-minute “Pecha Kucha” lightning talks, for which this year’s theme was ontology. Just as there are different understandings and usages of “taxonomy,” so are there different understandings and usages of “ontology.” You can come to it from different angles. If you come to ontologies from the experience of taxonomies and the field of information management, then, most simply, an ontology is a more complex type of taxonomy that contains richer information.

In my brief presentation, “From Accidental Taxonomist to Accidental Ontologist,” I summed up the differences between taxonomies and ontologies as follows:

Relationships: Taxonomies have hierarchical relationships and sometimes a simple “related term” associative relationship, but ontologies have semantic relationships, which are custom-created.
Term Attributes: Taxonomies generally don’t have term attributes, but ontologies do.
Term Classes: Taxonomies generally don’t have classes for terms, unless you consider facets as classes, but ontologies do.
Guidelines/Standards: Taxonomies should follow the ANSI/NISO Z39.19 (2005) or ISO 25964, whereas ontologies are expected to follow the Web Ontology Language (OWL) guidelines and make use of the Resource Description Framework (RDF).
Purposes: Taxonomies support indexing/tagging, categorization, and/or classification of content, and in turn information findability and retrieval. The primary purpose of an ontology is to describe a domain of knowledge, and support of indexing/tagging, categorization, classification, findability, and retrieval can be secondary.
Tools: Some software supports the creation of only taxonomies, some software is for ontologies, and some software can do both quite well. Additionally, some taxonomy/thesaurus software can support most, if not all, features of ontologies.

Coming at ontologies from taxonomies, the biggest distinguishing feature of ontologies is the semantic nature of the relationships.

In a taxonomy or thesaurus, you may have generic relationships, such as:

     Automobile industry RT (related term) Cars, and
     Cars RT (related term) Automobile industry

     Ford Motor Company NT (narrower term) Lincoln Division, and
     Lincoln Division BT (broader term) Ford Motor Company

In an ontology, you may have customized, semantic relationships, such as:

     Automobile industry MAN (manufactures) Cars, and
     Cars IND (manufactured by the industry) Automobile industry

     Ford Motor Company SUB (has subsidiary or division) Lincoln Division, and
     Lincoln Division PAR (has parent) Ford Motor Company

If you can customize the relationships, does this change a taxonomy into an ontology? No. Customized relationships are just one feature of an ontology, although perhaps the most important feature. In my online course on taxonomies, although I don’t teach how to create ontologies, I do provide a lesson on customized/semantic relationships. It is often desirable to create a more complex taxonomy without necessarily meeting all the requirements of an ontology.

Furthermore, a customized relationship might not be fully semantic. In the example above, the second set of relationships are customized, because they are designated by the ontologist for the particular case. The relationships are also “semantic” because they contain specific meaning. (Semantic means “has meaning.”) It is possible to customize relationships while still not making them fully semantic. You may decide to simply rename the standard relationships for your particular application and audience. For example, you might rename broader term (BT)/narrower term (NT) as “parent/child,” or rename Related Term as “see also.” If your taxonomy/thesaurus software is more sophisticated, it will allow you to specify any number of customized relationships, and thus you can add more nuances of meaning.

A key component of truly semantic relationships as expected in ontologies is the ability to create directional relationships that are distinct in each direction, with reciprocity. Most of these semantic relationships will be variants of “related term” (RT), rather than variants of the hierarchical relationship. The generic RT relationship, however, is singularly bidirectional. If you simply customized it by renaming it, it would have to be the same in both directions, such as “has partner.” To create a semantic relationship pair, such as MAN (manufactures) and IND (manufactured by the industry), you need a tool that supports ontological relationships and not just “customized” relationships.
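To make the idea of reciprocal, directional relationship pairs concrete, here is a small Python sketch. It is not modeled on any particular taxonomy or ontology tool; the `RECIPROCALS` table and the `Vocabulary` class are hypothetical names chosen for illustration. The relationship codes (BT/NT, RT, MAN/IND, SUB/PAR) are the ones used in the examples above.

```python
# Each relationship type names its reciprocal. A symmetric type, like the
# generic RT, is its own reciprocal.
RECIPROCALS = {
    "BT": "NT", "NT": "BT",      # standard hierarchical pair
    "RT": "RT",                  # generic associative: same in both directions
    "MAN": "IND", "IND": "MAN",  # manufactures / manufactured by the industry
    "SUB": "PAR", "PAR": "SUB",  # has subsidiary or division / has parent
}


class Vocabulary:
    """Toy store of (subject, relationship, object) triples."""

    def __init__(self):
        self.relations = set()

    def relate(self, subject, rel, obj):
        """Store a relationship and its reciprocal in one step."""
        self.relations.add((subject, rel, obj))
        self.relations.add((obj, RECIPROCALS[rel], subject))

    def related(self, term):
        """List (relationship, other term) pairs for one term."""
        return sorted((rel, obj) for subj, rel, obj in self.relations if subj == term)


vocab = Vocabulary()
vocab.relate("Automobile industry", "MAN", "Cars")
vocab.relate("Ford Motor Company", "SUB", "Lincoln Division")

print(vocab.related("Cars"))              # sees the IND side of the MAN/IND pair
print(vocab.related("Lincoln Division"))  # sees the PAR side of the SUB/PAR pair
```

In this sketch, a tool that only allows renaming RT corresponds to a table in which every custom type maps to itself, which is why the relationship then reads the same in both directions.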

If your tool supports customized relationships but not the ability to create distinct pairs of directional relationships that are associative rather than hierarchical, the results can still be very useful. You may have a “near ontology” if not a strictly defined ontology. For example, you could rename the singular “related term” (RT) as “Manufacturer-Product” with an abbreviation such as MAN-PRO (credit to Alice Redmond-Neal of Access Innovations, Inc. for the example). Thus, the relationship is the same in either direction:

     Automobile industry MAN-PRO Cars, and
     Cars MAN-PRO Automobile industry

It is not completely semantic, with the directional details missing, but this may be good enough for your purposes. After all, it should be obvious which is the manufacturer and which is the product. Therefore, taxonomy/thesaurus software that provides most, if not all, features of an ontology may be sufficient, too.

What matters is serving your needs. Rather than calling it an “ontology” when it does not meet all the definitions of an ontology (and causing confusion or disagreement), it may be safer to say your sophisticated taxonomy “has features of an ontology.”

Friday, October 19, 2012

Taxonomies for Multiple Kinds of Users

This week, I again attended the annual Taxonomy Boot Camp conference held in Washington, DC, the only conference dedicated to taxonomies. The main theme I came away with this year is that taxonomies serve diverse audiences and users.

The theme of different users was best exemplified in a session dedicated to comparing taxonomies for internal and external use. Representatives from Johnson Space Center (JSC), Astra-Zeneca, the Associated Press (AP), and Sears gave examples in the panel “Representing Internal and External Taxonomy Requirements in a Taxonomy Model,” moderated by Gary Carlson. While still remaining connected, internal and external taxonomies not only have different terms for the same concept but they may also have different structure. According to Joel Summerlin of AP, internal taxonomies can be more specialized and complex than external taxonomies, and internal taxonomies need to support greater precision in retrieval results, whereas external taxonomies need to support greater recall.
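For readers less familiar with the two retrieval measures just mentioned, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that are retrieved. The following Python sketch uses the standard definitions; the document IDs and the two result sets are made up purely for illustration and do not come from any of the presentations.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


relevant = {"d1", "d2", "d3", "d4"}

# A narrow, specialized term retrieves few items, all of them relevant.
narrow = {"d1", "d2"}
# A broad term retrieves everything relevant, plus some noise.
broad = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

print(precision_recall(narrow, relevant))  # high precision, lower recall
print(precision_recall(broad, relevant))   # lower precision, high recall
```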

Even within either the internal or external users of a taxonomy, there is great variety. But unlike the situation of internal and external taxonomies, where you can have different taxonomies linked together, you will have a single taxonomy serving a diverse audience. The use of taxonomy features of polyhierarchy and nonpreferred (aka synonym) terms can help diverse users with different vocabularies, perspectives, and approaches find their way to the desired content.
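As a simple illustration of how nonpreferred terms lead different users to the same concept, here is a minimal Python sketch. The term lists and the `resolve` function are hypothetical examples, not drawn from any taxonomy discussed at the conference.

```python
# Nonpreferred (synonym) terms each point to one preferred term.
# Keys are stored in lowercase so that lookups are case-insensitive.
NONPREFERRED = {
    "automobiles": "Cars",
    "autos": "Cars",
    "motor cars": "Cars",
    "tv sets": "Televisions",
}
PREFERRED = {"Cars", "Televisions"}


def resolve(user_term):
    """Map whatever the user typed to a preferred term, or None if unknown."""
    key = user_term.strip().lower()
    for term in PREFERRED:
        if term.lower() == key:
            return term
    return NONPREFERRED.get(key)


for typed in ["Autos", "motor cars", "Cars", "boats"]:
    print(typed, "->", resolve(typed))
```

Users with different vocabularies ("autos," "motor cars") all arrive at the same preferred term, and content only needs to be tagged with that one term.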

In the session on internal and external taxonomies, the diversity of internal users was mentioned by Sarah Berndt as a characteristic of JSC. In another session, Helen Clegg described the process of building an enterprise taxonomy at the consulting firm AT Kearney, which has employees in different countries and in different industry specialties. As for external users, Jenny Benevento of Sears described how the customers of its retail website range widely, from repeat shoppers of clothing to those making one-time purchases of engagement rings to those buying large appliances. From the audience, Paula McCoy of ProQuest commented on the importance of knowing, before planning the indexing, who the users are of its different database products.

Other sessions, such as “Taxonomy & Information Architecture,” also addressed the multiple uses and users of taxonomies. Panelist Gary Carlson explained how different personas are used in designing websites, and that the kinds of things that the user-persona seeks or needs can then become taxonomies or facets.

Overall in various sessions of the conference there was a great diversity of taxonomy types, and thus taxonomy users, described. These included:

Enterprise taxonomies for internal users, with a set of three presentations under the title of “Enterprise Taxonomies in Action”
Public web site taxonomies, as in the case study example of the Consumer Products Safety Commission and additional examples from the keynote.
Retail ecommerce taxonomies, as in the example of Sears and additional mentions of Target and REI in other presentations.
Taxonomies used for article indexing and then retrieval by library patrons of periodical/reference databases, as described in a presentation about ProQuest.

Not only may the same taxonomy be targeted at different users at once, but also different users over time. In the closing keynote, Patrick Lamb observed that taxonomies can further add value when we make them available for re-use.

Finally, the conference itself attracted a diverse audience: taxonomists, information architects, data warehouse managers, search specialists, knowledge managers, and others; those from corporations in all industries, government, and nonprofits; and those both new to and experienced with taxonomies. In fact, it’s rare that you would find such a diverse audience at a professional conference. They are united in their need to make information findable, and they understand the value of taxonomies to make that happen.

Tuesday, October 9, 2012

Text Analytics and Taxonomies

What does text analytics have to do with taxonomies? Not so much, I had previously assumed, other than serving a similar objective of information retrieval. After all, text analytics is known as a natural language processing technology designed to obtain meaning from text without the traditional process of indexing to a taxonomy. At the recent Text Analytics World conference in Boston October 3 and 4, however, I learned that text analytics is much more and that the ties between text analytics and taxonomies are greater than I assumed.

The concept of text analytics is used more broadly than I realized, and, as defined in the opening keynote given by conference chair Tom Reamy, encompasses:

Text mining, based on natural language processing, statistics, and machine learning
Entity extraction, semantic technology that enables “fact extraction”
Sentiment analysis, comprising various methods to look for positive and negative words
Auto-categorization, which is often rules-based

I was a presenter at this conference, and since I always talk about what I know, which is taxonomies, I endeavored to make a connection between taxonomies and text analytics. But to my surprise I was not the only one talking about taxonomies at Text Analytics World. Two other presentations featured “taxonomies” in their titles, thus forming with mine a half-afternoon “Text Analytics and Taxonomies” track. Furthermore, the subject of taxonomies was central to four other presentations and mentioned in a couple of others.

My presentation, “Taxonomies for Text Analytics and Auto-Indexing,” described how text analytics can be used with auto-categorization and taxonomies to achieve relatively high quality automated indexing results. Auto-categorization is a type of automated indexing that tends to make use of taxonomies, as categorization requires categories (taxonomy terms). Text analytics can be used as a technology to generate meaningful terms from texts, which in turn can be used to auto-categorize content against a pre-existing taxonomy. Auto-categorization typically involves technologies of either complex rules to match terms or algorithms and machine learning. In either case, the terms picked up in auto-categorization would be more meaningful if they were first extracted with text analytics technologies based on natural language processing.
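A toy version of rules-based auto-categorization might look like the following Python sketch. It is deliberately simplistic: the `RULES` table, its phrases, and the `categorize` function are hypothetical, and the matching is plain whole-phrase lookup rather than the natural language processing described above, so it would miss plurals and other variants that a text analytics layer would normalize first.

```python
import re

# Hypothetical taxonomy terms, each with a few simple matching phrases (rules).
RULES = {
    "Automobile industry": ["automaker", "car manufacturer", "auto industry"],
    "Electric vehicles": ["electric vehicle", "electric car", "ev battery"],
    "Labor relations": ["union", "strike", "collective bargaining"],
}


def categorize(text, rules=RULES, min_hits=1):
    """Return taxonomy terms whose rule phrases occur in the text, best first."""
    lowered = text.lower()
    scores = {}
    for term, phrases in rules.items():
        # Count whole-word occurrences of each phrase.
        hits = sum(
            len(re.findall(r"\b" + re.escape(p) + r"\b", lowered)) for p in phrases
        )
        if hits >= min_hits:
            scores[term] = hits
    # Highest hit count first; ties broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))


doc = (
    "The automaker said its electric car plant reopened after the strike ended. "
    "The union approved the deal."
)
print(categorize(doc))
```

Raising `min_hits` trades recall for precision: a document then needs several rule matches before it is assigned a term.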

Another presentation looked at a different side to the relationship between taxonomies and text analytics. Text analytics is also used as a means to build taxonomies in the first place, by providing suggested terms that a taxonomist can then edit. Edee Edwards and Rena Morse of Silverchair Information Systems presented a case study on using text analytics to generate terms for taxonomy development. It required multiple iterations and refinements.

Other presenters on the subject of taxonomies and text analytics included the following:

Heather Edwards of the Associated Press explained how AP classifies the news using a custom-built taxonomy and rule-based auto-classification system.
Evelyn Kent of MCT SmartContent also presented how news items are classified using a “context-based language” (taxonomy), and even demonstrated how the taxonomy is managed in the taxonomy tool (SmartLogic Semaphore Ontology Manager).
Anna Divoli of Pingar presented survey results of taxonomy user interface preferences from cases that involved automatically generated hierarchical and faceted taxonomies.
Alyona Medelyan also of Pingar discussed “controlled indexing” in her case study, which featured results of comparing human versus automated indexing (using machine learning and training sets) using the same taxonomy (the Agrovoc agriculture thesaurus of the FAO).
Sarah Ann Berndt of the Johnson Space Center spoke about “automatic generation of semantic markup” in a presentation that turned out to be mostly about the application of a taxonomy.

The subject of taxonomies had also come up in the opening keynote. Tom Reamy described three themes in text analytics: big data, sentiment analysis of social media, and enterprise text analytics. In all three areas he mentioned taxonomies. In the area of text mining and big data, text analytics can serve as a means of semi-automated taxonomy development. In sentiment analysis, new kinds of taxonomies are being developed for emotional sentiments. In enterprise search, text analytics bridges the gap between taxonomies and documents.

Even if text analytics and taxonomies are combined in different ways, what is common is that combining techniques, tools, and technologies in more challenging situations achieves better results. Techniques, tools, and technologies in this field do not have to compete, but can complement each other.

Wednesday, September 12, 2012

Mentoring Taxonomist Program

In my last blog post, I discussed the need for mentoring taxonomists and mentioned that I had volunteered to lead the new mentoring committee of the Taxonomy Division of SLA (Special Libraries Association) and establish its mentoring program (http://taxonomy.sla.org/get-involved/mentor). While some of the mentoring activities are available to members only, other mentoring services can involve anyone, so I will describe them here.

Frequently Asked Question Resources

In many cases, those new to taxonomies simply have questions about the taxonomy field. Therefore, the initial and primary activity of the SLA Taxonomy Division’s Mentoring Committee has been to develop a detailed list of Frequently Asked Questions (FAQs) and answers, which total 35 to date.

The issue as to whether the answers should be a service to Taxonomy Division members only or to the public was resolved by having short answers of 1-3 sentences for the public, and longer answers of 150 – 250 words on separate web pages accessible to members only with their login. (Members also have the ability to submit additional questions to the FAQs.) The FAQs with the short answers are available under the Mentoring section of the public website: http://taxonomy.sla.org/get-involved/mentor/taxonomy-faqs

Mentor and Protégé Directories

Connecting aspiring taxonomists (whom we are calling protégés) with experienced taxonomists, who volunteer to be mentors, is another objective. While it is neither practical nor feasible for the Taxonomy Division to provide direct individual mentoring services or to match mentors to protégés, it can act as a clearinghouse in providing directories on its web site of both willing mentors and interested protégés. In the past few months, I have set up both a Mentor Directory and a Protégé Directory, and it is not required that people be listed in one directory in order to contact those listed in the other directory.

Mentor Directory

Access to mentors is, as expected, a membership benefit. Thus, the Mentor Directory is accessible by membership login only. Mentors are SLA Taxonomy Division members who have considerable experience in some aspect of taxonomies and are willing to volunteer limited time in mentoring for the benefit of their professional growth and prestige. Mentors listed in the Mentor Directory:

should be available for answering specific individual questions about the taxonomy field, education/training, and job prospects, which the general FAQs cannot answer.
probably could help out a protégé who brings his/her own project
most likely do not have projects to offer in an internship type of relationship (but might)

Protégé Directory

Taxonomy Division members who have had at least some training or exposure to taxonomies and would like to gain the benefits of mentoring may list their names in the Protégé Directory, which is displayed on the website:
http://taxonomy.sla.org/get-involved/mentor/directory-of-proteges

Protégés may seek a mentoring relationship for taxonomy projects in either of the following two scenarios:

The protégé is looking for a temporary internship or training arrangement, expecting lower than average pay or no pay in exchange for (1) the opportunity to work without prior experience, (2) useful feedback from the supervisor-mentor, and (3) the ability to use the supervisor-mentor as a future work reference.
The protégé has a pending or existing taxonomy project (whether at work, a freelance project, or a volunteer project) and is seeking advice on aspects of the taxonomy design and/or feedback on initial taxonomy work.

Responses to either of these two kinds of mentoring possibilities are still expected to be relatively low, so the Taxonomy Division is permitting nonmembers who can mentor to contact listed protégés. In the case of the first scenario in particular, many qualified taxonomists who are willing to mentor simply don’t have suitable projects or company legal permission to bring on temporary interns or subcontractors at below-market rates. Non-profit organizations, though, are more likely to have arrangements for volunteers.

Therefore, if you are looking for a taxonomist intern whom you are willing to mentor, check out the Protégé Directory. If you are looking to be mentored, then join SLA and its Taxonomy Division and list yourself in the directory.

Wednesday, August 22, 2012

Mentoring Taxonomists: The Need

As explained in Chapter 2 of my book on an introduction to taxonomy creation, The Accidental Taxonomist, the majority of taxonomists did not intend to be taxonomists, and they come to the field by accident from various backgrounds. What this means is that most people who find they want to or need to do work as taxonomists are already into their careers and are no longer students with access to full courses. Workshops through conferences or continuing education programs (such as the workshop I teach) are certainly very helpful, but they are of limited duration and not always available. Thus, on-the-job training or mentoring is the most likely way that many people learn how to design and create taxonomies. Just look at the LinkedIn resumes of many practicing taxonomists, and you will see that the education of the majority of them was not in library and information science but in some other field, and that through a series of jobs somehow along the way they learned taxonomy skills on the job.

Another reason why on-the-job training or mentoring is important in the taxonomy field is that taxonomy work is often quite specialized for a particular application. Taxonomies for website navigation, for ecommerce, for supporting an auto-categorization tool, for supporting human indexers, for digital asset management metadata, or for content management systems are not the same and have nuanced differences in their design aside from any subject matter differences. Taxonomy “standards” are actually just guidelines which allow flexibility. Thus, on-the-job training can be more relevant than the theoretical study of taxonomies or than a continuing education workshop that must take a generic approach to accommodate diverse students.

Not everyone is fortunate enough to have on-the-job training or senior colleagues or supervisors who can act as mentors. I had this opportunity, though, and in retrospect, it was the defining point in my career: the period of about three years when I worked at what was then Information Access Company, first in collaboration with and then as a new member of the vocabulary management department. I got the vocabulary manager (aka taxonomist) position as an inside hire familiar with the controlled vocabularies as an indexer, but I subsequently learned best practices for taxonomy editing and management from my senior colleagues, my supervisor, and also from a visiting consultant.

Due to the nature of the field, though, it is not unusual for the new taxonomist to be the sole person responsible for taxonomies in an organization and thus lack the support of coworkers with any experience in taxonomies. The new taxonomist must then look elsewhere for mentoring support. Online discussion groups can provide some support in answering simple questions, as long as the assistance does not require anyone else to actually look at the work. A hired taxonomy consultant can also serve as an excellent mentor if you structure the relationship in that way, although this may not be in your budget. Another place to turn for mentoring assistance could be professional associations.

Thus, I accepted when asked last year if I would volunteer to lead the new mentoring committee of the Taxonomy Division of SLA (Special Libraries Association), a professional membership association to which I belong. Saying that I support mentoring and actually trying to create and foster a mentoring program, however, are quite different matters. The Taxonomy Division chair at the time suggested creating a list of FAQs and answers on the member website as the primary means of mentoring members. While FAQs are a useful resource, this is not what I had in mind for mentoring. Connecting aspiring taxonomists (protégés) with experienced taxonomists who volunteer to be mentors would be ideal. Whether this is an achievable ideal or not remains to be seen. For now, I have set up the structure of the mentoring programs, as described on the SLA Taxonomy Division website. Now, we just need to encourage participation. My next blog post will describe the program in more detail.

Saturday, July 28, 2012

The Accidental Taxonomy Consultant

It’s well known that most taxonomists become taxonomists by accident, as the title of my book attests. As I look back on my career, I see this progression continuing one step further in accidentally becoming a taxonomy consultant.

Not all consultants are accidental, though. Bright college graduates in the social sciences with strong analytical skills are often attracted to entry level jobs at consulting firms. They then pick up technical consulting skills by practice over time, and these could even involve taxonomy work. As such, they are not accidental consultants, but they may become accidental taxonomy consultants.

Those who are already taxonomists, like myself, often end up as consultants, because that’s where they find the work. Full-time taxonomist jobs are still relatively rare and are often not in one’s geographical location. So, if an experienced taxonomist loses a job due to a layoff or relocation, and looks around and cannot find another conveniently located taxonomist job, consulting becomes an option. Employers of full-time taxonomists tend to be limited to either certain industries (publishing, media, ecommerce, etc.) or to very large companies in any industry with large internal content management needs, but then the taxonomist job is only at their headquarters location. However, companies of all industries and various medium to large sizes have taxonomy needs and can often afford a taxonomy consultant on a temporary project if not a full-time staff member. Thus, taxonomy consultants are in greater demand than are full-time employed taxonomists.

In seeking to contract a taxonomy consultant, you may wonder whether it is better to hire a consultant-turned-taxonomist or a taxonomist-turned-consultant. If you hire a skilled taxonomist who is less experienced in consulting, you ought to get a good taxonomy, although the process might not be that smooth. More likely, though, the experienced taxonomist who is inexperienced in consulting will not make as good a first impression or sell the services as well as a professional consultant. The professional consultant-turned-taxonomist will provide a better project experience, although the end-result taxonomy may not be as good. If you can plan and manage the project yourself, then it is the experienced taxonomist you want, but if you want the entire project managed by a consultant, you need a good consultant.

You might not have to compromise, though. A senior enough consultant could be sufficiently skilled in both consulting and taxonomies that the career sequence does not matter. If you can afford to hire a firm or partnership, or even a consultant with subcontractors, you may not need to make the choice of experience either, because you can expect to get some of each on the consultant team serving you. That’s why you should look at the resumes of each member of a consulting team, to ensure that at least one member has very solid taxonomy experience, while at least one other member has considerable consulting and project management experience.

Among the things I have learned about consulting is that it helps to have standard consulting processes and procedures, including standard questions that the consultant should ask the client at the very beginning of a project to clarify the scope and understand the context. Consulting firms may additionally have standard deliverables, reports, etc. But in the particular field of taxonomy consulting, the variables are too great, and standard deliverables rarely fit.

There are a lot of books on consulting, but none about taxonomy consulting. When I came across a potential title, Information Consulting: Guide to Good Practice (Chandos Publishing, 2011), I found that even this book addressed consulting more generally, and when it occasionally discussed “information consulting” it was more about the work of independent research librarians. So, accidental taxonomy consultants lack written guidance that is just for them.

This is my story. I became a taxonomist by accident. Then after getting laid off, more than once, I became a taxonomy consultant by accident. Then I joined a consulting company of intentional consultants, some turned taxonomy-consultant by accident, but I did not feel I fit in with them or their choice of projects, since I was a taxonomist first. So, I recently chose to go on my own again as an independent consultant or partnering with another on a case-by-case basis.

Saturday, July 7, 2012

Deviating from Taxonomy Standards

In my last blog post, I suggested that enterprise taxonomies need not follow the standards for controlled vocabularies and thesauri (ANSI/NISO Z39.19 guidelines and ISO 25964-1) to the same extent as “traditional” discipline taxonomies and thesauri. I say this cautiously, though. Standards should not be ignored for any taxonomy, but rather followed in general, and any deviations made should be for good reason. Enterprise taxonomies (taxonomies custom-designed for the content and users of a specific enterprise, and for the entire enterprise) and also ecommerce taxonomies (taxonomies of products for sale) often have good reasons to deviate from standards in certain areas.

Hierarchical Relationships
An important part of the taxonomy standards is the set of criteria for creating hierarchical relationships. Hierarchical relationships should be one of three types: generic-specific, generic-instance, or whole-part. Any other relationship among posted/displayed terms is not hierarchical, but rather associative. A “good reason” to relate terms hierarchically even when they do not exactly meet the criteria is when the pair of terms is clearly related, but the taxonomy does not include any associative relationships. Enterprise and ecommerce taxonomies often are simple hierarchical taxonomies and do not support the associative relationships common in standard thesauri. For example, the following two hierarchies are not correct by the standards, but the first may be acceptable in an enterprise taxonomy and the second in an ecommerce taxonomy:

Information Technology
> Telecommunications
> > Cell phones

Cameras
> Camera accessories
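
The three hierarchical relationship types can also be recorded explicitly, so that deliberate deviations are easy to list and review later. What follows is only a minimal sketch: the `BroaderNarrower` class, the `basis` field, the "associative" labels (which reflect one reading of the two examples above), and the "Digital cameras" term are all assumptions made for illustration, not part of any standard or taxonomy tool.

```python
from dataclasses import dataclass

# The three relationship types the standards accept as hierarchical.
STANDARD_TYPES = {"generic-specific", "generic-instance", "whole-part"}


@dataclass(frozen=True)
class BroaderNarrower:
    """One broader/narrower link, with the basis on which it was made."""
    broader: str
    narrower: str
    basis: str  # hypothetical field: why the two terms were related


# Links from the examples above, labeled by assumption, plus one
# hypothetical link ("Digital cameras") that does meet the criteria.
links = [
    BroaderNarrower("Telecommunications", "Cell phones", "associative"),
    BroaderNarrower("Cameras", "Camera accessories", "associative"),
    BroaderNarrower("Cameras", "Digital cameras", "generic-specific"),
]


def deviations(all_links):
    """Return links displayed as hierarchical whose basis is not a standard type."""
    return [link for link in all_links if link.basis not in STANDARD_TYPES]


for link in deviations(links):
    print(f"{link.broader} > {link.narrower} (basis: {link.basis})")
```

Keeping the basis alongside each link documents that a deviation was a conscious choice rather than an oversight.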

Plural/Singular
The standard is to use plural for terms that are countable nouns. The idea is that when users select a term they will find multiple documents, records, or digital assets (in plural) indexed with or categorized by the term. Enterprise and ecommerce taxonomies, however, tend to be composed of multiple taxonomy facets, whereby the user selects terms from a combination of facets. Taxonomy terms within facets then appear to the user to be filters, scopes, aspects, or attributes, rather than simply a category of plural objects. For example, a document type facet might have terms describing the type of document: Article, Report, Form, Application, Interview, etc., all in the singular to answer the question “what kind of document.” The names of the facets themselves may also be in singular, rather than plural, so as to “limit by” a facet, such as: Document type, Location, Topic, Department, etc.

Compound Terms
The standards present criteria to consider in retaining or breaking apart compound terms. For example, “A compound term should be split when its focus refers to a property or part, and its modifier represents the whole or possessor of that property or part.” (ANSI/NISO Z39.19-2005 section 7.6.2.1) While such guidelines are useful and certainly within the scope of taxonomy design, the highly customized nature of enterprise or ecommerce taxonomies obviates the need to follow such guidelines for compound terms. ANSI/NISO gives the example of aircraft + engines rather than aircraft engines, but aircraft engines, or other such compound terms, would be perfectly acceptable in an enterprise or ecommerce taxonomy. It is worth noting that both the ANSI/NISO and ISO standards state that these criteria are just guidelines and do not have to be strictly followed.

An enterprise or ecommerce taxonomy can be a challenge to create. That adherence to taxonomy standards may be less strict for a corporate or retail taxonomy than it is for a subject/discipline taxonomy does not mean that it is easier to design or that untrained taxonomists can design it. Only with a good understanding of the standards would one know when and where it is acceptable not to adhere to a specific guideline.

Sunday, June 24, 2012

Enterprise Taxonomies vs. Traditional Taxonomies

A book that I have been reading (Structures for Organizing Knowledge: Exploring Taxonomies, Ontologies, and Other Schemas, by June Abbas, 2010) got me thinking about the comparison between corporate/enterprise taxonomies and other “traditional taxonomies”. I found it intriguing that Abbas presents corporate or “professional” taxonomies in the same chapter as personal information structures. Thus, a corporate taxonomy could more aptly be an extension of a personal knowledge organization system, rather than the customization of a standard taxonomy or controlled vocabulary. So, how are corporate taxonomies or enterprise taxonomies (corporate taxonomies that are specifically for use enterprise-wide) different from traditional (library science type) taxonomies or thesauri?

There are, in fact, multiple ways in which a corporate or enterprise taxonomy differs from the traditional taxonomies or controlled vocabularies used in libraries or in particular subject disciplines. Enterprise taxonomies in particular are:

1. Relatively small in size

2. Multifaceted

3. Customized to an enterprise’s content

4. Customized to an enterprise’s users

5. Relatively informal

Size
An enterprise taxonomy tends to be relatively small in size with respect to the number of terms and depth of term levels. The size will depend largely on the complexity of an enterprise’s business (number of lines of business, for example), but a range of 1000-2000 terms in a taxonomy for an enterprise that has a single line of business is typical. An organization may certainly supplement this enterprise taxonomy with additional subject-specialized controlled vocabularies, particularly in the areas of research & development or product catalogs.

Faceted Nature
An enterprise taxonomy deals with a variety of content which is differentiated in more than one way, not just by subject matter. Content is typically organized and searched not merely for what it is “about” but also what its purpose is, what its source is, what type of content it is, and perhaps also for what market or customer type it is relevant. Thus, an enterprise taxonomy is usually organized into several facets to support faceted search or faceted browse (see my April 2012 post), which include: document type, file format, department or functional area, line of business or product/service category, geographical region, and market segment, in addition to a topical facet.

Content Customized
A corporate or enterprise taxonomy should be highly customized to an enterprise’s own unique content. While two companies in the same industry may have nearly identical products and services, their customer or member base could vary slightly, and they probably do not have identical organizational structures, procedures, and workflows. Thus, no two companies or organizations would have identical content. Organizations also differ in the quantity of different kinds of content they own and in the importance they assign to different types of content.

User Customized

Just as important as content-customization is user-customization. Corporate or enterprise taxonomies are designed to help an organization’s users (employees, and often also partners and customers) find content. Users include both those who upload/publish content to the intranet or content management system, often manually tagging it, and users who are looking for content. These are sometimes the same people and sometimes not. Also in consideration of the users, there may be a workflow or business rule aspect that is taken into consideration. Thus, the process of designing an enterprise or corporate taxonomy involves gathering input from users, via interviews and workshops. For this reason, the author Abbas has combined corporate taxonomies into the same chapter as personal taxonomies, because they are both highly user-centered.

Informal

Traditional discipline taxonomies (such as for living organisms), thesauri, book cataloging and classification systems follow industry standards for their design and construction, which can be quite rigid and formal. For general-purpose controlled vocabularies, there are the ANSI/NISO Z39.19 guidelines and ISO 25964-1 standard (see my March 2012 post), which allow more flexibility than library cataloging rules. The design of corporate or enterprise taxonomies should adhere to ANSI/NISO or ISO standards at a high level, but in practice, other practicalities and user needs and expectations should take precedence over a strict following of every detail of the standards.

Monday, May 28, 2012

Digital Asset Management and Taxonomies

Earlier this month I attended a conference on digital asset management (DAM) for the first time: Henry Stewart DAM in New York, May 10-11. It revealed to me that the field of digital asset management is definitely an area where taxonomies are being applied and could be even more extensively utilized.

“Digital assets” refers to digitized content generally of images, video, and sound recordings, but could also be the copyrighted text of publishers. As one speaker mentioned, digital assets are the intellectual property of certain enterprises, and hence the designation “assets.” The typical industries concerned with DAM are publishers, broadcasters, advertising (creative) agencies, and other media companies, which manage vast collections of media files. Additionally, large enterprises in any industry whose corporate communications departments manage sizeable collections of image or multimedia files are also concerned with DAM. The New York venue of this conference drew heavily on representatives of local media and advertising industries, but the annual fall venue of the same conference in Chicago, I am told, has a more diversified participation. The field is additionally defined and driven by the vendors of digital asset management software products.

DAM is also a growing field. The 2012 Henry Stewart DAM conference in New York, its ninth year, drew an attendance of approximately 500, up from 400 the previous year. Last year, a new professional association was founded, the Digital Asset Management Foundation. A new quarterly journal from Henry Stewart Publications, Journal of Digital Media Management, just published its first issue this month. Also this month, the DAM Foundation and independent analyst firm, The Real Story Group, released a DAM Maturity Model, which provides a structured framework to address DAM implementation challenges.

As to where taxonomies fit into DAM, it’s not difficult to see. Digital assets tend to be structured content with various metadata fields (subject, purpose, format, location, copyright), which DAM software supports. Taxonomies (or more correctly, any controlled vocabularies) enable the consistent application of descriptive metadata. DAM software supports the inclusion of controlled vocabularies, but the tools and especially the know-how to build the best controlled vocabularies/taxonomies are often lacking. Meanwhile, standard text search does not work on the non-text content that is typical of digital assets, so tagging and controlled vocabularies are all the more important.

DAM experts and consultants are not necessarily experts in taxonomies, and taxonomy experts may not be familiar with DAMs, so there is some learning for all of us. DAM systems, like other content management systems, often need to be configured, integrated, and customized for a specific enterprise’s use, with expertise and time spent first on system integration, pushing taxonomy design out to perhaps only an afterthought.

Taxonomies have various applications. I have been involved in taxonomies that tend to be either (1) external facing, to allow customers or clients to search for content published by an organization, whether for research or for e-commerce, or (2) internal, as an enterprise or business taxonomy, to allow employees to find content within an intranet or enterprise content management system. A digital asset management system can manage content for either internal or external users, or often both at once. As such, designing DAM taxonomies often needs to take into consideration more varied users of the content. This is certainly an exciting growth area for taxonomies, and I hope to be more involved in DAM taxonomy projects in the future.

Thursday, April 12, 2012

Faceted Search vs. Faceted Browse

If you have considered different kinds of taxonomies, you have undoubtedly come across the faceted type. You can remember what a facet is by thinking of “face,” as in a multi-faceted diamond. Other names for facet include dimension, aspect, or attribute. It could be the set of characteristics that describe a product (category, size, color, price, intended user, etc.), an image (thing, persons, location, occasion, etc.), or a document (document type, topic, author, source, etc.). In a business or enterprise taxonomy, facets for content management may include content type, product or service line, department or function, and topic. Named entities, such as person names, company names, agency names, and names of laws might also each be a facet. Facets allow users to limit, restrict, or filter results by chosen criteria, one from each facet, that are combined in any order.

Are “faceted browse” and “faceted search” the same? These designations are often used interchangeably, and until recently I had not considered a difference, preferring to use the terminology of my client. Yet “browse” and “search” are clearly not the same thing. To browse is to skim or scan a displayed list of taxonomy terms, whether arranged alphabetically, hierarchically, or a combination. To search is to enter search terms into a search box (which may then be matched against a controlled vocabulary for more accurate results). The implementations of facets in a user interface vary greatly, so perhaps the different designations of “faceted browse” and “faceted search” should reflect these different implementations.

One implementation of facets is to allow the user to dynamically restrict, filter, or limit a data set, based on selecting values from each of multiple facets that are displayed, typically in the left-hand margin, while references to the data or content are displayed in the main screen area. Under each named facet are displayed the names of values (taxonomy terms) within the facet. Facets may need to be expanded to display all values under each, or there may be scroll bars of terms. This implementation of facets can be considered “browse” because the user browses the displayed facets and the displayed terms within each facet.
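
The filtering behavior just described can be sketched in a few lines. This is only an illustration under stated assumptions: the sample items, the facet names, and the functions `apply_facets` and `facet_counts` are hypothetical and do not come from any particular search or content management product.

```python
from collections import Counter

# Hypothetical content items, each tagged with one term per facet.
FACETS = ["Document type", "Department", "Location"]
items = [
    {"title": "Q1 sales summary", "Document type": "Report",
     "Department": "Sales", "Location": "Boston"},
    {"title": "Expense claim", "Document type": "Form",
     "Department": "Finance", "Location": "Boston"},
    {"title": "Q1 budget review", "Document type": "Report",
     "Department": "Finance", "Location": "Chicago"},
    {"title": "Customer interview notes", "Document type": "Interview",
     "Department": "Sales", "Location": "Chicago"},
]


def apply_facets(all_items, selections):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in all_items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(all_items):
    """Count the items under each term of each facet, as a sidebar would show."""
    return {facet: Counter(item[facet] for item in all_items) for facet in FACETS}


# The order in which the facet values are chosen does not change the result.
results = apply_facets(items, {"Document type": "Report", "Department": "Finance"})
print([item["title"] for item in results])
print(facet_counts(results)["Location"])
```

Recomputing the counts on the filtered set is what lets the displayed terms narrow along with the results as the user browses.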

The data set that is filtered by the facets could be the entire set of content, but more likely it is a subset, based on a prior execution of either a category selection or a search. If the user’s first step was to initiate a search to obtain search results, and the user then applies facets to limit those results, this might be called “faceted search”: even though the user browses the facets, the facets are introduced as a second step following a search. If, however, the user’s first step was to browse subject categories and select a category to obtain the initial data set, then the use of facets in the second step would more likely be called “faceted browse.” I would consider it better practice to call the process “faceted browse” in either case, regardless of how the initial data set was obtained. However, if it’s less confusing to the users, I will defer to those who prefer to call this process “faceted search.”

Another implementation of facets is to allow the user to select among limiting criteria from the beginning, without first selecting a subject by browse or search. In order to achieve usable results (result sets that are not too large), the facets need to contain relatively large taxonomies: a large number and deep set of terms. While it is certainly possible to display a large taxonomy for browsing, it may be difficult to display multiple large, browsable taxonomies, one for each facet. Therefore, if facets are made available to the user from the start (without first requiring the user to select a limited data set based on a search or browse selection), it is more likely that not all the facets will display the terms to the user. The user must then execute a search within a facet. This would correctly be called “faceted search.” It is also known as “fielded search” or “advanced search,” as a search field/box is made available for each facet “field.”
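
A rough sketch of this fielded style of search follows, assuming a simple case-insensitive substring match within each facet field. The records, the field names, and the `fielded_search` function are hypothetical, and real systems typically match typed text against a controlled vocabulary rather than raw strings.

```python
# Hypothetical indexed records with one term per facet "field".
records = [
    {"title": "Merger announcement", "Topic": "Mergers and acquisitions",
     "Company": "Acme Corporation"},
    {"title": "Earnings call notes", "Topic": "Corporate earnings",
     "Company": "Acme Corporation"},
    {"title": "Takeover rumor", "Topic": "Mergers and acquisitions",
     "Company": "Globex Industries"},
]


def fielded_search(all_records, queries):
    """Keep records in which every queried field contains the typed text.

    `queries` maps a facet name to the text typed into that facet's search box.
    """
    def matches(record):
        return all(
            text.lower() in record[facet].lower()
            for facet, text in queries.items()
        )
    return [record for record in all_records if matches(record)]


hits = fielded_search(records, {"Topic": "merger", "Company": "acme"})
print([record["title"] for record in hits])
```

Unlike the browse implementation, none of the terms has to be displayed here; the user needs to know, or guess, what to type in each field.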

The distinction between faceted browse and faceted search is lost, however, where the distinction between browsing and searching is becoming blurred. Newer user interface implementations of taxonomies are combining search and browse, so that the difference is no longer as obvious. For example, I have seen cases where there is a search box, and as the user types in something, a type-ahead feature matches the search string against controlled vocabulary terms, which are displayed in a short list under the box, and the user can browse the list to select a term. I have also seen a case where a user may be presented with a search box to enter search terms, and there is a button next to the search box, which the user may optionally click, and then the search box becomes a scroll box to view and browse the entire controlled vocabulary for that field. When these kinds of advanced taxonomy-enhanced search boxes correspond to facets, the distinction between “faceted search” and “faceted browse” truly no longer exists.
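
The type-ahead behavior mentioned above can be sketched as follows. This is a minimal illustration, not how any particular product works: the `type_ahead` function, its ranking rule (terms that start with the typed text first, then terms that merely contain it), and the sample vocabulary are all assumptions.

```python
def type_ahead(vocabulary, typed, limit=5):
    """Suggest controlled vocabulary terms that match what the user has typed.

    Terms starting with the typed text are listed first, followed by terms
    that contain it elsewhere; each group is sorted alphabetically.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    starts = sorted(t for t in vocabulary if t.lower().startswith(needle))
    contains = sorted(
        t for t in vocabulary
        if needle in t.lower() and not t.lower().startswith(needle)
    )
    return (starts + contains)[:limit]


# Hypothetical vocabulary for a single facet.
vocabulary = ["Taxation", "Taxonomies", "Tax law", "Income tax", "Thesauri"]
print(type_ahead(vocabulary, "tax"))
```

The user types as in a search but then picks from a short displayed list as in a browse, which is why the two designations blur in interfaces of this kind.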

Friday, March 16, 2012

Taxonomy Standards

I’ve written book reviews before, but recently a journal asked me to review a standard. It was ISO 25964-1 Thesauri and interoperability with other vocabularies, Part 1: Thesauri for information retrieval, which was published in 2011 by the International Organization for Standardization. I was pleased to have the opportunity, because this way I obtained a copy which otherwise costs about US$260 (or whatever the current exchange-rate equivalent of 238 Swiss francs is). Most taxonomists in the United States and beyond have some familiarity with the U.S. standard ANSI/NISO Z39.19 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies, not merely because it is American, but because the PDF document is freely available from the National Information Standards Organization (NISO).

So, how do the two standards compare? They are very similar in style, format, level of detail, use of illustrative examples, suitability as a reference, etc. Although explanations are not identical, it is clear that there was some degree of cooperation, consultation, or at least communication among the author teams of each. The differences between the two standards are in their scope, and that is obvious from their titles. The ISO standard covers both monolingual and multilingual thesauri in a single standard, whereas the ANSI/NISO standard takes up multilingual vocabularies in a separate document. Additionally, the ISO standard focuses on thesauri, leaving other types of vocabularies to the yet-to-be-published part 2 document, whereas the ANSI/NISO standard covers all kinds of controlled vocabularies within a single standard publication.

There are implications with these differences. By combining guidance on multilingual in addition to monolingual thesauri in a single document, monolingual taxonomists who read the ISO standard will broaden their awareness of the uses and possibilities of multilingual taxonomies, and that’s a good thing. On the other hand, a standard that appears from its title to be just about “thesauri” is likely to be overlooked by taxonomists who work with other kinds of controlled vocabularies.

The importance of the standards should not be overlooked. Taxonomies are only useful if they are well constructed, and decades of experience, practice, and use have indicated the conventions by which the most usable and useful taxonomies should be built. In addition to prescribing what works, the standards also encourage consistency. Consistently designed taxonomies thus become familiar to users, who then know how to use them with minimal training. Users don’t have to be told what a narrower term is and where to find it, or what a related term is and what its purpose is.

Taxonomy or thesaurus standards are a particularly useful resource to taxonomists. Other information management standards (such as for cataloging, indexing, bibliographic citations, etc.) have been reproduced, republished, disseminated, etc. by numerous professional organizations, nongovernmental institutes, educational institutions, and in numerous books. There is no need for the average information professional to look up the original, primary source standard. Taxonomy construction, however, is not such an established discipline or activity. In the field of taxonomies, professional membership organizations are lacking (except for divisions or special interest groups of larger organizations), academic courses are merely nonstandard electives, and books are fewer. The nature of the free-for-all style of the web, which is the platform for most taxonomies today, also poses challenges to conformity in style. Therefore, there is in fact a greater need for the average taxonomist to consult the original, primary source of standards.

For most individual taxonomists, I would suggest that the ANSI/NISO standard is sufficient, and there is no need to also read the ISO standard. However, for an organization or enterprise engaged in taxonomy building and implementation, the additional ISO standard is probably a good investment. Finally, any taxonomist involved in teaching or consulting would also find the ISO standard a valuable additional resource.

Sunday, February 26, 2012

Business Taxonomies

It’s difficult enough for professionals to come to a consensus on the definition of “taxonomy.” As for “business taxonomy,” it’s even worse. There are varying ideas of taxonomy, varying ideas of “business,” and varying ideas on what the connection should be, in addition to the scope and purpose. Is it a taxonomy used by a for-profit enterprise? Is it a taxonomy of business processes for use in any enterprise? Is it the same as an “enterprise taxonomy”?

Just as the term “taxonomy” has both a specific and generalized meaning, so does the term “business taxonomy.” The specific meaning of a taxonomy is a controlled vocabulary of concepts (terms) that are organized into a hierarchy, based on hierarchical relationships (broader/narrower, parent/child, group/member, superordinate/subordinate) between the terms. The generalized meaning of taxonomy is any kind of controlled vocabulary or set of controlled vocabularies (whether structured as lists, hierarchies, facets, thesauri, etc.) to support the organization and findability of content. The specific meaning of a business taxonomy is a taxonomy that is specifically for business use, dealing with business functions and processes. The generalized meaning of a business taxonomy is any taxonomy used by a business/enterprise, as opposed to a scientific discipline, to organize and manage its content.
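To make the specific meaning concrete, here is a minimal sketch in Python of a taxonomy as terms linked by broader/narrower relationships. The class name `Term`, the helper `paths`, and the sample terms are all hypothetical, chosen only for illustration; this does not represent any particular taxonomy tool or standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """A taxonomy concept with its narrower (child) terms. Hypothetical model."""
    label: str
    narrower: list[Term] = field(default_factory=list)

    def add_narrower(self, label: str) -> Term:
        """Create a narrower term under this one and return it."""
        child = Term(label)
        self.narrower.append(child)
        return child


def paths(term: Term, prefix: tuple[str, ...] = ()):
    """Yield the full broader-to-narrower path of every term, depth first."""
    here = prefix + (term.label,)
    yield " > ".join(here)
    for child in term.narrower:
        yield from paths(child, here)


# Illustrative sample terms only (assumed, not from any real taxonomy).
root = Term("Business functions")
finance = root.add_narrower("Finance")
finance.add_narrower("Accounts payable")
root.add_narrower("Marketing")

all_paths = list(paths(root))
for line in all_paths:
    print(line)
```

A taxonomy in the generalized sense would extend a model like this with nonpreferred terms, related terms, or separate facets, none of which are shown here.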

I would caution that a taxonomy designed to define and describe business processes and functions may not have the same objectives as the more common taxonomies whose purpose is to support the organization and findability of indexed content (documents, files, digital assets, etc.). In fact, even the term “taxonomy” in its purest sense does not mean that it has to be used for content management. The original taxonomies, such as the Linnean taxonomy of animals, plants and other organisms, were not designed for indexing and searching content associated with each concept in the taxonomy. Similarly, Bloom’s Taxonomy of educational concepts is not for indexing educational content but rather for defining the scope of educational objectives. Thus, a taxonomy could be just for classifying its terms/concepts/members for the sake of better understanding its members and their relationships. In this way, a business taxonomy, in its more specific meaning, with the focus on functions and processes, could serve the purpose of creating a better structure for an organization and improving business processes. The users of this kind of business taxonomy are the officers and managers of an organization, with a goal of improving overall management, rather than all content users.

Furthermore, the business functions/process taxonomy can be more generic, and the same taxonomy, such as a Sales, General & Administrative (SG&A) taxonomy, with modifications, could be used by different organizations. In contrast, a taxonomy for content management and retrieval, especially when it is product/service-focused, should be custom-designed and developed to reflect the nature of the content and the goals of its users. The larger an enterprise is, the more unique its particular business mix and content are. That’s why the largest enterprises tend to have taxonomists on staff.

Yes, the more generic “business taxonomy” and “enterprise taxonomy” are terms often used interchangeably. However, I prefer it when the term “enterprise taxonomy” is used to mean specifically a taxonomy (or set of inter-related taxonomies) that is intended for use enterprise-wide. This is an important designation, because within an enterprise, taxonomies are often siloed. Integrating them and designing a unified taxonomy that cuts across all departments to support the broadest sharing of content across the enterprise is an important goal of an “enterprise taxonomy.”

The term “taxonomy” might sound too technical or scientific to business owners and managers who don’t understand exactly what it is or what it can do. Calling it a “business taxonomy” is sometimes a sort of marketing technique used by taxonomy consultants to suggest that a taxonomy is something standard for businesses and something the business needs. It often works, but ultimately the term “business taxonomy” has resulted in confusion as well.

Friday, February 3, 2012

Taxonomy Training Workshops

I give a workshop in creating taxonomies in two formats, full-day in person and online. The question sometimes comes up from prospective participants as to the differences. Since a full-day onsite workshop is coming up soon, this would be a good time to address the similarities and differences.

Both workshops cover essentially the same content with a similar outline. Some of the examples are the same, and the participant exercises are the same, too. The workshops address the same diverse audience, ranging from the quick-learning beginner who has at least a background in information science to someone who is already experienced in creating taxonomies, but within a limited context, and who seeks to broaden those skills to more applications. In both kinds of workshops, the audience is also diverse in its professional backgrounds: librarians, corporate content managers and knowledge managers, indexers, web usability professionals and information architects; from industry, academia, nonprofits, and independent professionals. With such a wide diversity of backgrounds, the online workshop seems to resonate a little better with participants, none of whom then feels like a minority in a classroom of other types.

There is an organizational difference: the outline of the onsite PowerPoint-based workshop has 10 topics, whereas the online workshop comprises 5 weekly lessons: (1) an introduction of examples and applications, (2) software for creating taxonomies, (3) hierarchical and associative relationships, (4) preferred term wording and nonpreferred terms, and (5) miscellaneous topics of project processes, governance, folksonomies, and taxonomy jobs. Two onsite workshop topics may be covered in one weekly online lesson, although the onsite workshop does have the additional topics of the sources for terms and the comparison of hierarchical taxonomies with alphabetical indexes (when presented as a pre-conference workshop for the American Society for Indexing). The order of topics is also different. The online workshop introduces software earlier on, so students have the option of using trial software to apply principles learned in later lessons.

The use of software is a significant difference between the two workshops. In the onsite workshop, I give demos of Synaptica and Data Harmony Thesaurus Master, both web-based, and the PC software MultiTes. In the online workshop, participants access the demo software themselves, with the additional option to download the trial Mac software of Cognatrix (which I don’t demonstrate in my onsite workshop, since I don’t use a Mac). Obviously, you can learn more when you try out the software yourself. Trial versions of MultiTes and Cognatrix are available to the public, but trials of Synaptica and Data Harmony are not; they are made available by special arrangement for students of the workshop.

Q&A is more dynamic and engaging in the classroom setting. Although the online workshop has discussion forums, there is no simultaneous chat. The technology is there, but the problem is that a continuing education workshop comes in addition to everyone’s full-time job and personal life. With participants spread out over different time zones, too, it would be too difficult to find an agreeable time of day to chat. In the classroom it’s easier and less inhibiting to raise a question or make a comment. Online, it’s in writing, permanent for the duration of the course, and your name is attached to it. Thus, the online discussion of the workshop has usually been less than optimal.

Then there are the obvious differences. Some people learn better by listening to a speaker, and some people learn better by reading texts on their own. Convenience of location and timing will also make a difference. The onsite workshop is usually offered only once a year (although a customized corporate onsite version is an option), whereas the online workshop is offered every other month and is accessible by Internet globally. However, the latter tends to fill up 2-3 months in advance, and the onsite workshop usually has room for same-day registrations (at a higher cost).