Friday, October 19, 2012

Taxonomies for Multiple Kinds of Users



This week, I again attended the annual Taxonomy Boot Camp conference held in Washington, DC, the only conference dedicated to taxonomies. The main theme I came away with this year is that taxonomies serve diverse audiences and users.

The theme of different users was best exemplified in a session dedicated to comparing taxonomies for internal and external use. Representatives from Johnson Space Center (JSC), Astra-Zeneca, the Associated Press (AP), and Sears gave examples in the panel “Representing Internal and External Taxonomy Requirements in a Taxonomy Model,” moderated by Gary Carlson. While still remaining connected, internal and external taxonomies not only have different terms for the same concept but may also have different structures. According to Joel Summerlin of AP, internal taxonomies can be more specialized and complex than external taxonomies, and internal taxonomies need to support greater precision in retrieval results, whereas external taxonomies need to support greater recall.
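
To make the precision/recall distinction concrete, here is a minimal sketch in Python. The document IDs and the two result sets are invented for illustration and do not come from the AP presentation; the formulas are the standard ones (precision is the share of retrieved items that are relevant, recall is the share of relevant items that are retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four documents are truly relevant to the user's need.
relevant = {"d1", "d2", "d3", "d4"}

# A narrow, specialized term (typical of an internal taxonomy) retrieves
# few documents, all of them relevant.
narrow_term_results = ["d1", "d2"]

# A broad term (typical of an external taxonomy) retrieves every relevant
# document, along with some noise.
broad_term_results = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

print(precision_recall(narrow_term_results, relevant))  # (1.0, 0.5)
print(precision_recall(broad_term_results, relevant))   # (0.5, 1.0)
```

In this toy case the narrow term favors precision and the broad term favors recall, which is the trade-off described above.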

Even within either the internal or external users of a taxonomy, there is great variety. But unlike the situation of internal and external taxonomies, where you can have different taxonomies linked together, you will have a single taxonomy serving a diverse audience. The use of taxonomy features of polyhierarchy and nonpreferred (aka synonym) terms can help diverse users with different vocabularies, perspectives, and approaches find their way to the desired content.
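
As a rough illustration of how these two features work together, here is a small Python sketch. The class, its method names, and the sample retail terms are all hypothetical; this is not how any particular taxonomy tool stores its data.

```python
class Taxonomy:
    """Toy taxonomy supporting polyhierarchy and nonpreferred terms."""

    def __init__(self):
        self.broader = {}       # preferred term -> set of broader (parent) terms
        self.nonpreferred = {}  # lowercased variant -> preferred term

    def add_term(self, term, broader=()):
        # A term may list more than one broader term: that is polyhierarchy.
        self.broader.setdefault(term, set()).update(broader)
        for parent in broader:
            self.broader.setdefault(parent, set())

    def add_nonpreferred(self, variant, preferred):
        self.nonpreferred[variant.lower()] = preferred

    def resolve(self, query):
        """Map whatever the user typed to a preferred term, or None."""
        q = query.strip().lower()
        if q in self.nonpreferred:
            return self.nonpreferred[q]
        for term in self.broader:
            if term.lower() == q:
                return term
        return None

    def paths(self, term):
        """List every route from a top term down to `term`.

        Assumes the hierarchy has no cycles.
        """
        parents = self.broader.get(term, set())
        if not parents:
            return [[term]]
        found = []
        for parent in sorted(parents):
            for path in self.paths(parent):
                found.append(path + [term])
        return found


tax = Taxonomy()
tax.add_term("Refrigerators", broader=["Large Appliances", "Kitchen"])
tax.add_nonpreferred("fridges", "Refrigerators")

print(tax.resolve("Fridges"))
print(tax.paths("Refrigerators"))
```

A shopper who types "fridges" and one who browses down from "Kitchen" both arrive at the same preferred term, even though they started with different vocabularies and different starting points.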

In the session on internal and external taxonomies, the diversity of internal users was mentioned by Sarah Berndt as a characteristic of JSC. In another session, Helen Clegg described the process of building an enterprise taxonomy at the consulting firm AT Kearney, which has employees in different countries and in different industry specialties. As for external users, Jenny Benevento of Sears described how the customers of its retail website range widely, from repeat shoppers of clothing to those making one-time purchases of engagement rings to those buying large appliances. From the audience, Paula McCoy of ProQuest commented on the importance of knowing, before planning the indexing, who the users of its different database products are.

Other sessions, such as “Taxonomy & Information Architecture,” also addressed the multiple uses and users of taxonomies. Panelist Gary Carlson explained how different personas are used in designing websites, and that the kinds of things that the user-persona seeks or needs can then become taxonomies or facets.
Overall in various sessions of the conference there was a great diversity of taxonomy types, and thus taxonomy users, described. These included:
  • Enterprise taxonomies for internal users, with a set of three presentations under the title of “Enterprise Taxonomies in Action”
  • Public web site taxonomies, as in the case study example of the Consumer Product Safety Commission and additional examples from the keynote.
  • Retail ecommerce taxonomies, as in the example of Sears and additional mentions of Target and REI in other presentations.
  • Taxonomies used for article indexing and then retrieval by library patrons of periodical/reference databases, as described in a presentation about ProQuest.

Not only may the same taxonomy be targeted at different users at once, but it may also be targeted at different users over time. In the closing keynote, Patrick Lambe observed that taxonomies can further add value when we make them available for re-use.

Finally, the conference itself attracted a diverse audience: taxonomists, information architects, data warehouse managers, search specialists, knowledge managers, and others; those from corporations in all industries, government, and nonprofits; and those both new to and experienced with taxonomies. In fact, it’s rare that you would find such a diverse audience at a professional conference. They are united in their need to make information findable, and they understand the value of taxonomies to make that happen.


Tuesday, October 9, 2012

Text Analytics and Taxonomies



What does text analytics have to do with taxonomies? Not so much, I had previously assumed, other than serving a similar objective of information retrieval. After all, text analytics is known as a natural language processing technology designed to obtain meaning from text without the traditional process of indexing to a taxonomy. At the recent Text Analytics World conference in Boston on October 3 and 4, however, I learned that text analytics is much more and that the ties between text analytics and taxonomies are greater than I assumed.

The concept of text analytics is used more broadly than I realized, and, as defined in the opening keynote given by conference chair Tom Reamy, encompasses:
  • Text mining, based on natural language processing, statistics, and machine learning
  • Entity extraction, semantic technology that enables “fact extraction”
  • Sentiment analysis, comprising various methods to look for positive and negative words
  • Auto-categorization, which is often rules-based
I was a presenter at this conference, and since I always talk about what I know, which is taxonomies, I endeavored to make a connection between taxonomies and text analytics. But to my surprise I was not the only one talking about taxonomies at Text Analytics World. Two other presentations featured “taxonomies” in their titles, and together with mine they made up a half-afternoon “Text Analytics and Taxonomies” track. Furthermore, the subject of taxonomies was central to four other presentations and mentioned in a couple of others.

My presentation, "Taxonomies for Text Analytics and Auto-Indexing," described how text analytics can be used with auto-categorization and taxonomies to achieve relatively high quality automated indexing results. Auto-categorization is a type of automated indexing that tends to make use of taxonomies, as categorization requires categories (taxonomy terms). Text analytics can be used as a technology to generate meaningful terms from texts, which in turn can be used to auto-categorize content against a pre-existing taxonomy. Auto-categorization typically involves technologies of either complex rules to match terms or algorithms and machine learning. In either case, the terms picked up in auto-categorization would be more meaningful if they were first extracted with text analytics technologies based on natural language processing.
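
The rules-based flavor of auto-categorization can be sketched in a few lines of Python. This is only an illustration: the category names, the rule format ("any" phrases that trigger a category, "none" phrases that block it), and the sample sentences are made up, and the simple phrase matching below is a stand-in for the natural language processing step, not an example of it.

```python
import re

# Hypothetical rules: each taxonomy term is assigned when at least one "any"
# phrase appears in the text and no "none" phrase does.
RULES = {
    "Taxonomies": {"any": ["taxonomy", "taxonomies", "controlled vocabulary"]},
    "Text analytics": {"any": ["text analytics", "text mining", "entity extraction"]},
    "Sentiment analysis": {"any": ["sentiment"], "none": ["market sentiment"]},
}


def normalize(text):
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def categorize(text, rules=RULES):
    """Return the taxonomy terms whose rules match the text."""
    padded = f" {normalize(text)} "  # padding makes whole-word matching easy
    assigned = []
    for category, rule in rules.items():
        has_any = any(f" {phrase} " in padded for phrase in rule.get("any", []))
        has_none = any(f" {phrase} " in padded for phrase in rule.get("none", []))
        if has_any and not has_none:
            assigned.append(category)
    return assigned


print(categorize("Entity extraction can suggest terms for a taxonomy."))
print(categorize("Market sentiment fell sharply on Friday."))
```

The second sentence shows why real rules get complex: the word "sentiment" alone would misfile a financial news item, so an exclusion phrase is needed. Terms extracted by genuine text analytics would give such rules better input than raw string matching does.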

Another presentation looked at a different side of the relationship between taxonomies and text analytics. Text analytics is also used as a means to build taxonomies in the first place, by providing suggested terms that a taxonomist can then edit. Edee Edwards and Rena Morse of Silverchair Information Systems presented a case study on using text analytics to generate terms for taxonomy development. It required multiple iterations and refinements.

Other presenters on the subject of taxonomies and text analytics included the following:
  • Heather Edwards of the Associated Press explained how AP classifies the news using a custom-built taxonomy and rule-based auto-classification system.
  • Evelyn Kent of MCT SmartContent also presented how news items are classified using a “context-based language” (taxonomy), and even demonstrated how the taxonomy is managed in the taxonomy tool (SmartLogic Semaphore Ontology Manager).
  • Anna Divoli of Pingar presented survey results of taxonomy user interface preferences from cases that involved automatically generated hierarchical and faceted taxonomies.
  • Alyona Medelyan also of Pingar discussed “controlled indexing” in her case study, which featured results of comparing human versus automated indexing (using machine learning and training sets) using the same taxonomy (the Agrovoc agriculture thesaurus of the FAO).
  • Sarah Ann Berndt of the Johnson Space Center spoke about “automatic generation of semantic markup” in a presentation that turned out to be mostly about the application of a taxonomy.
The subject of taxonomies had also come up in the opening keynote. Tom Reamy described three themes in text analytics: big data, sentiment analysis of social media, and enterprise text analytics. In all three areas he mentioned taxonomies. In the area of text mining and big data, text analytics can serve as a means of semi-automated taxonomy development. In sentiment analysis, new kinds of taxonomies are being developed for emotional sentiments. In enterprise search, text analytics bridges the gap between taxonomies and documents.

Even if text analytics and taxonomies are combined in different ways, what is common is that combining techniques, tools, and technologies in more challenging situations achieves better results. Techniques, tools, and technologies in this field do not have to compete, but can complement each other.

Wednesday, September 12, 2012

Mentoring Taxonomist Program

In my last blog post, I discussed the need for mentoring taxonomists and mentioned that I had volunteered to lead the new mentoring committee of the Taxonomy Division of SLA (Special Libraries Association) and establish its mentoring program (http://taxonomy.sla.org/get-involved/mentor). While some of the mentoring activities are available to members only, other mentoring services can involve anyone, so I will describe them here.

Frequently Asked Question Resources

In many cases, those new to taxonomies simply have questions about the taxonomy field. Therefore, the initial and primary activity of the SLA Taxonomy Division’s Mentoring Committee has been to develop a detailed list of Frequently Asked Questions (FAQs) and answers, which total 35 to date.

The issue as to whether the answers should be a service to Taxonomy Division members only or to the public was resolved by having short answers of 1-3 sentences for the public, and longer answers of 150 – 250 words on separate web pages accessible to members only with their login. (Members also have the ability to submit additional questions to the FAQs.) The FAQs with the short answers are available under the Mentoring section of the public website: http://taxonomy.sla.org/get-involved/mentor/taxonomy-faqs

Mentor and Protégé Directories

Connecting aspiring taxonomists (whom we are calling protégés) with experienced taxonomists, who volunteer to be mentors, is another objective. While it is neither practical nor feasible for the Taxonomy Division to provide direct individual mentoring services or to match mentors to protégés, it can act as a clearinghouse by providing directories on its web site of both willing mentors and interested protégés. In the past few months, I have set up both a Mentor Directory and a Protégé Directory, and it is not required that people be listed in one directory in order to contact those listed in the other directory.

Mentor Directory

Access to mentors is, as expected, a membership benefit. Thus, the Mentor Directory is accessible by membership login only. Mentors are SLA Taxonomy Division members who have considerable experience in some aspect of taxonomies and are willing to volunteer limited time in mentoring for the benefit of their professional growth and prestige. Mentors listed in the Mentor Directory:
  • should be available for answering specific individual questions about the taxonomy field, education/training, and job prospects, which the general FAQs cannot answer.
  • probably could help out a protégé who brings his/her own project
  • most likely do not have projects to offer in an internship type of relationship (but might)

Protégé Directory


Taxonomy Division members who have had at least some training or exposure to taxonomies and would like to gain the benefits of mentoring may list their names in the Protégé Directory, which is displayed on the website:
http://taxonomy.sla.org/get-involved/mentor/directory-of-proteges

Protégés may seek a mentoring relationship for taxonomy projects in either of the following two scenarios:
  1. The protégé is looking for a temporary internship or training arrangement, expecting lower than average pay or no pay in exchange for (1) the opportunity to work without prior experience, (2) useful feedback from the supervisor-mentor, and (3) the ability to use the supervisor-mentor as a future work reference.
  2. The protégé has a pending or existing taxonomy project (whether at work, a freelance project, or a volunteer project) and is seeking advice on aspects of the taxonomy design and/or feedback on initial taxonomy work.
Responses to either of these two kinds of mentoring possibilities are still expected to be relatively low, so the Taxonomy Division is permitting nonmembers who can mentor to contact listed protégés. In the case of the first scenario in particular, many qualified taxonomists who are willing to mentor simply don’t have suitable projects or company legal permission to bring on temporary interns or subcontractors at below-market rates. Non-profit organizations, though, are more likely to have arrangements for volunteers.

Therefore, if you are looking for a taxonomist intern whom you are willing to mentor, check out the Protégé Directory. If you are looking to be mentored, then join SLA and its Taxonomy Division and list yourself in the directory.

Wednesday, August 22, 2012

Mentoring Taxonomists: The Need

As explained in Chapter 2 of my introductory book on taxonomy creation, The Accidental Taxonomist, the majority of taxonomists did not intend to be taxonomists, and they come to the field by accident from various backgrounds. What this means is that most people who find they want to or need to do work as taxonomists are already into their careers and are no longer students with access to full courses. Workshops through conferences or continuing education programs (such as the workshop I teach) are certainly very helpful, but they are of limited duration and not always available. Thus, on-the-job training or mentoring is the most likely way that many people learn how to design and create taxonomies. Just look at the LinkedIn resumes of many practicing taxonomists, and you will see that the education of the majority of them was not in library and information science but in some other field, and that through a series of jobs somehow along the way they learned taxonomy skills on the job.

Another reason why on-the-job training or mentoring is important in the taxonomy field is that taxonomy work is often quite specialized for a particular application. Taxonomies for website navigation, for ecommerce, for supporting an auto-categorization tool, for supporting human indexers, for digital asset management metadata, or for content management systems are not the same and have nuanced differences in their design aside from any subject matter differences. Taxonomy “standards” are actually just guidelines which allow flexibility. Thus, on-the-job training can be more relevant than the theoretical study of taxonomies or than a continuing education workshop that must take a generic approach to accommodate diverse students.

Not everyone is fortunate to have on-the-job training or senior colleagues or supervisors who can act as mentors. I had this opportunity, though, and in retrospect, it was the defining point in my career: the period of about three years when I worked at what was then Information Access Company, first in collaboration with and then as a new member of the vocabulary management department. I got the vocabulary manager (aka taxonomist) position as an inside hire familiar with the controlled vocabularies as an indexer, but I subsequently learned best practices for taxonomy editing and management from my senior colleagues, my supervisor, and also from a visiting consultant.

Due to the nature of the field, though, it is not unusual for the new taxonomist to be the sole person responsible for taxonomies in an organization and thus lack the support of coworkers with any experience in taxonomies. The new taxonomist must then look elsewhere for mentoring support. Online discussion groups can provide some support in answering simple questions, as long as the assistance does not require anyone else to actually look at the work. A hired taxonomy consultant can also serve as an excellent mentor if you structure the relationship in that way, although this may not be in your budget. Another place to turn for mentoring assistance could be professional associations.

Thus, I accepted when asked last year if I would volunteer to lead the new mentoring committee of the Taxonomy Division of SLA (Special Libraries Association), a professional membership association to which I belong. Saying that I support mentoring and actually trying to create and foster a mentoring program, however, are quite different matters. The Taxonomy Division chair at the time suggested creating a list of FAQs and answers on the member website as the primary means of mentoring members. While FAQs are a useful resource, this is not what I had in mind for mentoring. Connecting aspiring taxonomists (protégés) with experienced taxonomists who volunteer to be mentors would be ideal. Whether this is an achievable ideal or not remains to be seen. For now, I have set up the structure of the mentoring programs, as described on the SLA Taxonomy Division website. Now, we just need to encourage participation. My next blog post will describe the program in more detail.

Saturday, July 28, 2012

The Accidental Taxonomy Consultant

It’s well known that most taxonomists become taxonomists by accident, as the title of my book attests.  As I look back on my career, I see this progression continuing one step further in accidentally becoming a taxonomy consultant.

Not all consultants are accidental, though. Bright college graduates in the social sciences with strong analytical skills are often attracted to entry level jobs at consulting firms.  They then pick up technical consulting skills by practice over time, and these could even involve taxonomy work.  As such, they are not accidental consultants, but they may become accidental taxonomy consultants.

Those who are already taxonomists, like myself, often end up as consultants, because that’s where they find the work. Full-time taxonomist jobs are still relatively rare and are often not in one’s geographical location. So, if an experienced taxonomist loses a job due to a layoff or relocation, and looks around and cannot find another conveniently located taxonomist job, consulting becomes an option. Employers of full-time taxonomists tend to be limited to either certain industries (publishing, media, ecommerce, etc.) or to very large companies in any industry with large internal content management needs, but then the taxonomist job is only at their headquarters location. However, companies of all industries and various medium to large sizes have taxonomy needs and can often afford a taxonomy consultant on a temporary project if not a full-time staff member. Thus, taxonomy consultants are in greater demand than are full-time employed taxonomists.

In seeking to contract a taxonomy consultant, you may wonder whether it is better to hire a consultant-turned-taxonomist or a taxonomist-turned-consultant. If you hire a skilled taxonomist who is less experienced in consulting, you ought to get a good taxonomy, although the process might not be that smooth. More likely, though, the experienced taxonomist who is inexperienced in consulting will not make as good a first impression or sell the services as well as a professional consultant. The professional consultant-turned-taxonomist will provide a better project experience, although the end-result taxonomy may not be as good. If you can plan and manage the project yourself, then it is the experienced taxonomist you want, but if you want the entire project managed by a consultant, you need a good consultant.

You might not have to compromise, though. A senior enough consultant could be sufficiently skilled in both consulting and taxonomies that the career sequence does not matter. If you can afford to hire a firm or partnership, or even a consultant with subcontractors, you may not need to make the choice of experience either, because you can hopefully get some of each on the consultant team serving you. That’s why you should look at the resumes of each member of a consulting team, to ensure that at least one member has very solid taxonomy experience, while at least one other member has considerable consulting and project management experience.

Among the things I have learned about consulting is that it helps to have standard consulting processes and procedures, including standard questions that the consultant should ask the client at the very beginning of a project to clarify the scope and understand the context. Consulting firms may additionally have standard deliverables, reports, etc. But in the particular field of taxonomy consulting, the variables are too great, and standard deliverables rarely fit.

There are a lot of books on consulting, but none about taxonomy consulting. When I came across a potential title, Information Consulting: Guide to Good Practice (Chandos Publishing, 2011), I found that even this book addressed consulting more generally, and when it occasionally discussed “information consulting” it was more about the work of independent research librarians. So, accidental taxonomy consultants lack written guidance that is just for them.

This is my story. I became a taxonomist by accident. Then after getting laid off, more than once, I became a taxonomy consultant by accident. Then I joined a consulting company of intentional consultants, some turned taxonomy-consultant by accident, but I did not feel I fit in with them or their choice of projects, since I was a taxonomist first. So, I recently chose to go on my own again as an independent consultant, partnering with others on a case-by-case basis.