It’s becoming more common to index images with taxonomy terms, rather
than indexing only text documents or merely keyword-tagging images. A
taxonomy for the subject indexing of images need not be significantly different from a
taxonomy for indexing textual documents, but other metadata differs, and
the indexing activity is also quite different.
A dedicated taxonomy for images might be needed for various reasons:
1. The organization does not subject-index its text documents.
2. The same organization uses different software systems to manage images and to manage text documents.
3. The organization’s text documents are large and thus indexed or cataloged at a broader level.
1. No text indexing
Some organizations have a large image collection, and that is what they
focus their indexing efforts on. They thus design or adapt a taxonomy
specific to their image collection. They likely do not have any
taxonomy for indexing text. They either don’t find the need for text
document search and retrieval, or if they do, they simply use a
search engine instead, since search engines can search text but cannot
search the content of images.
2. Different systems
Large image collections are increasingly managed in dedicated digital asset
management systems, which are designed to support the various metadata
associated with images and other nontext media files. Text documents, on
the other hand, may be managed in document management systems, records
management systems, or collaboration systems such as SharePoint. Each of
these kinds of systems supports some form of controlled vocabulary for
tagging content. But if the images are in one system and the text
documents are in another system, different controlled vocabularies are
likely to be developed. Of course, a generic “content management system”
may be used for both images and text documents, but many organizations
don’t manage all their content in a single system.
3. Different levels of indexing detail
The classic example of different levels of detail is the
Library of Congress, which developed its Subject Headings for the
descriptive cataloging of library materials, which are generally
monographs, such as books, video recordings of films, or sound
recordings of music collections. While the subjects of these works might
be quite specific, they are often not as specific as the subject of an
individual graphic material. (An entire book may have numerous specific
images.) But over the years, individual images also became part of its
collection, and the LC Subject Headings were not specific enough, so the
Library of Congress developed the Thesaurus for Graphic Materials,
which is freely available. The fact that the Thesaurus for Graphic
Materials exists does not mean that a dedicated thesaurus for images is
always needed, but that it was needed in the context of the Library of
Congress collections and the shortcomings of the Library of Congress
Subject Headings for indexing images.
If you already have a detailed taxonomy for
documents, it certainly can be used for images as well. Some terms,
such as those for abstract concepts (for example, “Beliefs”), will simply
not be needed in the image indexing, whereas new terms might need to be
added (such as the name of a specific type of flower).
There is definitely unique metadata for images, of which subjects for
indexing are just a part. Examples of other possible image metadata include
Creator/photographer, Location shown, Location of creation (camera
location), Collection name, Time or part of day (especially if
outdoors), Date taken (in contrast to the date the image was digitized or
edited), Number of people depicted, Copyright, and Intended purpose.
The Thesaurus for Graphic Materials has a separate “genre” facet
that is very specific for types of graphical works (such as terms for
Abstract paintings, Family trees, HVAC drawings, and Magazine covers).
Image metadata standards include the Photo Metadata standard of the IPTC
(International Press Telecommunications Council) for photojournalism.
Different metadata may be needed for different kinds of images (news, commercial/advertising, art, etc.).
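To make the list of image metadata above more concrete, here is a minimal Python sketch of what a single image record might look like. The class name, field names, and sample values are all hypothetical, chosen only to mirror the elements mentioned above. They are not taken from the IPTC standard or from any particular digital asset management system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ImageRecord:
    """Hypothetical metadata record for one image (illustrative field names)."""
    filename: str
    subjects: List[str] = field(default_factory=list)  # taxonomy terms
    genre: Optional[str] = None            # type of graphical work
    creator: Optional[str] = None          # creator/photographer
    location_shown: Optional[str] = None
    camera_location: Optional[str] = None  # location of creation
    collection: Optional[str] = None
    time_of_day: Optional[str] = None
    date_taken: Optional[date] = None      # not the digitization date
    date_digitized: Optional[date] = None
    people_count: Optional[int] = None     # 0 is a valid value
    copyright_notice: Optional[str] = None
    intended_purpose: Optional[str] = None

    def missing_fields(self, required):
        """Return the names of required fields that have no value yet."""
        missing = []
        for name in required:
            value = getattr(self, name)
            if value is None or value == [] or value == "":
                missing.append(name)
        return missing


# Made-up sample record, for illustration only.
photo = ImageRecord(
    filename="harbor_0412.jpg",
    subjects=["Harbors", "Sailboats"],
    creator="J. Doe",
    date_taken=date(2015, 6, 14),
    people_count=0,
)
print(photo.missing_fields(["subjects", "creator", "copyright_notice", "people_count"]))
```

Keeping the subject terms in their own field, separate from the other elements, reflects the point that subjects for indexing are just one part of image metadata.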
Indexing images is different from indexing text documents. First of all,
it’s mostly manual, because automated detection of image content is very
limited (although it may be able to detect people’s faces). It’s also more
subjective as to what is of key importance in an image than in a document.
An indexer may also tend to index not for what is actually depicted but
for what is implied, which often, but not always, should be avoided.
I recently attended a conference presentation on this subject, “Get the Picture: Use Your Taxonomy to Classify Images”
at the SLA conference in Boston earlier this month. The presenter, Ann
Poole from Corbis, mentioned various challenges of image indexing,
including over-indexing by photographer-submitters, indexing for
emotions depicted or implied, and indexing for the backstory of an image
in a known place.
Topics related to information management taxonomies posted by the author of the book, The Accidental Taxonomist.
Wednesday, July 1, 2015
Thursday, June 4, 2015
Taxonomist Trends
Last month I conducted an online survey of 150 taxonomists
(described in my last blog post).
Although the results will be used in another publication, it is
interesting to note at this time a few comparisons between the results of this
survey and those of a similar one I conducted in late 2008 for my book, The
Accidental Taxonomist. While I added further questions this time, some of the questions
stayed the same for comparison.
We would expect over time that more taxonomists have been doing the work for longer. While this is the case for those in the field for 8-15 years, for those involved for the longest period, over 15 years, the survey results surprisingly did not indicate this. Those who have done taxonomy work for 15 years or more were 26.2% in 2008 but only 17.6% now. The raw numbers for over 15 years did, however, increase. So, the survey percentages indicate that there are proportionally more people who have been involved in taxonomies for an intermediate period of time. At the most beginner level, the number and percentage of respondents with less than a year of experience in taxonomies declined, from 9.2% to 3.4%. Those with 1-4 years of experience are about the same, and those with 4-15 years of experience increased from 32.4% to 41.2%. So, these numbers could indicate a maturing of the taxonomist profession, but not a graying of the field.
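The contrast between a falling percentage and a rising raw number can be checked with simple arithmetic. The sketch below assumes that all 150 respondents of the current survey answered this question, which may not be exact, since questions were optional. The 2008 respondent total is not stated in this post, so the code only derives the limit it would have to stay under.

```python
# Shares of respondents with more than 15 years of experience (from the post).
SHARE_2008 = 0.262
SHARE_2015 = 0.176

# Assumption: all 150 respondents of the current survey answered this question.
TOTAL_2015 = 150

# Approximate raw count in the current survey.
count_2015 = SHARE_2015 * TOTAL_2015

# The 2008 raw count is SHARE_2008 * total_2008. For that count to be
# smaller than the current one, total_2008 must stay below this bound.
max_total_2008 = count_2015 / SHARE_2008

print(f"Current respondents with 15+ years: about {count_2015:.0f}")
print(f"2008 respondent total must be below about {max_total_2008:.0f}")
```

In other words, a smaller share of a larger pool of respondents can still be a larger number of people.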
The taxonomist work situation has not changed much with respect to taxonomy work being a primary vs. secondary job responsibility or with respect to freelance vs. full-time employment. There was a noticeable difference, though, among those who are freelancers (totaling 17% before and 16% now): more of them are now doing freelance taxonomy work only “occasionally” than before (8% now compared with 4.7% in 2008), and not as many are doing it “often” as before (8% compared with 12.5%). The fact that there is work for those who want to do freelance taxonomy work only occasionally, whether on top of another job or in combination with other kinds of freelance work, is encouraging for those individuals who want to gradually break into taxonomy work.
Regarding professional and educational background, the leading degree and prior profession of taxonomists today remains that of librarian, and the percentage has, in fact, increased slightly. Meanwhile, those with a technical background have proportionally decreased. The percentage with an MLS/MLIS degree increased from 48.4% to 54.4% of respondents, and among the options for prior work experience, “librarian” increased from 27.7% to 28.3%. Those with an M.S. or M. Eng. degree decreased from 14.1% to 8.7%. Those with a background in Software/IT decreased from 12.3% to 8.3%, and those with a background in database design, development, or administration decreased from 6.2% to 1.5%. While the taxonomy field can certainly benefit from those with a technical background, it is not a necessary skill, and the smaller share of IT people doing taxonomy work since 2008 might be due to an improvement in the economy, with more of those people having found work in IT again.
In other areas, knowledge management, content management,
and content strategy are backgrounds that have become more common, whereas “document
management” has decreased. This is likely due to the fact that “content” of
various formats is becoming more common than mere “documents.” Digital asset
management was not even presented as an option, but three respondents wrote it in
the blank under “Other.”
Despite the preponderance of MLS/MLIS graduates, still only a minority of respondents had training in taxonomies/classification in college courses, and only a few percentage points more than before, merely reflecting that there were more MLS/MLIS graduates. Those having taken continuing education courses or workshops on taxonomies increased from 13.8% to 20.1%, but there are now more such courses that did not exist before (including mine). On-the-job training remains the primary means of learning how to create taxonomies. There has been a slight increase in on-the-job “formal” training over “informal” learning and experience, with the percentage with formal on-the-job training having increased from 21.5% to 28.9%. This particular survey question permitted multiple responses. The leading response, informal on-the-job learning, was 71.1%, but this was the only response option with a decrease (down from 83.1%). This is a good sign that taxonomists seem to be learning the skill through more varied means than the dominant on-the-job experience.
Monday, May 11, 2015
Taxonomist Survey
I had created a survey of taxonomists to gather some
information for writing my book, The Accidental Taxonomist. It was mainly for
Chapter 2: Who Are Taxonomists? With the
word “taxonomist” in the title, I had to write something about taxonomists, and
not just about taxonomies, and this was the best way I could get more
information than some anecdotes from colleagues.
But that was in late 2008, 6½ years ago. Has there been
change in the industry since? In most fields, 6-7 years is not long at all, but
in the field of taxonomies, there could be changes. First of all, there have been
significant changes in the economy over that particular period (recession and
partial recovery), and, at least for internal, enterprise taxonomies, the role
of the taxonomist could be considered something expendable in tight economic
times. (I know, as I was laid off in 2008 and again in 2010.) More
significantly, the field of information science is evolving very rapidly. So, I released a new survey this month.
My previous survey had 9 multiple choice questions and one
open response. I chose to keep those questions with no changes or only minor
wording changes, in order to compare the changes over time. I also decided to
add a few more questions. To help me come up with the questions, I asked for
input from the audience of a presentation I gave last month (“Taxonomy
Displays: Bridging UX & Taxonomy Design” at the Content Strategy
Seattle Meetup). Suggestions from that group included questions on the size of
taxonomies, job titles, and taxonomy work pain points. The current survey now
has 14 multiple-choice questions, one very short answer (job title), and three
open responses, although all questions are optional, and it is permitted to
skip questions.
Where to find taxonomists to survey
In 2008, I could think of only one logical channel to find
taxonomists, the Yahoo group called Taxonomy Community of Practice.
But it is no longer the only group and no longer the most active. The Taxonomy
Community of Practice Yahoo group averaged only 5 messages per month in the
last 6 months. In contrast, in the 6 months around the time of my last survey,
this group averaged 39 messages per month. This is most likely because the
LinkedIn group of the same name, Taxonomy Community of Practice, which was created in September 2007, has
taken over most of the taxonomy discussions. Furthermore, there are additional LinkedIn
groups, such as “Controlled Vocabularies” and “Thesaurus Professionals.” The American Society for Indexing started a Taxonomies & Controlled Vocabularies Special Interest Group in late 2007, and SLA (Special Libraries Association) started a Taxonomy Division in 2009, both of which have member discussion
lists.
I have announced the current survey in all of these groups and
more. However, I do not expect to reach significantly more taxonomists than
before. That’s because, whereas the single Yahoo group back in 2008 tended to
be subscribed to by email (individual or digest), the proliferation of groups
and lists on similar or overlapping subjects has led subscribers/members to
opt out of direct emails. Additionally, email software, such as Gmail, can
filter messages from lists to a category/tab that users may choose to overlook.
So, my email announcements of the survey to groups may go unnoticed by many
group members. It would be tempting to individually contact everyone I know personally
who is involved in taxonomy work, but that could be a personal bias that would
skew the pool of respondents.
Taxonomist tendencies
There have already been enough respondents to the current
survey, that I can safely say that the largest number do taxonomy work as their
primary responsibility, as with the previous survey, and that, like before, the
majority are employees, rather than contractors, freelancers, or independent
consultants. The most common educational or professional background (although
not the majority) is library/information science. What is striking, though, is
that despite the fact that 48% of respondents in 2008 had an MLS/MLIS degree
(and from the early survey returns, the percentage is now even slightly higher),
only a small percentage of taxonomists learned taxonomy skills through formal
educational institution coursework. Self-teaching through reading, on-the-job
experience, on-the-job training, and conference workshops or seminars are each
methods of learning taxonomies that are more prevalent than college courses.
Additional, more specific comparisons will be the subject of a future blog post.
Saturday, April 25, 2015
Trends in Hierarchical Taxonomy Displays
Taxonomies connect users to content. So, how a taxonomy is displayed to users is very important in its effectiveness. This is a topic about which I gave a conference presentation back in 2011 and will present again next week. As I update my previous presentation, looking at some of the same public websites with taxonomies, I have observed some changes that might be considered as trends.
While faceted taxonomies (used to filter/refine/limit results by certain criteria with choices of taxonomy terms) have become more common on ecommerce or other database websites, they are not suitable in all circumstances, and when a taxonomy has a large number of topical terms, a hierarchical arrangement of those topics might be better.
Fully displayed hierarchical taxonomies, however, are more difficult to find. They are not as often the default. Some have disappeared entirely, such as the Yahoo directory, which was discontinued in December 2014 after 20 years. (Admittedly, trying to classify as many websites as possible into a hierarchy, as the web keeps growing, is a never-ending task.) In other cases, the search box is more prominent on the page, and the link to browse categories needs to be hunted for.
In the past, I had observed two main kinds of hierarchical displays: one-level-per-page and expandable hierarchies with plus signs. The first has evolved, the second has become rare, and a third method has emerged.
One level of taxonomy hierarchy per page was the design of the former Yahoo directory and was, early on, the style followed on other sites. An example that closely follows the Yahoo Directory is the dmoz/Open Directory Project. A list of category labels or topics at each level takes up the entire screen/page display, without the display of other content. Displaying additional content on every page has become important, so hierarchical taxonomy categories now tend to be confined to more compact lists to free up space on the web page for content. This works for some taxonomies, but not all. Meanwhile, a list of terms at the same level that takes up the entire page is a style that is rarely followed anymore.
Expandable hierarchy “trees,” typically with plus signs next to topics to expand a topic’s subcategories, have become quite rare, at least on public websites. An example is the USA Today topics. This hierarchical taxonomy design had been developed based on the recognizable desktop file folder structure, such as in Windows. In the meantime, users have become familiar with different representations of topic hierarchies on the web, so mimicking expandable file menus is no longer the only way to engage users. Expandable topic hierarchies are not as easy to update and change on websites, and they can make the web page take a long time to load. Expandable hierarchies allow users to have more than one hierarchical level expanded at once, which facilitates exploring the taxonomy. As much as we taxonomists might enjoy browsing a taxonomy, though, the goal is to get users to content rather than have them spend time exploring the taxonomy.
A third method of displaying multiple levels of a hierarchical taxonomy is through “fly-out” subcategory lists. Examples include Lynda.com (under "Browse the Library") and Books & Authors. I had not noticed this method before, so it seems to be a new trend. These lists are similar to submenus in website navigation, but rather than serving website navigation, the topics are linked to indexed content items, which are listed in a result set for each subtopic. Fly-out subcategories allow users to still see the parent category list, in case they want to back out to it, as in an expandable tree hierarchy. But unlike in an expandable tree hierarchy, you cannot have multiple parent categories expanded at the same time, which is not that important anyway. The fly-out subcategory style is thus a positive trend in hierarchical taxonomy displays.
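The difference between these display styles can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the taxonomy, the category names, and the function names are hypothetical and do not describe how any of the websites mentioned above are implemented.

```python
# Hypothetical taxonomy as nested dictionaries; category names are illustrative.
TAXONOMY = {
    "Business": {
        "Marketing": {},
        "Finance": {"Accounting": {}, "Investing": {}},
    },
    "Technology": {
        "Web design": {},
        "Programming": {},
    },
}


def level_at(taxonomy, path):
    """One-level-per-page view: names of the categories directly under `path`."""
    node = taxonomy
    for name in path:
        node = node[name]
    return list(node)


def expanded_tree(taxonomy, expanded, prefix=()):
    """Expandable-tree view: indented lines, descending only into expanded paths."""
    lines = []
    for name, children in taxonomy.items():
        path = prefix + (name,)
        if not children:
            marker = " "   # leaf category, nothing to expand
        elif path in expanded:
            marker = "-"   # currently expanded
        else:
            marker = "+"   # collapsed, can be expanded
        lines.append("  " * len(prefix) + marker + " " + name)
        if children and path in expanded:
            lines.extend(expanded_tree(children, expanded, path))
    return lines


# One level per page: only the subcategories of the chosen category are shown.
print(level_at(TAXONOMY, ["Business"]))

# Expandable tree: several branches can be open at the same time.
print("\n".join(expanded_tree(TAXONOMY, {("Business",), ("Technology",)})))
```

In this sketch, a fly-out list amounts to showing `level_at(TAXONOMY, [])` permanently and overlaying `level_at(TAXONOMY, [parent])` for a single parent at a time, which is why only one parent category can be open at once.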
Tuesday, March 31, 2015
Varied Taxonomy Uses and Taxonomist Functions
Someone asked me recently if taxonomies were applicable to some marketing analytics he was pondering. I was not sure without further discussion. The interesting thing about taxonomies is that they have such varied uses. Perhaps because there is no single dominant use of taxonomies, taxonomists have to go into long explanations of how taxonomies are beneficial. There is no neat list of taxonomy uses. Following are some broad categories of taxonomy usage, all but the last of which I have worked on.
- A key component of a product of published information for retrieval (such as in a news, periodical article, or reference database)
- A (partial) solution to an information management problem of an organization
- A method to connect customers to products or services, typically on a website
- A method to connect users to information on a public information-sharing or networking website (monetized by advertising or other means)
- Descriptive metadata in a document management, content management, records management, or digital asset management system, to support tagging and the subsequent retrieval of internal content
- A method to model data, information, or knowledge to serve an organization’s knowledge management strategy
Sometimes more than one of the goals may be pursued simultaneously by the same owner of the taxonomy. This is when it gets complicated, and it needs to be carefully considered whether a single taxonomy or separate taxonomies would be best.
Building up a clear list of the applications of taxonomies, not something in marketing-speak, and more specific than the areas listed above, would be a worthwhile service of the websites of taxonomy consultants and taxonomist-related professional organizations.
Taxonomy consultants need to ask from the start whether the taxonomy project they are hired to work on will be primarily for internal or external access, and not make assumptions. It could be for both, but usually one purpose is seen as primary. Once, in my earlier days of consulting, I made an assumption, and my proposal for an “enterprise taxonomy” was even accepted by the client before I realized that their taxonomy would be primarily for public web content.
Varied taxonomist job functional areas
Just as taxonomies may have varied uses, so the functions of a taxonomist are varied. One interesting aspect of the taxonomy field, and taxonomy consulting in particular, is that it spans both internal (employee-facing) and external (customer- or public-facing) functions of an organization. I have personally found this a very interesting aspect of the profession.
Taxonomists who are employed may work in various departments of an organization. As such, taxonomists could find themselves part of either internally oriented groups (knowledge management, content management, information technology) or externally oriented groups (marketing and related web services). I have worked in the organizational departments of editorial, software product development, information technology (as it was overseeing the SharePoint implementation), and consulting services, all while in the role of a taxonomist. Additionally, I have seen taxonomist job postings in departments of marketing, ecommerce, communications, libraries, data governance, financial service operations, information management and technology, and the Information Management and Tech Writing department.
In any organization where one or more taxonomists are employed within a specific department, there are likely taxonomy-related needs in other departments. It would be beneficial to the organization if the taxonomists’ skills could be applied to special taxonomy-related projects outside their home department, such as across both marketing and information management.