Monday, November 26, 2012

E-Commerce Taxonomies

Happy Cyber-Monday! Coincidentally, this week, which is cyber-week for some retailers, I am giving a conference presentation, at Gilbane in Boston on November 29, on “Taxonomies for E-Commerce.”

As online shopping grows, the organization of products for sale on e-commerce websites becomes increasingly important, and there is also more standardization. Websites present the option to either search (used by customers who know what they want and what to call it) or browse (used by customers who are not sure about what they want or what to call it). For holiday gift shopping, browsing tends to be more common than usual, so displayed taxonomies take on particularly high visibility at this time.

For browsing, e-commerce websites typically organize their products into hierarchical categories, which are then narrowed by the use of facets. Top-level categories correspond to “departments” and could be as few as 2-3 for a specialty retailer or as many as 12-17 for a general/mass merchandise retailer. Usually the hierarchy extends one or two more levels deeper, although a very large retailer may find the need for an occasional fourth level.

At the lower levels of the hierarchy, the customer may then refine the set of products by use of facets (also known as attributes, filters, refinements, dimensions, “limit by,” or “narrow by”). The facets are for characteristics that cut across multiple categories. Facets may be for size, color, price range, material, brand, style, special features, and perhaps even customer rating. These facets will vary depending on the department or broader category type. The terms within a facet, known as “facet values” or “attribute values,” are usually in a flat list. The user selects a value from each of multiple facets in combination. In some cases, if check boxes are provided, the user is permitted to select more than one value from within the same facet.
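The refinement behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product records, the facet names, and the `refine` function are all hypothetical, and the sketch assumes that values selected within one facet are combined with OR (the check-box case) while separate facets are combined with AND.

```python
from typing import Dict, List, Set

# Hypothetical product records; names, categories, and facet values are invented.
PRODUCTS: List[Dict[str, str]] = [
    {"name": "Oxford shirt", "category": "Shirts", "color": "Blue", "size": "M"},
    {"name": "Flannel shirt", "category": "Shirts", "color": "Red", "size": "L"},
    {"name": "Polo shirt", "category": "Shirts", "color": "Blue", "size": "L"},
    {"name": "Chinos", "category": "Pants", "color": "Blue", "size": "M"},
]


def refine(
    products: List[Dict[str, str]],
    category: str,
    selections: Dict[str, Set[str]],
) -> List[Dict[str, str]]:
    """Return the products in `category` that match every selected facet.

    Values within one facet are OR-ed (check boxes); facets are AND-ed.
    """
    results = []
    for product in products:
        if product["category"] != category:
            continue
        if all(product.get(facet) in values for facet, values in selections.items()):
            results.append(product)
    return results


# Browse to "Shirts", then narrow by two colors and one size.
matches = refine(PRODUCTS, "Shirts", {"color": {"Blue", "Red"}, "size": {"L"}})
print([p["name"] for p in matches])
```

Note that the category is chosen first and the facets only narrow within it, which mirrors the browse-then-refine pattern rather than a purely faceted search.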

Typically retailers are more concerned about the selection and implementation of technology than about the design of the taxonomy. After all, a hierarchical taxonomy of products would appear simple to design, and even the facets are not too challenging to develop, especially with lots of competitor e-commerce websites to analyze and compare. However, my experience working as a taxonomy consultant on e-commerce taxonomies has led me to realize that creating and editing e-commerce taxonomies is not as easy as it seems.

My conference presentation discusses seven challenges:

1. Distinguishing a subcategory from a facet value
At the higher levels, categories are obvious. Standard facets (size, color, price range, etc.) are also obvious. But the distinction between the most specific subcategories and specialized facets can get blurred.  Can “type” be a facet? Is a “plaid shirt” a subcategory of shirts, or is plaid a value in a “pattern/type” facet? Are gas and electric stoves subcategories of stoves, or is “energy source” a facet of stoves? Factors to consider in making these decisions include user perceptions and the number of existing levels of subcategories and numbers of facets.

2. Different categorization options
There are often product categories that are difficult to classify. For example, do video games belong in “Toys and Games” or in “Electronics”? Does Home Theater belong in the “Television/Video” or the “Audio/Stereo” department? Having the category in both locations, using the polyhierarchy feature of a taxonomy, is possible. But a breadcrumb trail might follow only a single path, not both, and too many polyhierarchies can be confusing to users.
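One way to picture the breadcrumb issue is a small Python sketch in which a category may have several parents but only one is used for the trail. The category names, the `PARENTS` mapping, and the convention that the first listed parent is the primary one are all assumptions made for illustration, not a description of any particular e-commerce platform.

```python
from typing import Dict, List

# Hypothetical categories. Each maps to its parents; the first parent listed
# is treated as the primary one (a convention assumed for this sketch).
PARENTS: Dict[str, List[str]] = {
    "Electronics": [],
    "Toys and Games": [],
    "Video Games": ["Electronics", "Toys and Games"],  # polyhierarchy
    "Game Consoles": ["Video Games"],
}


def breadcrumb(category: str) -> List[str]:
    """Follow only primary parents so each category has a single trail."""
    trail = [category]
    while PARENTS[trail[-1]]:
        trail.append(PARENTS[trail[-1]][0])
    return list(reversed(trail))


def browse_locations(category: str) -> List[str]:
    """List every parent under which the category is displayed for browsing."""
    return list(PARENTS[category])


print(" > ".join(breadcrumb("Game Consoles")))
print(browse_locations("Video Games"))
```

A customer who reached “Video Games” through “Toys and Games” would still see the “Electronics” trail here, which is exactly the kind of inconsistency that makes heavy use of polyhierarchy confusing.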

3. Related items
E-commerce taxonomies are hierarchical and generally do not have associative/non-hierarchical relationships between categories. Such relationships are not needed in most cases, but accessories to products and related services (installation, repair, etc.) are clearly related to specific product categories. Taxonomic standards might have to be ignored if the only option is to make such categories narrower terms of their main product. But other, creative display options might be possible.

4. Sort order options
Generally a long list of terms, over a dozen, is easier to scan if alphabetized, whereas a short list of under a dozen terms is better suited to some other prescribed “logical” order. Sort order inconsistency will result, however, if the number of subcategories fluctuates. Determining the “logical” order is also a challenge and often centers around what is most important or popular.
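The rule of thumb above, and the inconsistency it can cause, can be made concrete with a short Python sketch. The cutoff of 12, the `display_order` function, and the sample terms are assumptions for illustration; the text says only “over a dozen” and “under a dozen,” so the handling of exactly twelve terms is a guess.

```python
from typing import List, Optional

ALPHABETIZE_ABOVE = 12  # "over a dozen"; the exact cutoff is an assumption


def display_order(terms: List[str], logical_order: Optional[List[str]] = None) -> List[str]:
    """Alphabetize long lists; keep a prescribed order for short ones."""
    if len(terms) > ALPHABETIZE_ABOVE or logical_order is None:
        return sorted(terms, key=str.casefold)
    rank = {term: i for i, term in enumerate(logical_order)}
    # Terms missing from the prescribed order go last, alphabetically.
    return sorted(terms, key=lambda t: (rank.get(t, len(rank)), t.casefold()))


# A short list keeps its prescribed "logical" order.
print(display_order(["Large", "Small", "Medium"], ["Small", "Medium", "Large"]))
```

Because the choice depends on the length of the list, adding or removing a single subcategory near the cutoff would flip the display from one ordering to the other.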

5. Competitor website comparisons
For e-commerce taxonomies (unlike enterprise taxonomies), it’s great to be able to compare with competitors. However, often a retailer is somewhat unique, and no single competitor has exactly the same product categories. Furthermore, it’s important to distinguish category and content comparison from design comparison. Design may be an extension of a retailer’s overall unique brand graphic design.

6. Web site vs. physical store organization
Physical (“brick and mortar”) stores have their own organization for products that might not work online, but there may be pressure to mimic physical store organization to provide a consistent user experience. While it may make sense to have the biggest sellers up front or at the top of the list, product size (a factor in physical store organization) should not necessarily be a factor in online organization.

7. Business needs vs. taxonomy best practices
Online merchants might want to make certain product categories more prominent, by changing the sort order, adding polyhierarchy locations, or even moving a subcategory up a level. It’s important to keep the integrity of the taxonomy intact, though, so that it remains intuitive for the customers to use.

In sum, product taxonomies are not as simple to create as might be expected. Taxonomy design may be under constraints, and business needs can challenge taxonomy standards. Creative solutions may be needed, and customer perspectives need to be considered through creating personas and/or through user testing.

Thursday, November 1, 2012

From Taxonomies to Ontologies: Customized and Semantic Relationships


At this year’s Taxonomy Boot Camp conference, I was invited to present on the panel giving 5-minute “Pecha Kucha” lightning talks, for which this year’s theme was ontology. Just as there are different understandings and usages of “taxonomy,” so are there different understandings and usages of “ontology.” You can come to it from different angles. If you come to ontologies from the experience of taxonomies and the field of information management, then, most simply, an ontology is a more complex type of taxonomy that contains richer information.
  
In my brief presentation, “From Accidental Taxonomist to Accidental Ontologist,” I summed up the differences between taxonomies and ontologies as follows:
  1.     Relationships: Taxonomies have hierarchical relationships and sometimes a simple “related term” associative relationship, but ontologies have semantic relationships, which are custom-created.
  2.     Term Attributes: Taxonomies generally don’t have term attributes, but ontologies do.
  3.     Term Classes: Taxonomies generally don’t have classes for terms, unless you consider facets as classes, but ontologies do.
  4.     Guidelines/Standards: Taxonomies should follow the ANSI/NISO Z39.19 (2005) or ISO 25964, whereas ontologies are expected to follow the Web Ontology Language (OWL) guidelines and make use of the Resource Description Framework (RDF).
  5.     Purposes: Taxonomies  support indexing/tagging, categorization, and/or classification of content, and in turn information findability and retrieval. The primary purpose of an ontology is to describe a domain of knowledge, and support of indexing/tagging, categorization, classification, findability, and retrieval can be secondary.
  6.     Tools: Some software supports the creation of only taxonomies, some software is for ontologies, and some software can do both quite well.  Additionally, some taxonomy/thesaurus software can support most, if not all, features of ontologies.
Coming at ontologies from taxonomies, the biggest distinguishing feature of ontologies is the semantic nature of the relationships.

In a taxonomy or thesaurus, you may have generic relationships, such as:

     Automobile industry  RT (related term) Cars, and
     Cars RT (related term) Automobile industry

     Ford Motor Company NT (narrower term) Lincoln Division, and
     Lincoln Division BT (broader term) Ford Motor Company


In an ontology, you may have customized, semantic relationships, such as:

     Automobile industry MAN (manufactures) Cars,  and
     Cars IND (manufactured by the industry) Automobile industry

     Ford Motor Company SUB (has subsidiary or division) Lincoln Division,  and
     Lincoln Division PAR (has parent) Ford Motor Company
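The reciprocal pairs in both sets of examples above can be modeled as a small table of relationship codes and their inverses. The following Python sketch is only an illustration: the codes come from the examples, but the `INVERSE` table and the `with_reciprocals` function are hypothetical and do not reflect how any particular taxonomy or ontology tool stores relationships.

```python
from typing import Dict, List, Tuple

# Each relationship code maps to the code used in the opposite direction.
# The generic RT is its own inverse; the customized pairs are not.
INVERSE: Dict[str, str] = {
    "RT": "RT",
    "NT": "BT",
    "BT": "NT",
    "MAN": "IND",
    "IND": "MAN",
    "SUB": "PAR",
    "PAR": "SUB",
}

Triple = Tuple[str, str, str]  # (term, relationship code, term)


def with_reciprocals(asserted: List[Triple]) -> List[Triple]:
    """Add the reciprocal of every asserted relationship, without duplicates."""
    out: List[Triple] = []
    for subject, relation, obj in asserted:
        for triple in ((subject, relation, obj), (obj, INVERSE[relation], subject)):
            if triple not in out:
                out.append(triple)
    return out


triples = with_reciprocals([
    ("Automobile industry", "MAN", "Cars"),
    ("Ford Motor Company", "SUB", "Lincoln Division"),
])
for subject, relation, obj in triples:
    print(subject, relation, obj)
```

The only difference between the generic and the semantic cases in this sketch is whether a code maps to itself or to a distinct partner, which is the directional distinction the examples illustrate.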


If you can customize the relationships, does this change a taxonomy into an ontology? No. Customized relationships are just one feature of an ontology, although perhaps the most important feature. In my online course on taxonomies, although I don’t teach how to create ontologies, I do provide a lesson on customized/semantic relationships. It is often desirable to create a more complex taxonomy without necessarily meeting all the requirements of an ontology.

Furthermore, a customized relationship might not be fully semantic. In the example above, the second set of relationships are customized, because they are designated by the ontologist for the particular case. The relationships are also “semantic” because they contain specific meaning. (Semantic means “has meaning.”) It is possible to customize relationships while still not making them fully semantic. You may decide to simply rename the standard relationships for your particular application and audience. For example, you might rename broader term (BT)/narrower term (NT) as “parent/child,” or rename Related Term as “see also.” If your taxonomy/thesaurus software is more sophisticated, it will allow you to specify any number of customized relationships, and thus you can add more nuances of meaning.

A key component of truly semantic relationships as expected in ontologies is the ability to create directional relationships that are distinct in each direction, with reciprocity. Most of these semantic relationships will be variants of “related term” (RT), rather than variants of the hierarchical relationship. The generic RT relationship, however, is singularly bidirectional. If you simply customized it by renaming it, it would have to be the same in both directions, such as “has partner.” To create a semantic relationship pair, such as MAN (manufactures) and IND (manufactured by the industry), you need a tool that supports ontological relationships and not just “customized” relationships.

If your tool supports customized relationships but not the ability to create distinct pairs of directional relationships that are associative rather than hierarchical, the results can still be very useful. You may have a “near ontology” if not a strictly defined ontology. For example, you could rename the singular “related term” (RT) as “Manufacturer-Product” with an abbreviation such as MAN-PRO. (Credit to Alice Redmond-Neal of Access Innovations, Inc. for the example.) Thus, the relationship is the same in either direction:

     Automobile industry MAN-PRO Cars,  and
     Cars MAN-PRO Automobile industry


It is not completely semantic, with the directional details missing, but this may be good enough for your purposes. After all, it should be obvious which is the manufacturer and which is the product. Therefore,  taxonomy/thesaurus software that provides most, if not all, features of an ontology may be sufficient, too.

What matters is serving your needs. Rather than calling it an “ontology” when it does not meet all the definitions of an ontology (and causing confusion or disagreement), it may be safer to say your sophisticated taxonomy “has features of an ontology.”