Thursday, December 29, 2011

From Folders to Facets

A recent taxonomy project I completed involved creating a new taxonomy for a financial services client that was migrating its internal content from shared drive folders to a SharePoint-based intranet, which also included automated indexing and a search engine (FAST). The new taxonomy will help support the search functionality, and taxonomy terms will also display in the left-hand margin (called the Refinement Panel), so that users can refine/narrow their initial search results by selecting terms from several attributes/filters/facets. The client had already started on a taxonomy by the time I became involved. Not surprisingly, the client-created taxonomy followed the structure of the existing folder names quite closely. After all, the folder structure was their only reference point. It became apparent that a taxonomy for folders and a taxonomy for facets, even for the same content, should be designed quite differently.

A hierarchy of nested folders has the following characteristics:
  1. It is designed to gather and group similar documents together.
  2. It is usually designed and created by a person who is uploading/storing documents with the frame of mind of “where can I put these so that I might find them later.”
  3. A document can go into only one folder and thus under only one category.
  4. A folder can be located within only one parent folder.
  5. The hierarchy of nested folders thus may become quite deep, such as six or seven levels.
  6. Folder names at deeper levels can become long and complex to describe a combination of criteria (a taxonomy design characteristic called pre-coordination).
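
As a rough illustration of these constraints, here is a minimal Python sketch. The file names and folder paths are hypothetical, invented only for this example, and are not taken from the client's shared drive. Each document maps to exactly one folder, each folder has exactly one parent, and the deepest folder name carries several criteria at once (pre-coordination).

```python
from pathlib import PurePosixPath

# Hypothetical documents: each one lives in exactly one folder.
document_folder = {
    "q3_audit_report.docx": PurePosixPath(
        "Finance/Audits/2011/Q3/Internal Audit Reports Final"
    ),
    "travel_policy.pdf": PurePosixPath("HR/Policies/Travel"),
}


def depth(folder: PurePosixPath) -> int:
    """Number of nested levels in a folder path."""
    return len(folder.parts)


def parent_name(folder: PurePosixPath) -> str:
    """A folder has a single parent, so there is only one answer."""
    return folder.parent.name


audit_folder = document_folder["q3_audit_report.docx"]
print(depth(audit_folder))        # levels of nesting for this document
print(parent_name(audit_folder))  # the one and only parent folder
print(audit_folder.name)          # a pre-coordinated, multi-criteria name
```

The last folder name combines department, document type, and status in a single label, which is what makes deep folder names long and hard to reuse.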

A faceted taxonomy for search refinement has the following characteristics:
  1. It is designed to refine and narrow a search by specific criteria.
  2. It is designed to help all members of an enterprise find documents, including documents uploaded by different people in different departments.
  3. A document can be assigned multiple taxonomy terms, even terms from within the same facet/broad category.
  4. A taxonomy term may display “under” more than one parent taxonomy term, as long as it is a logical hierarchy. (This feature is called “polyhierarchy.”)
  5. The displayed hierarchy of terms is not so deep, usually only three levels.
  6. Taxonomy term names stay simple, since they are intended to be used in combination (a taxonomy design characteristic known as post-coordination).
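
The faceted characteristics above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not how SharePoint or FAST implements refinement: the facet names (`document_type`, `topic`, `year`), the terms, the documents, and the `refine` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    # Facet name -> set of simple terms; several terms per facet are allowed.
    terms: dict[str, set[str]] = field(default_factory=dict)


def refine(documents, **selected):
    """Keep documents that carry every selected facet term (post-coordination)."""
    return [
        doc
        for doc in documents
        if all(term in doc.terms.get(facet, set()) for facet, term in selected.items())
    ]


# Hypothetical documents tagged with simple terms from several facets.
docs = [
    Document(
        "q3_audit_report.docx",
        {"document_type": {"Report"}, "topic": {"Audit", "Compliance"}, "year": {"2011"}},
    ),
    Document("travel_policy.pdf", {"document_type": {"Policy"}, "topic": {"Travel"}}),
    Document("audit_policy.pdf", {"document_type": {"Policy"}, "topic": {"Audit"}}),
]

# Polyhierarchy: one term may display under more than one broader term.
broader_terms = {"Audit": {"Finance", "Risk Management"}}

# Selecting one term, then a second, narrows the result set step by step.
by_topic = refine(docs, topic="Audit")
by_topic_and_type = refine(docs, topic="Audit", document_type="Policy")
print([d.name for d in by_topic])
print([d.name for d in by_topic_and_type])
```

Because each term stays simple, the combination "Audit" plus "Policy" is formed by the user at search time rather than being baked into a single long folder name.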

With this many differences between hierarchical folders and refinement facets, it’s inevitable that the taxonomy for each will differ, even if the content/documents and the users remain the same. Actually, a nested folder structure may or may not even constitute a “taxonomy.” It depends on whether the folder system was designed with a consistent structure and folder names or whether it just grew ad hoc.

A year and a half ago I was involved with a similar taxonomy project for the wind energy company First Wind. In addition to designing a faceted taxonomy for the Refinement Panel to support search in SharePoint, I was also tasked with improving the nested folder structure and folder names already in use in SharePoint, which were not going to go away. I remember being asked then whether I could just create a single taxonomy for both purposes. The answer was no, not entirely. There would be overlap, but there would also be differences. To the stakeholders, that seemed like a lot of additional work, but to me, the taxonomist, that’s simply the nature of my work, and I enjoy the diversity of building different kinds of taxonomies. In the end, more work put in by the taxonomist means less work needed by the users.