Thursday, January 19, 2012

Taxonomy Merging or Mapping

Yesterday I gave a webinar presentation for members of Taxonomy Division of the SLA professional association, entitled “Taxonomy Updating, Combining, and Translating.” It was not the first time I presented on these topics and on the topic of taxonomy combining (mapping and merging) in particular. What was different this time is that I am currently involved in a project that involves taxonomy merging.  But since I’m right in the middle of the project, I had not had the time to reflect on it and include takeaways from this project in yesterday’s presentation. Now I shall.

Merging and mapping are not the same thing. Merging brings together two taxonomies on the same subject, eliminating duplicate terms, supplementing each other with terms from one or the other taxonomy. The end result is a new and improved taxonomy taking the best of both of the legacy taxonomies. Mapping matches one taxonomy against another, so that terms in one taxonomy may be used for terms in another, such as a user interface taxonomy matching to another taxonomy that had been used to index the content. The end result is that one taxonomy can now retrieve more content.

The project I am involved in was described by the client as “mapping,” but then it became apparent that it was really the merging of taxonomies, not mapping. A second “mapping” component of the project turned out to be more about matching taxonomy to content. While in general it is good practice for the consultant to continue use the client’s own terminology, referring to “merging” as “mapping” was initially confusing. The distinction is important when it comes to activity of term-by-term comparisons, which is done in both merging and mapping.

Equivalent concepts, whether with the same terms or slightly different terms (such as Cars and Automobiles), are easily mapped or merged. So, the distinction between merging and mapping is not that important. In the case of merging you just have to decide which term to use as the preferred term.

The main difference in merging and mapping is how to deal with cases of a term in one taxonomy having a broader meaning that includes a term in the other taxonomy which lacks an equivalent in the first taxonomy. For example, Taxonomy A has the term “Precipitation,” and Taxonomy B has the terms “Snow” and “Rain” but does not have the term “Precipitation.” When you merge taxonomies, you take both terms and establish a hierarchical relationship between them so the merged taxonomy will have Precipitation and two narrower terms of Snow and Rain. As for mapping the taxonomies, you first need to know in which direction you are mapping. If you map from Taxonomy A to Taxonomy B, you cannot map these terms together. You cannot use Snow for Precipitation and you cannot use Rain for Precipitation, because the latter is broader in its meaning. (You could map Precipitation to Weather, if it existed in Taxonomy B, whereas Snow and Rain are left unmapped.)  If you map from Taxonomy B to Taxonomy A, then these terms do map. Precipitation can be used for Snow and for Rain, since it includes both in its meaning.

Mapping needs to be precise, because it is relied upon to retrieve pre-indexed content. Merging, on the other hand does not need to be so precise, as a new taxonomy is the result. Therefore, initially mistaking a merging project for a mapping project, as I first did in this case, is not as bad as mistaking a mapping project for a merging project. I don’t think the latter, is likely, though.

Of course, you may have a project that involves merging two taxonomies and then mapping the resulting new taxonomy back to each of the two legacy taxonomies. This is actually not as much work as two consecutive projects. Adding an additional column in a spreadsheet, you can track merging and mapping at the same time. In fact, my current project involves that.