Saturday, January 30, 2016

Polyhierarchy in the SharePoint Term Store



Last year I had the opportunity to create some taxonomy in the SharePoint Term Store (also called Managed Metadata), and while I am pleased that hierarchical taxonomies are supported in this widely used platform, I had some concerns about the support of polyhierarchy, as information about this capability is inconsistent. So I experimented further. 

Polyhierarchy means a taxonomy term has more than one broader term or parent term. In a traditional hierarchical taxonomy structure, a term has one broader term (unless it is the top term, in which case it has no broader term) and may have multiple narrower terms. Occasionally, though, the logic of the hierarchy and the practical need to guide users down different possible paths make it beneficial to give a term two or more broader terms. It may appear to the user that the term is duplicated in different locations in the taxonomy, but this duplication is in appearance only, because it is the same term and thus linked/indexed to the same content, no matter which broader term path the user clicked down through.

An example would be the term Financial report, which is shown in the Figure 1 screenshot from the SharePoint Term Store.
Fig. 1 Financial report as a narrower term to the term Financial documents.

It would be practical to have a broader term of Financial documents and another broader term of Reports. Some users will look for the term under Financial documents, and other users will look for it under Reports.
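To make the idea concrete, here is a minimal Python sketch of a polyhierarchy as a data structure. It is only an illustration and has nothing to do with how SharePoint stores terms; the Term class and the add_narrower helper are names I made up for this example. The point is that one term object, with one identifier, is linked under two broader terms rather than copied.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass(eq=False)
class Term:
    """Hypothetical taxonomy term: one identity, any number of broader terms."""
    name: str
    term_id: str = field(default_factory=lambda: str(uuid4()))
    broader: list["Term"] = field(default_factory=list)
    narrower: list["Term"] = field(default_factory=list)


def add_narrower(parent: Term, child: Term) -> None:
    """Link an existing term under a parent without copying it."""
    parent.narrower.append(child)
    child.broader.append(parent)


financial_documents = Term("Financial documents")
reports = Term("Reports")
financial_report = Term("Financial report")

# The same term is placed under two broader terms.
add_narrower(financial_documents, financial_report)
add_narrower(reports, financial_report)

# Whichever path a user clicks down, they reach the same object and the same ID.
via_documents = financial_documents.narrower[0]
via_reports = reports.narrower[0]
print(via_documents is via_reports)
print([t.name for t in financial_report.broader])
```

Because both paths lead to one term_id, anything tagged with Financial report is found regardless of the route taken.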

The SharePoint 2010 or 2013 Term Store claims to support the creation of polyhierarchy, but it has significant limitations.

Polyhierarchy permitted only across different hierarchies

 

The support of polyhierarchy in the SharePoint Term Store takes the notion of “polyhierarchy” too literally by insisting that the two broader terms of a term in a polyhierarchy actually belong to different hierarchies. This means that a polyhierarchy can only be created across different Term Sets in SharePoint. A Term Set is a hierarchy or a facet with a single top term. It is prohibited to create a polyhierarchy within the same Term Set. This is quite problematic, because I find that the vast majority of the time that I want to create a polyhierarchy, it is within the same top-level hierarchy or facet.

In the example of Financial report, it is logical to have two broader terms of Financial documents and Reports. Both of these broader terms, however, are within the same Term Set or facet, which I might call Document type, so the SharePoint Term Store will not permit this polyhierarchy. Having the term Financial documents appear under a second broader term within any other Term Set or facet, on the other hand, such as the Department or Location facet, is permitted by SharePoint, but this would not be a correct hierarchical structure by taxonomy standards. 

Only one method to create polyhierarchy

 

In the SharePoint Term Store, you cannot create a broader term relationship; you can create only narrower term relationships. Thus, you can only create hierarchies from the top down. The normal way to create a polyhierarchy is to add a second broader term relationship to a term, but this is not possible in SharePoint. Instead, the same term has to be made a narrower term of a second term.

So, if you have the term Financial report as narrower to Financial documents, and you want to make Reports also a broader term (and Reports exists in another Term Set), you might go to the second term that will be the new broader term (Reports), click on Create Term, and type in the name of the existing term (Financial report). SharePoint, however, does not enforce taxonomy standards and permits you to create a new term with the same name as another term (Financial report), but it will not be the same term. You can see at the bottom of the General information pane that the duplicate Financial report term’s unique identifier is different from that of the original Financial report term, as shown in Figure 2.

Fig. 2 General Information for a selected term


This matters, because terms are used for indexing/tagging. The term with one ID in one location may be indexed to some of the content, and the term with the other ID in the other location will be indexed to other content, and neither term will be indexed to all the content. This would be bad for retrieval. So, this method should not be used to create polyhierarchy.
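The retrieval problem can be sketched in a few lines of Python. This is a toy index, not SharePoint's search; the two IDs, the document names, and the build_index helper are all hypothetical and chosen only to show how content gets split between two same-named terms.

```python
from collections import defaultdict

# Hypothetical IDs standing in for the Term Store's unique identifiers.
ORIGINAL_ID = "id-under-financial-documents"
DUPLICATE_ID = "id-under-reports"


def build_index(tags):
    """Map each term ID to the set of documents tagged with it."""
    index = defaultdict(set)
    for document, term_id in tags:
        index[term_id].add(document)
    return index


# Taggers picked whichever "Financial report" they happened to find.
split_tags = [
    ("q1-results.docx", ORIGINAL_ID),
    ("q2-results.docx", DUPLICATE_ID),
    ("annual-summary.docx", ORIGINAL_ID),
]
split_index = build_index(split_tags)

# With a genuinely reused term there is only one ID, whichever path was used.
reused_tags = [(document, ORIGINAL_ID) for document, _ in split_tags]
reused_index = build_index(reused_tags)

print(sorted(split_index[ORIGINAL_ID]))   # only part of the content
print(sorted(split_index[DUPLICATE_ID]))  # the rest of the content
print(sorted(reused_index[ORIGINAL_ID]))  # everything, under one ID
```

In the split case, a search on either term returns an incomplete set; in the reused case, one lookup returns all three documents.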

To create polyhierarchy in SharePoint, go to the second term that is intended to be the additional broader term (Reports), click on Create Term, and start typing the name of the existing term (Financial report). At the bottom of the screen you will see “Suggestions”: type-ahead matches, highlighted in yellow, to existing terms in another Term Set or even another taxonomy group. If you select one of these suggested terms, instead of simply typing out the full name, then you will indeed be creating a polyhierarchy. After doing so, you will notice that the tag icon preceding the term becomes the “reused tag” icon, as shown in Figure 3, in both locations: under the new broader term and under the existing broader term. You will also notice, when you select the term and view its General details, that the box under Member Of shows that the term is a member of both hierarchies.
Fig. 3 Reused tag example for the term Marketing


Importing a taxonomy with polyhierarchy

 

If you import an externally created taxonomy in CSV format as a Term Set via the Term Store’s import feature and that taxonomy has polyhierarchy, the Term Store will not recognize the polyhierarchy, but rather will treat the polyhierarchy terms as distinct terms with duplicate names, assigning them unique IDs. Thus, they could be used inconsistently in indexing/tagging. Therefore, you should ensure that imported CSV taxonomies do not have any polyhierarchy.
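One way to check a file before import is to look for term names that appear under more than one parent path. The Python sketch below does this under an assumption: that the CSV uses "Level 1 Term" through "Level 7 Term" columns, one row per term, as in the Term Store's sample import file. Verify the column names against your own file. The find_repeated_names function and the sample data are mine, for illustration only.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; check them against your own import file.
LEVEL_COLUMNS = [f"Level {n} Term" for n in range(1, 8)]


def find_repeated_names(csv_text):
    """Return term names that occur under more than one parent path."""
    parents_by_name = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Collect the filled-in level cells; the last one is the term itself.
        path = [row[c].strip() for c in LEVEL_COLUMNS if (row.get(c) or "").strip()]
        if not path:
            continue
        parents_by_name[path[-1]].add(tuple(path[:-1]))
    return {
        name: sorted(parents)
        for name, parents in parents_by_name.items()
        if len(parents) > 1
    }


# Made-up sample with Financial report under two broader terms.
SAMPLE = """\
Term Set Name,Term Set Description,LCID,Available for Tagging,Term Description,Level 1 Term,Level 2 Term,Level 3 Term
Document type,,,TRUE,,Financial documents,,
,,,TRUE,,Financial documents,Financial report,
,,,TRUE,,Reports,,
,,,TRUE,,Reports,Financial report,
"""

repeated = find_repeated_names(SAMPLE)
for name, parents in repeated.items():
    print(name, "appears under", [" > ".join(p) for p in parents])
```

Any name this reports would be imported as separate terms with separate IDs, so it needs to be resolved to a single location (or renamed) before import.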

If you import a taxonomy created in an external taxonomy/thesaurus/ontology management system which permits polyhierarchy, and that software has a feature or connector to import to the SharePoint Term Store, there are different methods of dealing with the polyhierarchy issue. The default of some software, such as Semaphore Ontology Editor and TopBraid Enterprise Vocabulary Net, is to retain only one of a term's multiple broader term relationships upon export. For example, in Semaphore, the first hierarchical relationship encountered for a term is retained and any others are not, but the user gets an alert. Wordmap also provides a validation error if there is a polyhierarchy for import into the same Term Set. Rather than retaining an arbitrary one of several broader term relationships, Synaptica strips out all broader term relationships if a term has more than one, and then the former polyhierarchy terms show up on the orphan term list for review. In some software, such as TopBraid EVN, the user can define quality/validation rules that would identify polyhierarchy, so the user can remove any before importing into SharePoint. Other software vendors, such as Data Harmony and PoolParty, say they have work-arounds for the SharePoint import to sort of support polyhierarchy, but I have not tested these.
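The two export approaches described above, keeping the first broader term versus stripping all of them, can be compared in a short Python sketch. This is my own simplification of the behavior as described, not code from any of these vendors; the function names and the sample relations are hypothetical.

```python
def keep_first_broader(relations):
    """Keep the first broader term seen for each term; report the rest."""
    kept, dropped = {}, []
    for term, broader in relations:
        if term in kept:
            dropped.append((term, broader))  # would trigger an alert to the user
        else:
            kept[term] = broader
    return kept, dropped


def strip_all_if_multiple(relations):
    """Remove every broader term from any term that has more than one."""
    grouped = {}
    for term, broader in relations:
        grouped.setdefault(term, []).append(broader)
    kept = {t: b[0] for t, b in grouped.items() if len(b) == 1}
    orphans = sorted(t for t, b in grouped.items() if len(b) > 1)
    return kept, orphans


# (term, broader term) pairs; Financial report has two broader terms.
relations = [
    ("Financial report", "Financial documents"),
    ("Financial report", "Reports"),
    ("Invoice", "Financial documents"),
]

first_kept, first_dropped = keep_first_broader(relations)
strip_kept, strip_orphans = strip_all_if_multiple(relations)

print(first_kept, first_dropped)
print(strip_kept, strip_orphans)
```

The first approach silently decides the term's single location by encounter order; the second leaves the decision to a human by putting the term on an orphan list.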

In conclusion, the Term Store’s support of polyhierarchy only across Term Sets (hierarchies or facets) is not very useful, since the majority of the time that we would want to create a polyhierarchy, it is within the same Term Set, especially if the Term Set is to be used as a facet. A term with the same name in more than one facet typically would have a slightly different meaning and usage.