Saturday, January 31, 2015

Taxonomy Software Trends



I reviewed various taxonomy/thesaurus management software offerings recently, in preparation for the last of my three-part webinar series, Practical Taxonomy Creation, and I noticed some trends since I last looked into software in such detail for my book over 5 years ago: more cloud/web-based software, more SKOS/RDF/Semantic Web framework software, and more plugins to SharePoint, content management systems, and search engines.

The number of commercial vendors selling taxonomy/thesaurus management software has not changed significantly: some have left the market, others have entered, and the rest have continued with updates and improvements. There are fewer commercial low-end, inexpensive, single-user desktop offerings, however. Products I have reviewed in the past that have since gone away include Webchoir TCS-10 Personal and Term Tree 2000. The Mac OS program Cognatrix has been unavailable for the past year, although the vendor intends to release it again as an Apple App Store program following the release of the next major version of Mac OS.

Subscription, web-based software


Synaptica pioneered web-based thesaurus management software when it introduced its product in 1995, when the Web was still young, but now other vendors also offer web-based subscription software. Data Harmony Thesaurus Master from Access Innovations was originally available only as a Java-based, multi-platform client-server installation. For the past several years a web version has also been available, and Access Innovations president Marjorie Hlava said in an email: “Increasingly our customers use the cloud version of the software.” Newer thesaurus management products have entered the market as solely cloud-based offerings. These include PoolParty, introduced by the Semantic Web Company in 2009, and TopBraid Enterprise Vocabulary Net (EVN), released by TopQuadrant in 2010. Meanwhile, Synaptica began offering Synaptica Express, a cloud-computing solution for individuals and smaller businesses. Finally, the long-standing, affordable mainstay MultiTes Pro, a Windows-based desktop program that has been available since 1983 in a single-user version and then also for multiple users, introduced a multi-user cloud version about six years ago, which in 2013 was updated and renamed MultiTes Online.

The cloud-based software offerings are, of course, priced on annual (and in one case, monthly) subscription fees, instead of one-time license costs with lower-priced updates. Hopefully this means that more organizations will try developing a taxonomy in an appropriate tool, since subscribing for a shorter time reduces the cost commitment.

SKOS/RDF/OWL Semantic Web framework software


Supporting linked data and interoperability with Semantic Web content has become more important. Therefore, World Wide Web Consortium (W3C) recommendations, such as the SKOS (Simple Knowledge Organization System) framework, RDF (Resource Description Framework) specifications, and OWL (Web Ontology Language), are being adopted by newer thesaurus/taxonomy software. The newer products, PoolParty and TopBraid EVN, are both built around SKOS models. Synaptica and Data Harmony Thesaurus Master have been able to export to a SKOS and OWL schema for a long time, but it was only in 2013 that Data Harmony added user-defined fields to the SKOS export to include all fields in a term record. Additionally, in 2011 Synaptica introduced an Ontology Publishing Suite to publish an ontology or part of an ontology to the Web.
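
To give a rough idea of what a SKOS export involves, here is a minimal sketch in Python that writes a few term records as SKOS concepts in Turtle syntax. It is not how any of the products named above implement their exports. The record fields (`label`, `alt_labels`, `broader`), the `slug` helper, and the `http://example.org/taxonomy/` namespace are all assumptions made for illustration; only the SKOS namespace and the property names `skos:prefLabel`, `skos:altLabel`, and `skos:broader` come from the W3C recommendation.

```python
# Illustrative only: record fields and the example.org namespace are assumptions.
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
BASE = "http://example.org/taxonomy/"  # placeholder namespace, not a real vocabulary


def slug(label):
    """Turn a term label into a simple local name, e.g. 'Search software' -> 'search-software'."""
    return label.strip().lower().replace(" ", "-")


def to_turtle(terms):
    """Serialize a list of term records as SKOS concepts in Turtle.

    Labels are assumed to contain no double quotes or backslashes;
    a real exporter would have to escape them.
    """
    lines = [SKOS_PREFIX, f"@prefix ex: <{BASE}> .", ""]
    for term in terms:
        label = term["label"]
        props = ["a skos:Concept", f'skos:prefLabel "{label}"@en']
        for alt in term.get("alt_labels", []):
            props.append(f'skos:altLabel "{alt}"@en')
        for broader in term.get("broader", []):
            props.append(f"skos:broader ex:{slug(broader)}")
        lines.append(f"ex:{slug(label)} " + " ;\n    ".join(props) + " .")
    return "\n".join(lines)


terms = [
    {"label": "Software"},
    {
        "label": "Taxonomy software",
        "alt_labels": ["Thesaurus software"],
        "broader": ["Software"],
    },
]
print(to_turtle(terms))
```

Adding user-defined fields to such an export, as Data Harmony did, would mean emitting additional properties per concept, which is why a tool's export options are worth checking before committing to it.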

My first criterion for thesaurus management software is that it enforces relationship rules in accordance with the ANSI/NISO Z39.19-2005 standard. SKOS is not an alternative standard, but rather a framework that can be followed in addition to ANSI/NISO Z39.19-2005. Ideally a software product complies with both, and some now do.
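
As a rough illustration of what enforcing relationship rules means, the Python sketch below checks a small in-memory thesaurus for three common conditions: every broader term (BT) has the reciprocal narrower term (NT), every related term (RT) is reciprocated, and no pair of terms is both hierarchically and associatively related. The data layout and the `check_relationships` function are hypothetical, and the sketch covers only a small part of what the standard describes. Dedicated software typically prevents such errors at entry time rather than reporting them afterward.

```python
def check_relationships(thesaurus):
    """Return a list of rule violations found in a thesaurus.

    `thesaurus` maps each term to a dict with optional "BT", "NT", and "RT"
    sets. This layout is an assumption made for illustration.
    """
    def rel(term, kind):
        return thesaurus.get(term, {}).get(kind, set())

    problems = []
    for term in sorted(thesaurus):
        # Hierarchical relationships must be reciprocal: BT <-> NT.
        for broader in sorted(rel(term, "BT")):
            if term not in rel(broader, "NT"):
                problems.append(f"{broader}: missing NT {term}")
        for narrower in sorted(rel(term, "NT")):
            if term not in rel(narrower, "BT"):
                problems.append(f"{narrower}: missing BT {term}")
        # Associative relationships must be symmetric and must not
        # duplicate a hierarchical relationship between the same terms.
        for related in sorted(rel(term, "RT")):
            if term not in rel(related, "RT"):
                problems.append(f"{related}: missing RT {term}")
            if related in rel(term, "BT") or related in rel(term, "NT"):
                problems.append(f"{term}: {related} is both hierarchical and associative")
    return problems


thesaurus = {
    "Software": {"NT": {"Taxonomy software"}},
    "Taxonomy software": {"BT": {"Software"}, "RT": {"Thesauri"}},
    "Thesauri": {},  # lacks the reciprocal RT back to "Taxonomy software"
}
print(check_relationships(thesaurus))
```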

Plug-ins and connectors for search and content management


The most common software for internal content management (even though it is not really a content management system) is SharePoint. Prior to 2010, SharePoint handled controlled vocabulary metadata in such a simple way (not even in hierarchies) that there was no point in trying to use taxonomies. Starting with SharePoint 2010, with its Managed Metadata Services, taxonomies can now be utilized in its Term Store. However, despite Term Store improvements from SharePoint 2010 to 2013, it is still far from having the features and capabilities of a dedicated thesaurus management software product. Thus, ideally you create the taxonomy in the dedicated tool and port it over to SharePoint, and now almost all enterprise-level thesaurus management software products have methods to connect to SharePoint, whether through APIs, plug-ins, or dedicated “connector” modules.
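
For readers without a connector, porting a hierarchy can be as simple as flattening it into a file with one column per level. The Python sketch below does this for a nested dictionary. The column layout is modeled on the comma-separated import file that the Term Store accepts for term sets, but treat the exact headers and values as assumptions to verify against Microsoft's documentation for your SharePoint version. The `flatten` and `to_import_csv` functions and the sample data are hypothetical.

```python
import csv
import io

# Assumed column layout, modeled on the Term Store's term set import file.
HEADER = [
    "Term Set Name", "Term Set Description", "LCID",
    "Available for Tagging", "Term Description",
] + [f"Level {n} Term" for n in range(1, 8)]


def flatten(tree, path=()):
    """Yield the full path of every term in a nested dict, parents before children."""
    for label, children in tree.items():
        current = path + (label,)
        yield current
        yield from flatten(children, current)


def to_import_csv(term_set_name, tree):
    """Write one row per term, with the term's full path spread across level columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(HEADER)
    for index, path in enumerate(flatten(tree)):
        if len(path) > 7:
            raise ValueError(f"More than 7 levels: {' > '.join(path)}")
        # Only the first term row carries the term set name.
        name = term_set_name if index == 0 else ""
        writer.writerow([name, "", "", "TRUE", ""] + list(path) + [""] * (7 - len(path)))
    return buffer.getvalue()


tree = {"Software": {"Taxonomy software": {}, "Search software": {}}}
print(to_import_csv("Content tools", tree))
```

A flat file like this carries only the hierarchy. Related-term relationships, scope notes, and other thesaurus features have no column to land in, which is one reason to keep the dedicated tool as the master copy.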

There are also increasing numbers of content management systems and search software products being supported by thesaurus management connections. For example, SmartLogic Semaphore Ontology Manager has integrations with a greater number of applications than in the past, including SharePoint, Google Search Appliance, Apache Solr, OpenText, MarkLogic, and IBM Watson. PoolParty has a WordPress plugin, in addition to integrations with SharePoint and Drupal. Surely more such connections will be added, as I have recently heard of requests for taxonomy imports into Drupal.

3 comments:

  1. Hi Heather,
    My company is looking at purchasing a TMS and is evaluating Mondeca, PoolParty, and Protege. Do you have any insight on how those three differ?
    They each meet our technical requirements, so advice from you would be helpful. Thanks!

  2. If they all meet your technical requirements, then you need to consider usability for your intended users. Protege is significantly more difficult to use than PoolParty. I don't have hands-on experience with Mondeca, though. Have your actual users try a trial version of the software and get their feedback. Mondeca does have the advantage of additional integrated products, such as one for automated indexing.

  3. Mondeca has a wonderful UI and superior model for multi-language implementations. It is natively RDF-compatible and is very API friendly. I would add to Heather's comments about Protege - it is probably not appropriate for a non-technical audience.
