Showing posts with label Translations. Show all posts

Friday, November 27, 2015

Taxonomies and Terminologies

The specialties of taxonomy management and terminology management have different histories and serve different purposes, but they are in fact closely related, and taxonomies and terminologies can be linked to share knowledge. At the annual Taxonomy Boot Camp conference in Washington, DC, earlier this month I met a terminologist attendee from Germany (Beate Früh of Büro b3), who explained to me that the fields are quite similar, which is why she was attending a taxonomy conference. Also at the conference I met a representative of a new software vendor (Jochen Hummel, CEO of Coreon), whose product provides both taxonomy and terminology management.

As with the field of taxonomies and taxonomy management, there are varying definitions of terminologies and terminology management.  The original meanings of both taxonomy and terminology are as fields of study, with taxonomy being the study of naming and classifying and terminology being the study of terms and their use. More commonly though, we refer to taxonomies and terminologies as sets of terms or concepts for a particular subject area or purpose.

Definitions of terminology include “technical or special terms used in a business, art, science, or special subject” (www.merriam-webster.com), and a “set of designations belonging to one special language” (ISO 1087-1:2000, 3.5.1), with “each designation representing a concept” (ISO 25964-2:2013). According to the International Information Centre for Terminology (Infoterm): “The systematic organization and definition of concepts is called terminology management – which also includes classification.” (T.E.R.M.I.N.O.L.O.G.Y. PDF)

Differences


There are several differences between taxonomies and terminologies. The most obvious difference is that taxonomies have hierarchical relationships between the terms/concepts so as to create an overall hierarchical structure, and terminologies generally do not. Another difference is that terminologies contain more detailed terms than are found in a taxonomy for a comparable subject area. Furthermore, while taxonomies are limited to nouns and noun phrases (including verbal nouns), terminologies may contain some specific adjectives. Terminologies generally include definitions for every term, which is not so typical for taxonomies. Many terminologies are used to support foreign language translation, so there are usually foreign language equivalents for every term, something found in only a small minority of taxonomies. In general, there is more data for a term in a terminology than in a taxonomy.

The most significant difference between taxonomies and terminologies is how they are used. Taxonomies serve information retrieval, through a combination of indexing/tagging use and browsing/navigation and/or search support. Rather than serve information retrieval, the main purposes of terminologies are to support standard use of terms, especially technical terms, with agreed-upon meaning for creating technical documentation and for foreign language translations. Translation has historically been the field of greatest use of terminologies. As such, many terminologists have a background in translation or linguistics. The co-authors of a leading book in the field of terminology, Handbook of Terminology Management, are both professors of translation.

Another difference is in regional use. Taxonomies are especially widely used in the United States and other English-speaking countries, while growing elsewhere too, whereas terminologies are more widely used in Europe and bilingual countries such as Canada. Member organizations of Infoterm, the independent international association focused on terminology, include numerous organizations in Europe, a few in each of Africa, Asia, Latin America, and Canada, but there are no organizations in the United States.

Finally, there are a greater number of standards for terminologies. There are a large number of currently published standards of ISO committee 37 for Terminology and Other Language and Content Resources, including five standards of the Principles and Methods subcommittee, 14 of the Terminographical and Lexicographical Working Methods subcommittee, and five standards of the Systems to Manage Terminology, Knowledge and Content subcommittee, including ISO 30042:2008 TermBase eXchange (TBX). For taxonomies, on the other hand, standards are fewer, or, if considering specifically taxonomies, there actually are no standards, as the most relevant standards are for thesauri (ISO 25964 or ANSI/NISO Z39.19), ontologies (OWL, based on RDF), or more broadly web-based knowledge organization systems (SKOS).

Similarities


Despite their differences, taxonomies and terminologies are both kinds of vocabularies or controlled vocabularies (depending on how “controlled vocabulary” is defined, the topic of my next blog post). The international standard ISO 25964 Thesauri and interoperability with other vocabularies (part 1 in 2011 and part 2 in 2013) discusses the following “other” vocabularies (as listed in its table of contents): classification schemes, taxonomies, subject heading schemes, ontologies, terminologies, name authority lists, and synonym rings. Thus, terminologies are listed right along with taxonomies and ontologies. The United States standard ANSI/NISO Z39.19-2005 Guidelines for the Construction, Format and Management of Monolingual Controlled Vocabularies, however, does not include terminologies in its more limited scope: “Controlled vocabularies covered by this Standard include lists of controlled terms, synonym rings, taxonomies, and thesauri.” (Section 2 Scope).

The most important similarity is that both taxonomies and terminologies refer to terms and unique concepts and not to mere words. As such, they often include and bring together synonyms or other variants to disambiguate concepts. While terminologies don’t characteristically have relationships between terms, they sometimes do.

Linkages


Due to these similarities, it is quite feasible to have connections, links, mappings, etc., between terms in a taxonomy and in a terminology. Taxonomies and terminologies for internal content within the same organization will have a lot of overlap, so it makes sense to leverage the same knowledge bases and either reuse the same terms in taxonomies and terminologies or at least link/map the equivalencies, both to save effort and to ensure consistency of understanding across an organization. ISO 25964-2 Thesauri and interoperability with other vocabularies includes a section on guidelines for the interoperability between thesauri (and, by extension, taxonomies) and terminologies:
  • Concepts may be mapped between a thesaurus and a terminology, and should follow the same methods and best practices as mapping between two thesauri (22.3.2)
  • Terminologies are useful as sources for concepts or terms when building or maintaining a thesaurus. They can also be referred to when writing scope notes. (22.3.3)
  • A search thesaurus or synonym ring may be built using a combination of a thesaurus and a terminology. (22.3.4)
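
As a rough illustration of the last point, a synonym ring can be assembled by pooling the labels of concepts that have been mapped as equivalent between a taxonomy and a terminology. The sketch below is only one possible approach: the Concept class, the build_synonym_rings function, the identifiers, and the sample terms are all hypothetical and are not taken from ISO 25964 or from any particular software product.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept with one preferred label and any number of variant labels."""
    concept_id: str
    preferred: str
    variants: list = field(default_factory=list)


def build_synonym_rings(taxonomy, terminology, mappings):
    """Pool the labels of each mapped pair of concepts into one synonym ring.

    taxonomy and terminology are dicts of concept ID -> Concept.
    mappings is a list of (taxonomy_id, terminology_id) pairs that an editor
    has judged to be equivalent (an assumption of this sketch).
    """
    rings = {}
    for tax_id, term_id in mappings:
        tax = taxonomy[tax_id]
        term = terminology[term_id]
        labels = [tax.preferred, *tax.variants, term.preferred, *term.variants]
        seen = set()
        ring = []
        for label in labels:
            key = label.casefold()  # compare labels case-insensitively
            if key not in seen:
                seen.add(key)
                ring.append(label)
        rings[tax_id] = ring
    return rings


# Hypothetical sample data, for illustration only.
taxonomy = {"T1": Concept("T1", "Solar panels", ["Photovoltaic panels"])}
terminology = {
    "E7": Concept("E7", "photovoltaic module", ["PV module", "solar panels"])
}
rings = build_synonym_rings(taxonomy, terminology, [("T1", "E7")])
print(rings["T1"])
```

Here the terminology contributes the more technical variants ("photovoltaic module", "PV module"), while the duplicate "solar panels" is dropped because it differs from the taxonomy label only in capitalization.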

Hopefully, more organizations will develop both taxonomies and terminologies where they are lacking and also build connections between the two.

Find out more about terminologies


Monday, November 28, 2011

Multilingual Taxonomies


We know that taxonomies help information-seekers browse or search for desired documents/information. Taxonomies provide the bridge between the user’s choice of words and the wording within the desired documents. But what if the user actually speaks a different language than that of the content? Documents can be translated (automatically if it’s just to get the general meaning or by human translators when accuracy is important), but that’s only done after the document is found. To support the findability of foreign language documents, what is needed is a bilingual or multilingual taxonomy (“bilingual” meaning in two languages, and “multilingual” meaning in three or more languages).

This Thursday, December 1, I will be presenting on the topic of multilingual taxonomies at the Gilbane Conference in Boston, where the focus is web and enterprise content management. This session, which will be shared with the co-speaker Ross Lehrer of WAND, appears to be the only one in the conference dedicated to taxonomies and the only presentation with the word “multilingual” in its name. The topic will be of interest both to those concerned with multilingual content but with no experience with taxonomies and to those with an interest in taxonomies but no experience with multilingual content.

The description of the session (which I did not write) on the conference website says: “Multilingual content dramatically expands the potential market for your products, and multilingual taxonomies often need to be part of your multilingual strategy.” This description applies better to my colleague’s presentation, especially since the taxonomies that his company builds are product taxonomies. My presentation, on the other hand, addresses taxonomies for more than just websites of products, such as taxonomies for retrieving articles written in different languages.

The issue is whether the multilingual content is created and managed internally or externally to your organization. If your multilingual content is what your organization creates, such as additional language versions of a public website for a global market, then it is likely that the content in the different languages is managed internally but separately, by separate language teams. The content is similar but not identical in each language, and the taxonomies that support search and browse may also be created and managed separately. Having taxonomies in different languages, however, is not exactly the same as a “multilingual taxonomy.”

A good analogy would be a translated book. The book’s index should not simply be translated; rather a new index is created by an indexer, who is a native-language speaker of the translated language, based on the newly translated text. Consulting the original language index is fine, but directly translating it will have less than ideal results. Similarly, if you have a website translated into another language, and the website has a taxonomy for browsing for specific content pages, that taxonomy should not simply be translated, but rather a new second-language taxonomy should be created, consulting the first taxonomy, of course.

By contrast, a truly multilingual taxonomy connects users who speak one language to content that is in another language. There needs to be a one-to-one correspondence between terms across both languages, and the different language versions need to be managed together. It’s somewhat complicated to design and create, but software tools are available for this, and the result is a powerful aid to searching and browsing across languages. What is important is to match your multilingual taxonomy design to the specific goals, either (1) service in different language markets, each with their own language content; or (2) users being able to access content in a language which they don’t speak.
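
To make the idea of one-to-one correspondence more concrete, here is a minimal sketch of how a truly multilingual taxonomy might be modeled: each concept carries exactly one preferred label per language, all language versions live in the same structure, and a term entered in one language resolves to the label used for tagging content in another. The class names, method names, language codes, and sample terms are all assumptions made for illustration; actual taxonomy management tools have their own data models.

```python
from dataclasses import dataclass


@dataclass
class MultilingualConcept:
    """One concept with exactly one preferred label per supported language."""
    concept_id: str
    labels: dict  # language code -> preferred label


class MultilingualTaxonomy:
    def __init__(self, concepts, languages):
        self.languages = tuple(languages)
        self.concepts = {}
        self._index = {}  # (language, case-folded label) -> concept ID
        for concept in concepts:
            # Enforce the one-to-one correspondence: a concept must have
            # a label in every language the taxonomy supports.
            missing = [lang for lang in self.languages
                       if lang not in concept.labels]
            if missing:
                raise ValueError(
                    f"{concept.concept_id} has no label for: {missing}")
            self.concepts[concept.concept_id] = concept
            for lang in self.languages:
                key = (lang, concept.labels[lang].casefold())
                self._index[key] = concept.concept_id

    def translate(self, term, source_lang, target_lang):
        """Return the target-language label for a source-language term,
        or None if the term is not in the taxonomy."""
        concept_id = self._index.get((source_lang, term.casefold()))
        if concept_id is None:
            return None
        return self.concepts[concept_id].labels[target_lang]


# Hypothetical sample data, for illustration only.
tax = MultilingualTaxonomy(
    [
        MultilingualConcept("C1", {"en": "Wind energy", "de": "Windenergie"}),
        MultilingualConcept("C2", {"en": "Taxes", "de": "Steuern"}),
    ],
    languages=["en", "de"],
)
print(tax.translate("wind energy", "en", "de"))
```

Rejecting any concept that lacks a label in one of the languages is one simple way to keep the language versions managed together. Separately maintained taxonomies for different language markets, as in goal (1), would not need this constraint.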