The Accidental Taxonomist: Knowledge Engineering and Taxonomies

Tuesday, August 31, 2021

Knowledge Engineering and Taxonomies

My next conference workshop (at SEMANTiCS September 7) on taxonomies and ontologies has in its title “knowledge engineering.” I figured this may resonate more with the audience of computer scientists, data scientists, and Semantic technology and AI experts. People come (often accidentally) to the field of designing taxonomies, ontologies, and knowledge organization systems in general from different backgrounds, and may work in different disciplines or departments. They may have very different training, job titles, job descriptions.

Also, my job title now, at Semantic Web Company, is knowledge engineer, although there is not much agreement on what that job title means. I had once before, over 10 years ago, applied for a position with the job title knowledge engineer, and the role focused on writing rules for rules-based auto-classification. This involves using a taxonomy with logic rules and regular expressions for each taxonomy term to support automated indexing, rather than using training sets and machine learning. My current job, however, involves designing taxonomies and ontologies, often in combination.

Creating a taxonomy or thesaurus alone is not knowledge engineering. This is because a taxonomy does not describe all aspects of a knowledge domain, just the concepts and their hierarchical relationships, or in the case of a thesaurus, some additional nonspecific related (See also) relationships. Furthermore, there already exist published guidelines/standards for taxonomies and thesauri, as ANSI/NISO Z39.19 and ISO 25964-1 that specify best practices for design.

It makes more sense to call ontology design a form of knowledge engineering. Ontologies have a much higher level of semantics or expressiveness, which needs to be defined by the ontologist or knowledge engineer. There are customized, semantic relationships (such as “is located in” and “contains”), which are to be applied between designated classes (such as organizations and places), any number of customized attributes (such as address or latitude/longitude) that can be specified for a class. Standards for ontologies, such as OWL, which are from the World Wide Web Consortium (W3C), are only for machine readability and interoperability, but not for best practices, so there is more room for interpretation and innovation when it comes to designing an ontology, than there is for a taxonomy or thesaurus

Knowledge engineering may involve more than designing on ontology but may include all the various kinds of controlled vocabularies for the content and data of an organization. This includes determining what kind of vocabularies are needed and how they are related to each other.

Knowledge engineering is also very similar to knowledge modeling, which I blogged about before in the post "Knowledge Modeling." Knowledge engineering is a more general function, whereas knowledge modeling is a more specific activity.

Knowledge engineering also goes beyond taxonomy/ontology design and creation to include the follow-through application, which is namely the management of tagging or classification of content with the taxonomy. This is, after all, how a useful knowledge base is created, with content tagged and available for retrieval. Definitions of knowledge engineering sometimes refer to it as a field within artificial intelligence (AI) to build knowledge bases. While I might not agree that this is always part of the definition of knowledge engineering, AI is used for automated tagging of content with a taxonomy.

It's probably better to define knowledge engineering more broadly as methods to support the development and transmission of knowledge, specifically by by transforming data to information and information to knowledge, as the frequently depicted pyramid on the right suggests. This transformation is specifically done by designing and creating links between data, which is supported by taxonomies and ontologies.