Monday, July 31, 2023

Knowledge Graphs and Taxonomies

Knowledge graphs have recently emerged as an additional and growing use of taxonomies. A knowledge graph comprises data extracted and stored typically in a graph database with an ontology to semantically link types of data, but usually a knowledge graph also includes a taxonomy, thesaurus, or set of controlled vocabularies to provide consistent labeling. As a result of this combination, people involved in knowledge graphs are taking an interest in taxonomies, and people involved in taxonomies are taking an interest in knowledge graphs.

The traditional and still primary use of taxonomies is to consistently and comprehensively tag and retrieve content, whereas the focus of knowledge graphs is to access and make connections among disparate data. Content tagged and retrieved with taxonomies includes pages in websites, intranets, content management systems; documents in document management systems; and images and video files in digital asset management systems. Knowledge graphs link together data which includes records in databases, customer relationship management systems, product information management systems, and other enterprise systems, and the values in cells in spreadsheets, referenced by their row and column headers. By integrating a taxonomy into a knowledge graph, users can then retrieve both content and data on the same subject together.

What is a knowledge graph? The first non-sponsored definition that pops up today with a Google search not from a vendor is from the the Alan Turning Institute, the U.K. national institute for data science and artificial intelligence, which provides the following explanation on its Knowledge graphs interest group page:

Knowledge graphs (KGs) organise data from multiple sources, capture information about entities of interest in a given domain or task (like people, places or events), and forge connections between them. In data science and AI, knowledge graphs are commonly used to:

  • Facilitate access to and integration of data sources;
  • Add context and depth to other, more data-driven AI techniques such as machine learning; and
  • Serve as bridges between humans and systems, such as generating human-readable explanations, or, on a bigger scale, enabling intelligent systems for scientists and engineers.

From the taxonomy perspective, a knowledge graph is a combination of controlled vocabularies or a taxonomy with the semantic layer of an ontology, which adds custom semantic relations and attributes, plus specific instance data, which is stored in a graph database.  A knowledge graph thus extends the use of a taxonomy beyond content to also include data. From the graph data perspective, a knowledge graph is the gathering of disparate data, which has been extracted, transformed, and loaded (ETL) into a graph database, where it is linked with semantic relations provided by an ontology and described by terms in a taxonomy, and it can be queried and analyzed all in one place. 

GraphViews of SWC ESG Knowledge Graph
GraphViews of SWC ESG Knowledge Graph
It is an important to the definition of a knowledge graph to include its purpose and not just its components. The purposes include providing a unified view of data, easy availability of information, easy integration of new data, secure interoperability, visualization of entities and relations, the possibility of discovery and insights through semantic relations, and the support for complex multi-part queries with quick results. With inclusion of a taxonomy, a knowledge graph can bring together both data and content on in and organization.
 
With such lofty goals, knowledge graphs should be an area of interest not just of data scientists and ontologists, but also of information professionals (including taxonomists) and knowledge managers. This is gradually becoming the case. Knowledge graphs emerged in the 2010, and became popularized with the Google Knowledge Graph introduced in 2012. Knowledge graphs were first introduced at the KMWorld (Knowledge Management) conferences in 2017 as "semantic knowledge graphs,” and were also first mentioned at the Taxonomy Boot Camp conference that year. This November, the KMWorld conference has more talks on knowledge graphs than before. When I proposed multiple topics for this spring’s Information Architecture Conference, the conference chair chose the presentation on an introduction knowledge graphs. I also delivered a similar presentation this year to the joint Special Libraries Association and Medical Libraries Association conference.

I will be giving an updated version of those talks, “Knowledge Graphs for Information Professionals” as a free PoolParty webinar on Thursday, August 17, 11:00 – 12:00 EDT, after which the recording will also be available.