At the conference, however, it was discussed that there is a
growing acceptance of the word “ontology,” not just among experts but also
among varied stakeholders who need to implement them. This was noted by several
conference speakers, especially in the wrap-up panel session for the Data
Modeling track, which was titled “The ‘O’ Word: How Ontologies Drive Interoperable
Data and Business Innovation.” The panel moderator Katariina Kari explained
that this recent shift has happened because of LLMs, explaining: “We need a
reliable natural language repository. LLMs works on a network of mimicking
language, LLMs are primed for language.” So, now use of the word ontology can
even help a startup get funding from venture capitalists, she observed.
However, there remains some confusion over what an ontology is. At one end there is the difference between ontologies and taxonomies, and at the other end the difference between ontologies and knowledge graphs. I clarified the distinction between taxonomies and ontologies in a prior blog post, “Taxonomies vs. Ontologies” (January 2023). While knowledge graphs are a relatively new concept, and ontologies have existed for much longer, it is the varied understanding of ontologies that has given rise to confusion.
An ontology is defined as a model of a domain of knowledge, which comprises classes (sets of things), attributes (types of characteristics of things) and relationships between classes. According to this definition, an ontology is a somewhat generic model of a domain, and it does not include all of the individual members or instances of each class (such as the names of individual companies in the class called Company) nor the specific attributes of each attribute type (such as the address of each specific company for the attribute type called Address).
However, the W3C recommendation for ontologies, OWL (Web Ontology Language) includes the designation “individuals,” and ontology software tools, such as Protégé, support the inclusion of individuals and their specific attributes. Thus, it is easy to think that an ontology, by definition, includes all specific individuals. But just because OWL covers the recommendation for how to include instances of a class, and software supports the inclusion of instances of classes does not necessarily mean that the instances or individuals are actually a component of an ontology. The ontology experts on this CDL conference panel confirmed that an ontology is the upper-level semantic model.
Then, what do we call an ontology plus all of the individual members (instances) of classes and their specific attributes? That is essentially what a knowledge graph is. This is especially true when individuals are specific to an organization or enterprise, such as names of individual customers, products, employees, etc., and we call that an “enterprise knowledge graph.”
The first applications of ontologies in information/data science were in biomedicine, in which individuals included such things as names organisms (including bacteria and viruses) and chemicals, etc. Thus, the notion of an individual in science is not quite the same as in business, which has also been a source of confusion over what an individual is and the inclusion of individuals in an ontology. In enterprise knowledge graphs, the instances can be very numerous and specific, including individual “events,” such as interactions or transactions.
In conclusion, an ontology is typically a defining feature and component of a knowledge graph, but it is not all of what goes into a knowledge graph. A knowledge graph also includes individuals, which may be named entity instances or they may be specific taxonomy concepts (abstract things that are not unique named entities, such as the concepts “Data ethics” or “Performance measurement”), and a knowledge graph also includes specific attributes of individuals. It may be said that a knowledge graph is the instantiation of an ontology, and an ontology is the knowledge model. Katariina further explained: “knowledge graphs that actually follow an ontology will have an LLM perform better than just a KG that is unharmonized, not yet adhering to a clear ontology.”
No comments:
Post a Comment