The Accidental Taxonomist: Semantic technologies

Thursday, October 31, 2024

The Semantic Data Conference

I was honored to be accepted to speak at the first “Semantic Data” conference in New York, a one-day event held on October 23, following the inaugural event held in London on June 27. Semantic Data, organized by Henry Stewart (HS) Events, is co-located with its better-known DAM (Digital Asset Management) conference, which has been running for over 20 years in New York, London, and Los Angeles.

The full name of the conference was “Semantic Data: Taxonomy, Ontology, and Knowledge Graphs,” so the conference was less focused on data than on what you can do with data and content when combined with the semantics of taxonomies and ontologies. There was no presentation dedicated to knowledge graphs this time, with only sessions in the single-day one-track event. Less of a focus on knowledge graphs was fine, since the Knowledge Graph Conference, held in New York in May, covers that topic very thoroughly over multiple days. The emphasis on “semantics,” though, is welcome, since there is no conference dedicated to that subject in the United States. (There is the SEMANTiCS conference in Europe, but it is semi-academic.)

Presentations at Semantic Data, New York

The topics of the sessions at Semantic Data included: securing taxonomy and ontology strategy buy-in, why and how to connect taxonomies and ontologies, use of MS Copilot in taxonomy development, a use case in leveraging an LLM-based approach for content integration and a consumer-based semantic layer, and how to apply semantic models (taxonomies and ontologies) that reduce biases, especially for machine learning models. The opening keynote by Lulit Tesfaye was on realizing the semantic layer, and the closing keynote by Gary Carlson and Bram Wessel of the lead sponsor, Factor, was on building an organizational semantic mindset. Additional sponsored talks were on how ontologies accelerate innovation in the life sciences, presented by the sponsor SciBite, and how semantics enhances modern data platforms, presented by the sponsor Datavid.

I presented “Taxonomies to Ontologies: How When and Why to Connect or Extend.” I summarized the benefits of taxonomies and ontologies, including what you could or could not do with each alone, but what you could do with both combined. The fact that both taxonomies and ontologies are now based on compatible Semantic Web standards, which are supported by many tools, makes it easy to combine or extend them. Whether you are “combining” a taxonomy with an ontology or “extending” a taxonomy into an ontology depends merely on your starting point and definition of ontology. Now that I am again vendor neutral, I included screenshots from four different commercial tools for combined taxonomy/ontology management.

About the Semantic Data Conference 2024

Semantic Data New York was similar to Semantic Data Europe (London) in its format and organization. Both provided a combination of session types: instructional talks, industry use cases, round table participant discussions, and thought leadership panels. Both events were chaired by Madi Weland Solomon and featured the same keynote presentation by Lulit Tesfaye on the subject of the semantic layer. The rest of the speakers were different at each event, and each event had different sponsors, based on geographic location. While there were only three sponsors of Semantic Data in New York and only two in London, they shared the same exhibit hall with the main DAM (digital asset management) conference and thus reached a wider audience.

The London and New York events had a similar number of registrants, about 50. Although the larger co-located DAM conference had separate registration, some registrants of the DAM conference were also seen in Semantic Data sessions. Registrants of Semantic Data represented diverse industries, including financial services, healthcare, software/technology, media, entertainment, publishing, travel and tourism, education, government, and consulting. Roles were also diverse, including company leadership, project and program managers, IT, and content/DAM/taxonomy/information architecture practitioner roles.

I find that the roles and activities of taxonomists, ontologists, information architects, digital asset managers, etc. overlap, so a conference dedicated to semantics brings them together for knowledge sharing. This way, their projects can also be broadened and shared within their organizations. I hope the Semantic Data conference can grow in the future to fill this need, and I look forward to next year.

Sunday, August 18, 2024

Taxonomies and Ontologies as Semantic Models

In describing what taxonomies and ontologies are and what they can do, we are hearing the word “semantics” more often. “Semantics” means “meaning,” which is nothing new, and taxonomies and ontologies are not new. What is new is that taxonomies and ontologies are now combined more, and we need a way to describe them together, and that involves the description of “semantic.” Furthermore, taxonomies and ontologies are being implemented in new and expanded applications, where the word semantic(s) has significance.

Semantics in Taxonomies and Ontologies

Taxonomies have semantics in their concepts. A taxonomy is not just a term base or a term list, but rather is an organized set of concepts, each with its own unambiguous meaning. The concepts bring together different labels, like “synonyms” for the same thing, and their meaning and usage are further clarified by their arrangement in a hierarchy. It’s often said that a taxonomy comprises “things” (concepts), not mere “strings” (of text).

Ontologies have a higher level of semantics than taxonomies. Even if they don’t contain synonyms, the relationships between concepts (entities) and sets (classes) of entities have additional semantics. The relationships in an ontology convey meanings beyond mere hierarchy or a generic “related term.” For example, relationships between entities may be “is located in,” “has customer,” and “sells product.” Furthermore, entities in an ontology may have various types of attributes, such as contact information for offices and people, which is another application of semantic data.

Bringing Together Taxonomies and Ontologies

Taxonomies and ontologies have different origins, but now they are increasingly based on shared Semantic Web data models and guidelines, which enables them to be integrated seamlessly. Taxonomies have their origins in library science structures, including thesauri, subject headings, and classification schemes. Ontologies have their origins in computer science and data science with a focus on data models.

Combining them brings the benefits of both: the linguistic aspect of controlled terminology and their synonyms with hierarchical structure in taxonomies and the custom semantic relationships and other additional properties provided by ontologies. This allows users to search for concepts/things, not just text strings, while also linking to other things related in a specific way and being able to create complex multi-step queries.

Taxonomies are considered a kind of “controlled vocabulary” or “knowledge organization system.” Ontologies are considered a kind of “knowledge model,” and as a knowledge representation system, rather than a knowledge organization system. When we combine taxonomies and ontologies or speak of them collectively, it’s logical to use the word “semantic,” whether as semantic structures or semantic models, because they both involve semantics and both are usually based on Semantic Web guidelines.

Taxonomies are increasingly based on the Semantic Web recommendation (published by the World Wide Web Consortium) of SKOS (Simple Knowledge Organization System), which is based on RDF (Resource Description Framework). Most ontologies are based on RDF-Schema, an extension of RDF, and OWL (Web Ontology Language), another Semantic Web recommendation. The data models of SKOS, RDF, RDF-S, and OWL may all be integrated into the same knowledge model for a combined taxonomy-ontology. Most software for dedicated taxonomy-ontology management uses these data models.
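To make the combination concrete, here is a minimal sketch in plain Python rather than in a real RDF tool. It keeps a few SKOS-style taxonomy statements and a few custom ontology relationships as subject-predicate-object triples in one set, then answers a multi-step question across both. The `ex:` identifiers, the relationship names, and the helper functions are all hypothetical and invented for illustration; an actual implementation would more likely use an RDF store queried with SPARQL.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
# The skos: and rdf: names mirror the W3C vocabularies; everything under ex:
# is a made-up example, not taken from any real taxonomy or ontology.
TRIPLES = {
    # Taxonomy part: a small hierarchy of places
    ("ex:Berlin", "skos:broader", "ex:Germany"),
    ("ex:Paris", "skos:broader", "ex:France"),
    # Ontology part: typed entities and custom relationships
    ("ex:Acme", "rdf:type", "ex:Company"),
    ("ex:Globex", "rdf:type", "ex:Company"),
    ("ex:Acme", "ex:hasOffice", "ex:AcmeBerlinOffice"),
    ("ex:Globex", "ex:hasOffice", "ex:GlobexParisOffice"),
    ("ex:AcmeBerlinOffice", "ex:isLocatedIn", "ex:Berlin"),
    ("ex:GlobexParisOffice", "ex:isLocatedIn", "ex:Paris"),
    ("ex:Acme", "ex:sellsProduct", "ex:Widget"),
    ("ex:Globex", "ex:sellsProduct", "ex:Gadget"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects of triples with the given predicate and object."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def with_narrower(concept):
    """The concept plus everything below it in the skos:broader hierarchy."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child in subjects("skos:broader", current):
            if child not in found:
                found.add(child)
                frontier.append(child)
    return found


def products_of_companies_in(place):
    """Multi-step query: place -> narrower places -> offices -> companies -> products."""
    products = set()
    for location in with_narrower(place):
        for office in subjects("ex:isLocatedIn", location):
            for company in subjects("ex:hasOffice", office):
                products |= objects(company, "ex:sellsProduct")
    return products


print(products_of_companies_in("ex:Germany"))
```

The taxonomy hierarchy supplies the step from “Germany” down to “Berlin,” and the ontology relationships supply the remaining steps, which is the kind of query neither structure could answer alone.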

Semantic Search and Semantic Tagging

Taxonomies support semantic search and tagging. “Semantic search” is the third-ranked autocomplete suggested search phrase in a Google search I did recently on “semantic,” so this is clearly a popular application of semantics. Semantic search refers to search that focuses on concepts and meaning rather than just strings of text. This is not new, but since search that is based on text strings and statistical algorithms is so common, improving search results with the focus on semantics is getting more attention.

Semantic search is best enabled with the tagging of taxonomy concepts, which we may call “semantic tagging” (which I first heard of when asked to write an article on it in 2008). Advanced text analytics technologies, going beyond entity recognition and natural language processing to include natural language understanding so as to analyze sentence structure, syntax, and sentiment, can also yield search results based somewhat on meaning and not just words.
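A small sketch may help show how tagging with taxonomy concepts enables search by meaning. The taxonomy, the documents, and the function names below are all hypothetical, and the tagger is deliberately naive (exact single-word label matching, with no text analytics at all); it is only meant to illustrate the principle that a search on a concept retrieves content tagged with that concept or its narrower concepts, even when the search word never appears in the text.

```python
import re

# Hypothetical mini-taxonomy: concept ID -> preferred label, alternative
# labels (synonyms and variants), and the ID of the broader concept.
TAXONOMY = {
    "c1": {"pref": "Vehicles", "alt": [], "broader": None},
    "c2": {"pref": "Automobiles", "alt": ["cars", "car"], "broader": "c1"},
    "c3": {"pref": "Bicycles", "alt": ["bikes", "bike", "bicycle"], "broader": "c1"},
}

# Hypothetical documents to be tagged.
DOCS = {
    "d1": "New bike lanes open downtown.",
    "d2": "Electric cars outsell diesel models.",
    "d3": "City council approves budget.",
}


def tag(text):
    """Naive semantic tagging: assign a concept if any of its labels occurs as a word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    tags = set()
    for concept_id, concept in TAXONOMY.items():
        labels = [concept["pref"], *concept["alt"]]
        if any(label.lower() in words for label in labels):
            tags.add(concept_id)
    return tags


DOC_TAGS = {doc_id: tag(text) for doc_id, text in DOCS.items()}


def concept_for(label):
    """Resolve a search string to a concept ID via preferred or alternative labels."""
    wanted = label.strip().lower()
    for concept_id, concept in TAXONOMY.items():
        labels = [concept["pref"], *concept["alt"]]
        if wanted in (l.lower() for l in labels):
            return concept_id
    return None


def narrower_ids(concept_id):
    """The concept plus all of its descendants in the hierarchy."""
    ids = {concept_id}
    changed = True
    while changed:
        changed = False
        for other_id, concept in TAXONOMY.items():
            if concept["broader"] in ids and other_id not in ids:
                ids.add(other_id)
                changed = True
    return ids


def search(label):
    """Return IDs of documents tagged with the concept or any narrower concept."""
    concept_id = concept_for(label)
    if concept_id is None:
        return []
    wanted = narrower_ids(concept_id)
    return sorted(doc_id for doc_id, tags in DOC_TAGS.items() if tags & wanted)


print(search("Vehicles"))
```

In this sketch a search on “Vehicles” finds both the bike and the car documents, although neither contains the word “vehicles,” while a plain text-string search would find nothing.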

Semantic Data

Taxonomies are traditionally for tagging and retrieving content, whereas ontologies are traditionally for exploring and retrieving data. The combination of a taxonomy and an ontology enables users to retrieve both content and data that are related to each other. Semantics for content is a given, because content (whether text, image, or other media), by its very nature, has meaning. Data by itself may not have much meaning, unless it is related to other data and that relationship has meaning, too. Thus, “semantic data” is significant. We hear reference to “semantic data” much more often than to “semantic content.”

You don’t need to add a taxonomy to content to make it “semantic” and understood (rather a taxonomy helps you find the content). However, depending on how data is presented, you may need to add an ontology or at least a semantic data model (a method to describe objects in a database and their relationship to one another) to make data “semantic.” Experts can analyze raw data, but the data is more valuable if non-experts can understand it, too, and that’s why “semantic data” is important. There is also a lot of attention on “semantic data models.”

Semantic Layer

The idea of a “semantic layer” as a framework or approach to make an organization’s information, both data and content, more structured, findable, and actionable, has been gaining popularity recently. Whether the “semantic layer” is new or just a new way of describing something is arguable.

A semantic layer is a standardized framework that organizes and abstracts organizational data and serves as a connector for all knowledge assets. It’s a method to bridge content and data silos through a structured and consistent approach to connecting instead of consolidating data, which data warehouses do. The idea of a “layer” is that it is part of an enterprise-wide architecture of information, data and content, that connects horizontally across siloed content and data repositories. Taxonomies and ontologies, in addition to potentially other knowledge organization systems, such as a business glossary, are key components of a semantic layer.
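The difference between connecting and consolidating can be sketched in a few lines. In the following illustration, two silos keep their own records and their own local vocabularies, and the “layer” is nothing more than mappings from each silo’s local values to shared concept identifiers. All of the data, the concept IDs, and the function name are hypothetical, and a real semantic layer would involve far more (a managed taxonomy or ontology, governance, and integration tooling).

```python
# Silo 1: a content management system with its own keyword vocabulary.
CMS_ARTICLES = [
    {"id": "a1", "title": "Choosing an e-bike", "keyword": "E-Bikes"},
    {"id": "a2", "title": "Winter tyre guide", "keyword": "Tyres"},
]

# Silo 2: a sales database that only knows product codes.
SALES_ROWS = [
    {"order": 1001, "product_code": "EB-200", "amount": 1500},
    {"order": 1002, "product_code": "TY-17", "amount": 320},
    {"order": 1003, "product_code": "EB-300", "amount": 2100},
]

# The layer: local values in each silo are mapped to shared concept IDs.
# Nothing is copied out of the silos; only the mappings are maintained.
CMS_KEYWORD_TO_CONCEPT = {
    "E-Bikes": "concept:electric-bicycles",
    "Tyres": "concept:tires",
}
PRODUCT_CODE_TO_CONCEPT = {
    "EB-200": "concept:electric-bicycles",
    "EB-300": "concept:electric-bicycles",
    "TY-17": "concept:tires",
}


def lookup(concept_id):
    """Gather content and data from both silos that relate to one shared concept."""
    articles = [
        article["title"]
        for article in CMS_ARTICLES
        if CMS_KEYWORD_TO_CONCEPT.get(article["keyword"]) == concept_id
    ]
    orders = [
        row["order"]
        for row in SALES_ROWS
        if PRODUCT_CODE_TO_CONCEPT.get(row["product_code"]) == concept_id
    ]
    return {"articles": articles, "orders": orders}


print(lookup("concept:electric-bicycles"))
```

The records stay where they are, and the shared concept is what connects an article in one repository to orders in another.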

More Talk of Semantics with Taxonomies and Ontologies

I’ve definitely been hearing of “semantics” more in the world of taxonomies and ontologies, and now I am bringing the word more into my own presentations. Following are some past and future examples.

“Core Concepts of Semantic Intelligence” was a presentation I gave in June 2022 in the Semantic Content Graph Guild, a community of practice led by Michael Iantosca.
“The Role of Taxonomy and Ontology in Semantic Layers” was a webinar in which I presented in April 2024.
“Enterprise Knowledge Graphs: The Importance of Semantics” was a presentation I gave at the Data Summit conference in May 2024.
“Semantic Data: Taxonomy, Ontology, and Knowledge Graphs” is the name of a new conference organized by Henry Stewart Events, first held on June 27 in London and upcoming on October 23 in New York. I will be presenting at it.
“Semantic Data Enrichment: Taxonomies and Ontologies” is a new asynchronous course I will teach through eLearningCurve and which will be available in spring 2025.

Saturday, September 30, 2023

SEMANTiCS Conference 2023: Taxonomies, Knowledge Graphs, and LLMs

The most recent conference I participated in was SEMANTiCS, September 20-22, in Leipzig, Germany. This was the 19th year of this European conference focused on the application of semantic technologies and systems. This was also my fourth year presenting a workshop/tutorial on taxonomies and ontologies at the conference. The widespread value of taxonomies across different areas of specialization is indicated by the fact that taxonomy workshops are repeatedly a part of conferences on various subjects, including semantics, knowledge management, library and information science, information architecture, content strategy, and digital asset management.

Semantics and taxonomies

Semantics means “meaning,” so semantic systems utilize standards to support the encoding of meaning of things/resources and their relations, making the semantics machine-readable. Various standards, guidelines, and data models for semantic systems were developed for what is called the Semantic Web. The Semantic Web goes beyond the simple hyperlinks of the World Wide Web to label shared metadata and specify the kinds of relations. This supports linked data, and the linking of taxonomies to other taxonomies and ontologies and their tagged content or data, which are stored on different servers.

Just as World Wide Web protocols have been adapted within enterprises (“behind the firewall”), so have Semantic Web standards. You don’t have to share your data publicly to reap the benefits of the Semantic Web: open standards to enable the migration of taxonomies and related data between systems, sharing of data with partners, extracting and transforming data from within silos across the enterprise into a standard format, and the ability to link to data on the Web to bring in new content even if not sharing content out on the Web.

Taxonomies, as controlled vocabularies, have always been about concepts, each with unique understood meaning, not just words or strings of text. So, using taxonomies is using semantics. The Semantic Web standard SKOS (Simple Knowledge Organization System) specifies a data model to make taxonomies and other knowledge organization systems (thesauri, classification systems, etc.) machine-readable and interchangeable on the Web. Semantic Web standards also cover ontologies with RDF-Schema and OWL. By following Semantic Web standards, taxonomies can easily be linked to and extended with ontologies, and then by linking to data stored in a graph database, knowledge graphs can be built.
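As an illustration of what “machine-readable and interchangeable” looks like in practice, the following sketch writes a single taxonomy concept out as SKOS in the Turtle syntax using only Python string formatting. The `ex:` namespace, the concept names, and the helper functions are hypothetical, and the sketch assumes that local names are simple identifiers; taxonomy management tools and RDF libraries handle this export in a far more complete way.

```python
# The SKOS namespace is the one published by the W3C; the ex: namespace is a
# placeholder invented for this example.
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
EX_PREFIX = "@prefix ex: <http://example.org/taxonomy/> ."


def escape(label):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return label.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(local_name, pref_label, alt_labels=(), broader=None, lang="en"):
    """Serialize one concept as a block of SKOS statements in Turtle.

    Assumes local_name and broader are simple names (letters, digits, hyphens)
    that are valid after the ex: prefix.
    """
    lines = [f"ex:{local_name} a skos:Concept"]
    lines.append(f'    skos:prefLabel "{escape(pref_label)}"@{lang}')
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{escape(alt)}"@{lang}')
    if broader:
        lines.append(f"    skos:broader ex:{broader}")
    # Statements about the same subject are separated by ";" and closed by "."
    return " ;\n".join(lines) + " ."


print(SKOS_PREFIX)
print(EX_PREFIX)
print()
print(concept_to_turtle("automobiles", "Automobiles", ["Cars"], broader="vehicles"))
```

Because the output follows a shared standard, a taxonomy exported this way from one system can, in principle, be imported into any other system that reads SKOS.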

The SEMANTiCS conference

The SEMANTiCS conference is somewhat unique in being semi-academic and semi-industry. It has separate academic track and industry track chairs and additional tutorials and workshops. It’s good to bring academia and industry together in a field like this, where research topics can be applied and partnerships can be developed. The location of the conference varies, and it partners with a local higher education institution for logistical support, with graduate students volunteering to help in exchange for getting access to sessions.

This was the second year that SEMANTiCS combined its conferences with the Language Technology Industry Association, which organized a Language Intelligence track, dealing with technologies for the management of terminology, multilingual content, and machine translation. The conference also includes a one-day track focused on DBpedia, which is not the same first day as the tutorials and workshops. The entire conference lasts three full days, and has a social event one evening, and a dinner on the second evening.

The conference has industry vendor sponsors, about eight of which were exhibiting, and a few more which did not exhibit. There are also slightly more organizations which are “partners,” including DBpedia, The Alan Turing Institute, and a number of institutes of higher education in Europe which have programs in semantic technologies. Additional organizers include Semantic Web Company, Institut für Angewandte Informatik, and the Vrije Universiteit Amsterdam, representing the three countries where SEMANTiCS has been taking place: Austria, Germany, and the Netherlands.

SEMANTiCS 2023

The 2023 conference was held September 20-22 in Leipzig, Germany, under the leadership of a new chair Sahar Vahdati of Technical University Dresden. There were about 285 participants in person and about one-third as many online. The conference has been hybrid since 2021. There were often six simultaneous sessions. Themed tracks or sessions of multiple speakers included Knowledge Graphs, Reasoning & Recommendation, Natural Language Processing and Large Language Models, Legal & Data Governance, Ontologies Data Management, and Environmental-Social-Governance (ESG). While there was not a life sciences track like last year, there was a themed subject track on cultural heritage. LLMs and ESG were both new topics this year. Poster presentations also covered the range of topics.

Knowledge graphs are a regular theme at this conference, but this time there was the addition of LLMs. The opening keynote was “Generations of Knowledge Graphs: The Crazy Ideas and the Business” presented by Xin Luna Dong of Meta. She spoke of three generations of knowledge graphs: entity-based knowledge graphs, text-rich knowledge graphs, and dual neural knowledge graphs, using an ontology and LLMs. The second day’s keynote was “Knowledge Graphs in the Age of Large Language Models,” presented by Aidan Hogan of the University of Chile. LLMs and AI topics were also presented in the Knowledge Graphs track, such as in Andreas Blumauer’s talk “Responsible AI and LLMs.” Finally, the moderated closing panel was “Large Language Models and Knowledge Graphs: Status Quo - Risks - Opportunities” with panelists Andreas Blumauer and Jochen Hummel from software vendors and Kristina Podnar, a digital policy consultant, who were not completely in agreement.

In addition to my 3-hour tutorial, “Knowledge Engineering of Taxonomies and Ontologies,” only slightly updated from last year, I also contributed, along with Lutz Krüger, to Andreas Blumauer’s new 3-hour tutorial “The Key to Sustainable Enterprises: ESG, Knowledge Graphs, and Digitalization.” Adopting an ESG program and complying with upcoming ESG directives requires connecting a lot of information and data and aligning it with requirements and disclosure categories, and this is where a knowledge graph can be extremely helpful. Other tutorials and workshops dealt with data spaces, ontology reasoning, healthcare NLP, NLP for knowledge graph construction, and FAIR ontologies.

Past and future

Semantic technologies were very new when the conference was first launched in 2005 by Semantic Web Company, even before launching its product PoolParty Semantic Suite. But it’s never been a vendor product-based conference. The main purpose was and still is to promote the understanding and advancement of semantic technologies. Competitor software vendors sponsor and exhibit, and Semantic Web Company has stepped back from a lead organizational role. The conference is not one where sponsors do business selling their products or services, but rather one for raising awareness, making and reinforcing partnerships, exchanging ideas, and general networking, including looking for work. It is more of a community conference than anything else, but it is an open, welcoming community, with new people coming every year.

The next SEMANTiCS, celebrating its 20th year, will be September 16 - 18, 2024, in Amsterdam.

Friday, September 30, 2022

Taxonomies and Semantics

How are taxonomies related to “semantics”? I considered this question, as the latest conference I participated in was SEMANTiCS, the European conference of semantic technologies, which took place this year in Vienna, Austria, September 13 - 15. Topics presented and discussed in this conference included ontologies, knowledge graphs, semantic models and reasoning, linked open data, machine learning, natural language processing, and other language technologies. Yet taxonomies were also discussed in a number of presentations. In contrast to a conference dedicated to taxonomies, such as Taxonomy Boot Camp, where taxonomies are the focus, at SEMANTiCS, in the context of semantic technologies, taxonomies are a component or an underlying layer in the application of semantic technologies.

Semantics means “meaning.” Like the words “taxonomy” and “ontology,” there is a traditional meaning that is more academic and, in the case of semantics and ontology, also connected to philosophy, but there is also a modern meaning that deals with information science and knowledge management. For example, “semantic search,” means searching for concepts and ideas, not merely matching search strings of text. Thus, a taxonomy or thesaurus supports semantic search by comprising unambiguous concepts of “things, not strings” of text.

Semantics also implies the Semantic Web, with technology that complies with the Semantic Web standards that have been developed by the World Wide Web Consortium (W3C). The Semantic Web, also known as Web 3.0, is not a component of the World Wide Web nor a different web, but rather a kind of extension of the web to include not merely content and simple hyperlinks, but also all kinds of data that is semantically linked (where the links/relationships also have meaning). The Semantic Web allows more complex data, and data stored and organized in graph databases, to be machine-readable. This could be either on the public web or within an organization that follows Semantic Web standards for managing its data and content.

Taxonomies were mentioned in a number of other presentations as a given foundation to ontologies, semantic networks, or knowledge graphs. For example, taxonomies and ontologies were the basis of a knowledge-based recommendation system, described by Andreas Blumauer in his presentation on that subject. In her talk “Real World Case Studies: Five Success Factors to Implementing an Enterprise Data Fabric,” Lulit Tesfaye explained that the components of a data fabric are metadata, taxonomy, ontology, knowledge graph, connections and integrations, and front-end applications.

A session titled Taxonomies included a talk on “Taxonomy and Terminology,” which compared and contrasted taxonomies and terminologies with respect to their kinds of terms and purposes, but also explained the semantic role of taxonomies. The presenter, Klaus Fleischmann, said that terminologies guide content creators, ensuring consistent, correct use of language company-wide, whereas taxonomies provide a semantic layer on top of content and metadata, often for semantic applications. Fleischmann also explained that taxonomies can be extended to ontologies or, in his words, taxonomies “modeled relationships via ontologies.” Also speaking in the Taxonomies session was Nimit Mehta, whose presentation was titled “The Semantic Data Stack - A user story on building a data fabric.” Mehta described taxonomies as “A layer between your data and your business applications” and a “governance layer.”

Finally, I presented a taxonomy-related tutorial, although not on taxonomy creation alone, but rather titled “Knowledge Engineering of Taxonomies, Thesauri, and Ontologies,” in which I explained that taxonomies and ontologies are not so much distinct knowledge organization systems, but rather that an ontology is a semantic layer that is applied to and extends a taxonomy, giving it a greater degree of semantics.

I hope to participate in the next SEMANTiCS conference in September 2023 in Leipzig, Germany.

Friday, September 13, 2019

SEMANTiCS Conference

I attended the 15th annual SEMANTiCS conference this week for the first time. Semantics means “meaning” in language, and in the context of taxonomies and other controlled vocabularies (knowledge organization systems) semantics is a given. We taxonomists don’t concentrate on the topic of semantics that much, because it’s a basic characteristic of knowledge organization systems, which focus on concepts and their meanings, rather than just words. Tagging/indexing with a taxonomy or other kind of knowledge organization system may even be called “semantic enrichment.” Semantics is not a given, however, in related areas of information technology and data science, but more awareness and interest in how technology and semantics can support each other, for better utilization of information, is growing, as this conference demonstrates. These may include technologies and standards of the Semantic Web, but uses go beyond the Web to include various internal enterprise applications.

SEMANTiCS is a European conference that rotates among different cities. This year the conference was in Karlsruhe, Germany, for the first time, which turns out to be somewhat of a technology center. Before I went, someone told me to expect European conferences which are not merely spinoffs of American conferences to be different, with perhaps less intermingling, socializing, and networking. That was certainly not the case. I found the attendees, whether German or from other European countries, to be very friendly and open to speaking with and connecting with new colleagues, whether myself or others. So, it was definitely a good networking opportunity.

The SEMANTiCS conference is more in the area of information technology and data science than in fields of content/knowledge management, where we taxonomists tend to be, but, of course, it was not just about technology, but rather about the added “semantic layer.” What I liked is that it brought together taxonomists (I was not the only one) with those who work in technology (software developers, solutions architects, computer scientists, data scientists, etc.). The theme of the conference is knowledge graphs and AI, which have also become themes of the Taxonomy Boot Camp conferences recently. Ontologies, another specialty that bridges the work of taxonomists and computer scientists, were also a focus of this conference. Other topics included machine learning, data governance, and knowledge management.

Heather Hedden presenting at SEMANTiCS 2019 Karlsruhe

The SEMANTiCS conference is somewhat unique in how it bridges both industry and academia. It has both industry presentations and academic papers, each with separate conference chairs/review committees, and with academic papers to be published as conference proceedings, yet the presentations were not in separate tracks, and both industry and academic presentations were combined into the same sessions by theme. Session themes included knowledge graphs, natural language processing, semantic information management, knowledge discovery & semantic search, knowledge extraction, data integration, and also thesaurus & ontology management (in which I presented). There were also subject-themed tracks on legal technology and on digital humanities/cultural heritage. In each time slot there were five concurrent sessions.

SEMANTiCS is not put on by an event company, but is rather a collaborative effort of several organizations, companies and educational institutions, with some variation, depending on the location. The Semantic Web Company has been a consistent organizer/sponsor. Others this year included FIZ Karlsruhe and several European universities.

By the numbers, the conference had 472 registered attendees and 25 sponsors, of which 15 were also exhibitors. There were 37 industry presentations, 28 academic paper presentations, 5 keynote/plenary presentations, 2 invited talks, 1 panel discussion, 31 posters, and 9 preconference workshops/tutorials. This was the largest SEMANTiCS conference to date.

Attendees gather for the conference gala dinner

Particularly exciting was the announcement that, in addition to next September’s conference in Amsterdam, for the first time SEMANTiCS will come to the United States, scheduled for April 21-23 in Austin, Texas: SEMANTiCS Austin 2020. (Call for proposals due November 8.) Lead organizers are the Semantic Web Company and Enterprise Knowledge. The conference won’t be identical to the European version, as it will not have academic papers, but it promises to be very interesting and informative, and I plan to be there.
