The Accidental Taxonomist: November 2023

Thursday, November 30, 2023

Generative AI at Taxonomy Boot Camp Conference

Generative AI and large language models (LLMs), the technology behind ChatGPT, have been topics of presentations, keynotes, and attendees’ conversations at all the varied conferences I had the good fortune to attend this year, including the Taxonomy Boot Camp conference held November 6-7 in Washington, DC. Taxonomy Boot Camp is the only conference dedicated to taxonomies.

Opening and Keynotes

Right from the beginning, in the opening welcome, conference chair Stephanie Lemieux mentioned uses of ChatGPT for taxonomy creation, such as asking it: What is a category for the following list of terms?, What label for a concept might be better for scientists, or better for parents?, and What are alternative labels for specific content? It has become clear that generative AI is a tool to assist taxonomists with specific tasks of a project but is not appropriate for automating the entire creation of a taxonomy. Thus, the Taxonomy Boot Camp theme this year, “Humans in the Loop,” was quite apt for the new era of generative AI, even if not specific to it.

The Taxonomy Boot Camp opening keynote, “Ontologies in the New Age of AI” by Dean Allemang, was on this subject. Allemang is more of an ontologist than a taxonomist, hence the title, but he discussed both taxonomies and ontologies. He stated that generative AI “understands” why we need a taxonomy (even if managers do not). He explained that Schema.org has put RDF on many websites, which ChatGPT “reads.” Allemang has found that ChatGPT also performs perfectly on queries in SPARQL, the query language for data in RDF, including taxonomies. He gave examples of requests posed to ChatGPT, such as “Return all the claims we have by claim number, open date, and close date,” and “What is the total loss of each policy where loss is the sum of loss payment, loss reserve, expense, payment, and expense reserve amount?” Allemang advised taxonomists to identify uses for taxonomies that have not been fully delivered on and to use generative AI to deliver on them. If people argue that generative AI does not understand their language, taxonomists should build in a link to the taxonomy that makes generative AI understand it.
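To make the first of those requests concrete, here is a minimal sketch of the kind of SPARQL query it might translate into, assembled in Python. The `ins:` namespace, the `Claim` class, and the property names are all hypothetical placeholders; the data model used in the keynote is not described here, so treat this as an illustration rather than the query Allemang showed.

```python
# Hypothetical vocabulary: the namespace, class, and property names below
# are illustrative assumptions, not taken from the keynote.
PREFIX = "PREFIX ins: <http://example.org/insurance#>"


def select_query(class_name, fields):
    """Build a SPARQL SELECT with one variable per (variable, property) pair."""
    variables = " ".join(f"?{var}" for var, _ in fields)
    # First pattern restricts results to instances of the given class.
    patterns = [f"  ?item a ins:{class_name} ."]
    # One triple pattern per requested property.
    patterns += [f"  ?item ins:{prop} ?{var} ." for var, prop in fields]
    return f"{PREFIX}\nSELECT {variables}\nWHERE {{\n" + "\n".join(patterns) + "\n}"


# "Return all the claims we have by claim number, open date, and close date"
claims_query = select_query(
    "Claim",
    [
        ("claimNumber", "claimNumber"),
        ("openDate", "openDate"),
        ("closeDate", "closeDate"),
    ],
)
print(claims_query)
```

In a real dataset a claim that is still open may have no close date, in which case that pattern would likely need to be wrapped in SPARQL's `OPTIONAL { ... }` so that open claims are not dropped from the results.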

On the second day, Taxonomy Boot Camp registrants attend the keynote presentations shared with all of the co-located KMWorld conferences, and this year these mostly dealt with generative AI, including the opening keynote by Dion Hinchcliffe, “Tech-Driven Enterprise Thrills & Chills: The Future of Work.”

Regular Sessions

In addition to being mentioned in various talks, generative AI was also the subject of a session, “ChatGPT, Taxonomist: Opportunities & Challenges in AI-Assisted Taxonomy Development,” which comprised two separate presentations.

In this session, Xia Lin presented “ChatGPT and Generative AI for Taxonomy Development,” in which he discussed the steps involved in using ChatGPT in two case studies. In one, a taxonomy for data analytics projects of a small business was developed by providing ChatGPT with the scope of the first level of the taxonomy, then asking ChatGPT to expand individual categories by adding subcategories, and then asking it to add definitions of terms and categories. The results were reviewed and revised by experts. But Lin did not stop there. He showed the results of asking ChatGPT to provide stakeholder interview questions around a category, and (for those more technically inclined) how to create a ChatGPT plug-in for various defined functions of taxonomy creation, using ChatGPT’s APIs.
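The expand-then-define workflow in the first case study can be sketched roughly as follows. This is not Lin's code: the prompt wording, the function names, and the `fake_model` stand-in are all assumptions made for illustration, and the model call is passed in as a plain function so that no particular API is implied.

```python
from typing import Callable, Dict, List


def expand_taxonomy(
    domain: str, top_level: List[str], ask: Callable[[str], str]
) -> Dict[str, dict]:
    """Draft subcategories and definitions for each top-level category.

    `ask` is any function that sends a prompt to a language model and
    returns its text reply (hypothetical; supply your own client).
    """
    draft = {}
    for category in top_level:
        reply = ask(
            f"List subcategories of '{category}' in a taxonomy for {domain}, "
            "one per line."
        )
        # Strip list markers and blank lines from the model's reply.
        subcategories = [
            line.strip().lstrip("-• ").strip()
            for line in reply.splitlines()
            if line.strip()
        ]
        draft[category] = {
            "subcategories": {
                sub: ask(f"Define '{sub}' in one sentence, in the context of {domain}.")
                for sub in subcategories
            },
            # Nothing is accepted until an expert has reviewed it.
            "reviewed": False,
        }
    return draft


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, returning placeholder text only."""
    if prompt.startswith("List"):
        return "- Alpha\n- Beta"
    return "Placeholder definition."


draft = expand_taxonomy("data analytics projects", ["Example Category"], fake_model)
print(draft)
```

The `reviewed` flag is there to reflect the human-in-the-loop step: the generated subcategories and definitions are only a draft until experts have revised them.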

Also in “ChatGPT and Generative AI for Taxonomy Development,” Marjorie Hlava and Heather Kotula jointly presented on issues with using ChatGPT, both to create taxonomies and in general. They explained the risks involving bias, plagiarism, ethics, data quality, matching the generated taxonomy to the content, and the amplification of errors upon repeating a prompt. With plagiarism, for example, if you ask ChatGPT to return a complete taxonomy on a subject domain, it may return a copyrighted taxonomy that cannot be reused without a license.

Generative AI also affected the topics of other presentations. For example, in the presentation “In Taxonomy We Trust: Building Buy-In for Taxonomy Projects,” Bonnie Griffin mentioned the importance of “continually re-introducing the value of taxonomy, as generative AI captures attention.” It was also the subject of a debate question in the somewhat humorous closing session, “Taxonomy Showdown—Point/Counterpoint With Taxonomy Experts.”

More on Taxonomies and AI

Of course, there is more to AI than just generative AI. Other sessions dealt with machine learning for auto-categorization. These included presentations by Bob Kasenchak and Rachael Maddison in the session “Machine Learning Is Coming for Your Taxonomy” (link to Bob’s slides) and Wytze Vlietstra’s presentation “Vision for Modular Taxonomy Product at Elsevier,” in which the program included “shared infrastructure supported by AI-based decision support tools.” In fact, AI was already a theme of Taxonomy Boot Camp in 2018. It is generative AI based on large language models that is new.

For more details on how this technology may be used for taxonomy development, see my blog post from this spring, “Taxonomies and ChatGPT.” For another perspective on this conference, check out the recent blog post by Taxonomy Boot Camp speaker Mary Katherine Barnes, “Integrating AI: Insights from KMWorld 2023.”
