Showing posts with label Conferences. Show all posts

Wednesday, January 29, 2025

Talking about Taxonomies in India

I was thrilled to bring together my passions for my taxonomy profession, connecting with people, and international travel on my visit to India this month, my first time in this fascinating country.

I travel to speak about taxonomies at conferences and other events. I like to travel: to meet colleagues in this specialized field, in which I don’t have regular in-person interactions, and to see and learn about new places. Usually for me business travel is the primary purpose and seeing new places (museums or a walking tour of parts of a city) is secondary. However, for January 2025, I decided to choose a new country destination, India, primarily as a tourist, and then to add on some professional events.

Why visit India

Heather Hedden at the Taj Mahal

India is now the most populous country in the world, and I have met many Indians living and working in the U.S. and in Europe, especially in technology roles. So, I wanted to understand the country and culture better. India also has a long, rich history and impressive historical structures to visit, tasty food, and different religions and traditions to learn about.

I have many professional connections in India, especially through LinkedIn, more than any other country outside North America and Europe. A few are taxonomists, some have taken my course, some have bought my book, and many have a significant number of shared contacts in my field. I had also made contacts through conferences.

Finally, the use of the English language in professional activities makes it easier for me to participate in events in India: giving presentations and listening to the presentations of others. I cannot simply give a presentation in English in any country.

Multiple presentations and meetings

Taxonomies are relevant to multiple disciplines: library and information science, content and document management, information architecture, knowledge management, and ontologies. To interact with professionals in these different fields, I had to arrange multiple presentations or meetups.

Library and information science students

I have occasionally been asked to give guest lectures on taxonomies to library/information science school classes. Close to two years ago, a graduate student of library and information science in Bengaluru (Bangalore), Soumyakanta Barik, who had read my book, asked if I would give a guest lecture (remote) to his class of master’s degree students, which I did. Afterwards, I informed Soumyakanta that I was thinking of coming to India, so perhaps I might present again in person. Even though Soumyakanta had since graduated, he facilitated the contacts to make such a lecture possible, so I gave an update of my prior presentation “Tidbits of Taxonomies.”

Heather Hedden with LIS master's degree students at the Documentation Research and Training Centre of the Indian Statistical Institute, Bangalore

It turned out that this school of library and information science, the Documentation Research and Training Centre at the Indian Statistical Institute, Bangalore, had been founded by Dr. S. R. Ranganathan, the developer of the first major faceted classification system in the world (whom I mention in my book and in a prior blog post on faceted classification) and the father of library science in India.

Taxonomists and ontologists

On LinkedIn, I had over 25 connections with the keyword “taxonomy” and 15 with “ontology” in their profiles located in Bengaluru, India, so I didn’t want to limit my presentation in that city to just current students. At my request, the Documentation Research and Training Centre organized a second presentation, open to the public, for me to give later the same day. I presented on a slightly more advanced topic, “From Taxonomy to Ontology,” based on a recent presentation that I gave at the Henry Stewart Semantic Data conference. Although the day I chose to present turned out to be a (minor) holiday, I still had a good audience of close to 30 people.

Heather Hedden with Harish Betrabet and Dr. Sanju Tiwari in Noida

While I did not give that presentation again in Delhi, I did meet two ontologists two days later in the Delhi area (Noida), Dr. Sanju Tiwari, who had been involved in the Knowledge Graph Conference, and Harish Betrabet, an ontologist at Bechtel.

Knowledge managers

Taxonomy work often falls under knowledge management, especially in the area of consulting.

Heather Hedden with Soumyakanta Barik and Ved Prakash in Bengaluru

I had noticed that one of my prominent LinkedIn contacts in India (with over 140 shared connections) was a leading knowledge management professional, Ved Prakash. Ved met with me and Soumyakanta for lunch my very first day in India. Ved and I have both been involved in Stan Garfield’s SIKM group of knowledge managers, and Ved invited me to join the KMGN (Knowledge Management Global Network) group on LinkedIn, which he leads. Knowledge management in India is more mature than the smaller field of taxonomies.

Academic librarians

Heather Hedden with Nabi Hasan and others at the Indian Institute of Technology, Delhi

I interact with librarians through my membership in the Special Libraries Association (SLA), which has an active Taxonomy Community. At last year's annual SLA conference at the University of Rhode Island, several academic librarians from India, who have been very involved in SLA, participated in the conference and also celebrated the 25th anniversary of the SLA Asia chapter with an event which I attended. The director of the Central Library of the Indian Institute of Technology, Delhi, Nabi Hasan, invited me to give a presentation, and then organized a full-day “International Workshop on Open Accessing Publishing” at IIT Delhi around my schedule. To tie taxonomies into the theme, I gave a new presentation “Semantic Standards and Methods for Information Linking.” The audience was not familiar with Semantic Web technologies, so I was pleased to present something new to them, which I hope they will take advantage of.

Former SLA president Seema Rampersad (working at the British Library in London) introduced me, at my request, to another library science professor at the University of Rajasthan in Jaipur, with whom I met on short notice the evening I was visiting that city as a tourist, and we discussed the state of library/information science study.

Technical writers and content managers

Heather Hedden presenting at the STC India event in Bengaluru

With the growth of technology industries and applications of technology in other manufacturing sectors in India, there are now many technical writers along with content/document managers. The Society for Technical Communication (STC) (of which I had previously been a member) has an active chapter in India, so I contacted STC India about organizing a speaking event for me, and I was very pleased that the STC volunteers organized events in both Bengaluru and the greater Delhi area (Noida) to fit my schedule.

Heather Hedden and other speakers and organizers of the STC India event in Noida
Each event also included additional speakers. I gave the presentation “Indexes, Search, and Taxonomies: Path to Findability,” which I had presented as an STC webinar (not in a suitable time zone for India) in 2023. Taxonomies and indexing are new concepts to many technical writers, whether in the U.S. or India. (My STC contact, Manisha Sardana, will be happy to arrange an event for other visitors to Delhi who want to give an educational presentation.)

Finally, I even met a freelance indexer, a member of the American Society for Indexing, another organization I have belonged to, who attended the STC event in Noida at my invitation.

Summary

I gave more presentations than I initially intended on this trip, but that is partly due to the fact that taxonomies cross over into multiple fields. I then got to meet more people, build and strengthen relationships, and reflect on the field and applications of taxonomies more. The professional activities took three days, while sightseeing took 10 days of my two-week trip. I hope to add on a professional speaking event on future international tourist trips, although I cannot imagine any other country besides India that would offer so many opportunities.

 

Thursday, December 19, 2024

Ontologies vs. Knowledge Graphs

At the Connected Data London (CDL) conference I attended last week, ontologies were humorously referred to as the “O” word. The thought was that, until recently, experts preferred not to mention “ontology,” lest they alienate their audience, customers, or stakeholders. The word comes across as too technical. It is a term from philosophy, after all, and it does not help that it sounds very similar to “oncology” (as “taxonomy” has been confused with “taxidermy”). The term “knowledge graph,” on the other hand, is more user friendly, and even if it is not perfectly understood, its general meaning can be guessed. Thus, people would refer to knowledge graphs regardless of whether they meant a knowledge graph or an ontology.

At the conference, however, it was discussed that there is a growing acceptance of the word “ontology,” not just among experts but also among varied stakeholders who need to implement them. This was noted by several conference speakers, especially in the wrap-up panel session for the Data Modeling track, which was titled “The ‘O’ Word: How Ontologies Drive Interoperable Data and Business Innovation.” The panel moderator Katariina Kari explained that this recent shift has happened because of LLMs, explaining: “We need a reliable natural language repository. LLMs works on a network of mimicking language, LLMs are primed for language.” So, now use of the word ontology can even help a startup get funding from venture capitalists, she observed.

However, there remains some confusion over what an ontology is. At one end there is the difference between ontologies and taxonomies, and at the other end the difference between ontologies and knowledge graphs. I clarified the distinction between taxonomies and ontologies in a prior blog post, “Taxonomies vs. Ontologies” (January 2023). While knowledge graphs are a relatively new concept, and ontologies have existed for much longer, it is the varied understanding of ontologies that has given rise to confusion.

An ontology is defined as a model of a domain of knowledge, which comprises classes (sets of things), attributes (types of characteristics of things) and relationships between classes. According to this definition, an ontology is a somewhat generic model of a domain, and it does not include all of the individual members or instances of each class (such as the names of individual companies in the class called Company) nor the specific attributes of each attribute type (such as the address of each specific company for the attribute type called Address).

However, the W3C recommendation for ontologies, OWL (Web Ontology Language), includes the designation “individuals,” and ontology software tools, such as Protégé, support the inclusion of individuals and their specific attributes. Thus, it is easy to think that an ontology, by definition, includes all specific individuals. But just because OWL specifies how to include instances of a class, and software supports the inclusion of instances of classes, does not necessarily mean that the instances or individuals are actually a component of an ontology. The ontology experts on this CDL conference panel confirmed that an ontology is the upper-level semantic model.

Then, what do we call an ontology plus all of the individual members (instances) of classes and their specific attributes? That is essentially what a knowledge graph is. This is especially true when individuals are specific to an organization or enterprise, such as names of individual customers, products, employees, etc., and we call that an “enterprise knowledge graph.”
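To make the distinction concrete, here is a minimal sketch in plain Python of an ontology as the generic model and a knowledge graph as that model plus individuals. It is only an illustration under my own assumptions, not how OWL or any ontology tool stores data: the class and method names (Ontology, KnowledgeGraph, add_individual, link), the relationship "makes," and the company and product names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """The generic model: classes, attribute types, and relationships only."""
    classes: set
    attribute_types: dict   # class name -> tuple of attribute type names
    relationships: set      # (subject class, relationship name, object class)


@dataclass
class KnowledgeGraph:
    """An ontology plus individuals, their specific attributes, and their links."""
    ontology: Ontology
    individuals: dict = field(default_factory=dict)  # name -> (class, attributes)
    links: list = field(default_factory=list)        # (subject, relationship, object)

    def add_individual(self, name, cls, **attributes):
        # An individual must be an instance of a class the ontology defines.
        if cls not in self.ontology.classes:
            raise ValueError(f"Unknown class: {cls}")
        # Its attributes must be of types the ontology allows for that class.
        allowed = self.ontology.attribute_types.get(cls, ())
        unknown = set(attributes) - set(allowed)
        if unknown:
            raise ValueError(f"Unknown attribute types for {cls}: {sorted(unknown)}")
        self.individuals[name] = (cls, attributes)

    def link(self, subject, relationship, obj):
        # A link between individuals must follow a relationship between their classes.
        subject_cls = self.individuals[subject][0]
        object_cls = self.individuals[obj][0]
        if (subject_cls, relationship, object_cls) not in self.ontology.relationships:
            raise ValueError(f"{subject_cls} {relationship} {object_cls} is not in the ontology")
        self.links.append((subject, relationship, obj))


# The ontology names the class Company and the attribute type address,
# but no specific company and no specific address.
ontology = Ontology(
    classes={"Company", "Product"},
    attribute_types={"Company": ("address",), "Product": ("launch_year",)},
    relationships={("Company", "makes", "Product")},
)

# The knowledge graph instantiates the ontology with hypothetical individuals.
kg = KnowledgeGraph(ontology)
kg.add_individual("Acme Corp", "Company", address="1 Main Street")
kg.add_individual("Acme Widget", "Product", launch_year=2024)
kg.link("Acme Corp", "makes", "Acme Widget")
```

In this sketch the ontology object never changes as individuals are added, which mirrors the idea that the ontology is the model and the knowledge graph is its instantiation.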

The first applications of ontologies in information/data science were in biomedicine, in which individuals included such things as names of organisms (including bacteria and viruses) and chemicals, etc. Thus, the notion of an individual in science is not quite the same as in business, which has also been a source of confusion over what an individual is and the inclusion of individuals in an ontology. In enterprise knowledge graphs, the instances can be very numerous and specific, including individual “events,” such as interactions or transactions.

In conclusion, an ontology is typically a defining feature and component of a knowledge graph, but it is not all of what goes into a knowledge graph. A knowledge graph also includes individuals, which may be named entity instances or they may be specific taxonomy concepts (abstract things that are not unique named entities, such as the concepts “Data ethics” or “Performance measurement”), and a knowledge graph also includes specific attributes of individuals. It may be said that a knowledge graph is the instantiation of an ontology, and an ontology is the knowledge model. Katariina further explained: “knowledge graphs that actually follow an ontology will have an LLM perform better than just a KG that is unharmonized, not yet adhering to a clear ontology.”

Thursday, October 31, 2024

The Semantic Data Conference

I was honored to be accepted to speak at the first “Semantic Data” conference in New York, a one-day event held on October 23, following the inaugural event held in London on June 27. Semantic Data, organized by Henry Stewart (HS) Events, is co-located with its better-known DAM (Digital Asset Management) conference, which has been running for over 20 years in New York, London, and Los Angeles.

The full name of the conference was “Semantic Data: Taxonomy, Ontology, and Knowledge Graphs,” so the conference was less focused on data than on what you can do with data and content when combined with the semantics of taxonomies and ontologies. There was no presentation dedicated to knowledge graphs this time, given the limited number of sessions in the single-day, one-track event. Less of a focus on knowledge graphs was fine, since the Knowledge Graph Conference, held in New York in May, covers that topic very thoroughly over multiple days. The emphasis on “semantics,” though, is welcome, since there is no conference dedicated to that subject in the United States. (There is the SEMANTiCS conference in Europe, but it is semi-academic.)

 

Presentations at Semantic Data, New York

The topics of the sessions at Semantic Data included: securing taxonomy and ontology strategy buy-in, why and how to connect taxonomies and ontologies, use of MS Copilot in taxonomy development, a use case in leveraging an LLM-based approach for content integration and a consumer-based semantic layer, and how to apply semantic models (taxonomies and ontologies) that reduce biases, especially for machine learning models. The opening keynote by Lulit Tesfaye was on realizing the semantic layer, and the closing keynote by Gary Carlson and Bram Wessel of the lead sponsor, Factor, was on building an organization semantic mindset. Additional sponsored talks were on how ontologies accelerate innovation in the life sciences, as done by the sponsor SciBite, and how semantics enhances modern data platforms, such as that of the sponsor Datavid.

I presented “Taxonomies to Ontologies: How When and Why to Connect or Extend.” I summarized the benefits of taxonomies and ontologies, including what you could or could not do with each alone, but what you could do with both combined. The fact that both taxonomies and ontologies are now based on compatible Semantic Web standards, which are supported by many tools, makes it easy to combine or extend them. Whether you are “combining” a taxonomy with an ontology or “extending” a taxonomy into an ontology depends merely on your starting point and definition of ontology. Now that I am again vendor neutral, I included screenshots from four different commercial tools for combined taxonomy/ontology management.

About the Semantic Data Conference 2024

Semantic Data New York was similar to Semantic Data Europe (London) in its format and organization. Both provided a combination of session types: instructional talks, industry use cases, round table participant discussions, and thought leadership panels. Both events were chaired by Madi Weland Solomon and featured the same keynote presentation by Lulit Tesfaye on the subject of the semantic layer. The rest of the speakers differed between the two events, and each event had different sponsors, based on geographic location. While there were only three sponsors of Semantic Data in New York and only two in London, they shared the same exhibit hall with the main DAM (digital asset management) conference and thus reached a wider audience.

The London and New York events had a similar number of registrants, about 50 each. Although the larger co-located DAM conference had separate registration, some registrants of the DAM conference were also seen in Semantic Data sessions. Registrants of Semantic Data represented diverse industries, including financial services, healthcare, software/technology, media, entertainment, publishing, travel and tourism, education, government, and consulting. Roles were also diverse, including company leadership, project and program managers, IT, and content/DAM/taxonomy/information architecture practitioner roles.

I find that the roles and activities of taxonomists, ontologists, information architects, digital asset managers, etc. overlap, so a conference dedicated to semantics brings them together for knowledge sharing. This way, their projects can also be broadened and shared within their organizations. I hope the Semantic Data conference can grow in the future to fill this need, and I look forward to next year.

Thursday, November 30, 2023

Generative AI at Taxonomy Boot Camp Conference

Generative AI and large language models (LLMs), the technology behind ChatGPT, have been topics of presentations, keynotes, and attendees’ conversations at all the varied conferences I had the fortune to attend this year, including the Taxonomy Boot Camp conference held November 6-7, in Washington, DC. Taxonomy Boot Camp is the only conference dedicated to taxonomies.

Opening and Keynotes

 

Right from the beginning in the opening welcome, the conference chair Stephanie Lemieux mentioned uses of ChatGPT for taxonomy creation, such as asking prompts: What is a category for a following list of terms?, What label for a concept might be better for scientists, or better for parents?, and What are alternative labels for a specific content? It has become clear that generative AI is a tool to assist taxonomists with specific tasks of a project but is not appropriate for automating the entire creation of a taxonomy. Thus, the Taxonomy Boot Camp theme this year, “Humans in the Loop,” was quite apt for the new era of generative AI, even if not specific to it.

 

The Taxonomy Boot Camp opening keynote, “Ontologies in the New Age of AI,” by Dean Allemang, was on this subject. Dean is more of an ontologist than a taxonomist, hence the title, but he discussed both taxonomies and ontologies. Allemang made the statement that generative AI “understands” why we need a taxonomy (even if managers do not). He explained that Schema.org has put RDF on many websites, which ChatGPT “reads.” Allemang has found that ChatGPT also performs perfectly on SPARQL queries, the query language for data, including taxonomies, that is in RDF. Allemang gave ChatGPT query examples, such as “Return all the claims we have by claim number, open date, and close date,” and “What is the total loss of each policy where loss is the sum of loss payment, loss reserve, expense, payment, and expense reserve amount?” Allemang advised taxonomists to identify uses for taxonomies that have not been fully delivered on and use generative AI to deliver on them, and if people argue that generative AI does not understand their language, taxonomists should build in a link to the taxonomy that makes generative AI understand it.

 

On the second day, Taxonomy Boot Camp registrants attend the same shared keynote presentations as all of the KMWorld co-located conferences, and this year these mostly dealt with generative AI, including the opening keynote by Dion Hinchcliffe, “Tech-Driven Enterprise Thrills & Chills: The Future of Work.”


Regular Sessions

In addition to being mentioned in various talks, generative AI was also the subject of a session, “ChatGPT, Taxonomist: Opportunities & Challenges in AI-Assisted Taxonomy Development,” which comprised two separate presentations.

In this session, Xia Lin presented “ChatGPT and Generative AI for Taxonomy Development,” in which he discussed the steps involved in using ChatGPT in two case studies. In one, a taxonomy for data analytics projects of a small business was developed by providing ChatGPT with the scope of the first level of the taxonomy and then asking ChatGPT to expand individual categories by adding subcategories and then to add definitions of terms and categories. The results were reviewed and revised by experts. But Lin did not stop there. He showed the results of asking ChatGPT to provide stakeholder interview questions around a category, and (for those more technically inclined) how to create a ChatGPT plug-in for various defined functions of taxonomy creation, using ChatGPT’s APIs.

Also in “ChatGPT and Generative AI for Taxonomy Development,” Marjorie Hlava and Heather Kotula jointly presented on issues with the use of ChatGPT, both to create taxonomies and in general. They explained the risks of bias, plagiarism, ethics, data quality, matching the generated taxonomy to the content, and the amplification of errors upon repeating a prompt. Regarding plagiarism, for example, if you ask ChatGPT to return a complete taxonomy on a subject domain, it may return a copyrighted taxonomy that cannot be reused without a license.

Generative AI also impacted the topics of other presentations. For example, in the presentation “In Taxonomy We Trust: Building Buy-In for Taxonomy Projects,” Bonnie Griffin mentioned the importance of “continually re-introducing the value of taxonomy, as generative AI captures attention.” It was also the subject of a debate question in the somewhat humorous closing session, “Taxonomy Showdown—Point/Counterpoint With Taxonomy Experts.”

 

More on Taxonomies and AI

Of course, there is more to AI than just generative AI. Other sessions dealt with machine learning for auto-categorization. These included presentations by Bob Kasenchak and Rachael Maddison in the session “Machine Learning Is Coming for Your Taxonomy” (link to Bob’s slides) and Wytze Vlietstra’s presentation of “Vision for Modular Taxonomy Product at Elsevier,” in which the program included “shared infrastructure supported by AI-based decision support tools.” In fact, AI has been a theme of Taxonomy Boot Camp in the past, in 2018. It is generative AI based on large language models that is new.

For some more details on how this technology may be used for taxonomy development, see my prior blog post from this spring, “Taxonomies and ChatGPT.” To get another perspective on this conference, check out the recent blog post by Taxonomy Boot Camp speaker Mary Katherine Barnes, “Integrating AI: Insights from KMWorld 2023.”

Saturday, September 30, 2023

SEMANTiCS Conference 2023: Taxonomies, Knowledge Graphs, and LLMs


The most recent conference I participated in was SEMANTiCS, September 20-22, in Leipzig, Germany. This was the 19th year of this European conference focused on the application of semantic technologies and systems. This was also my fourth year presenting a workshop/tutorial on taxonomies and ontologies at the conference. The widespread value of taxonomies across different areas of specialization is indicated by the fact that taxonomy workshops are repeatedly a part of conferences on various subjects, including semantics, knowledge management, library and information science, information architecture, content strategy, and digital asset management.


Semantics and taxonomies

Semantics means “meaning,” so semantic systems utilize standards to support the encoding of meaning of things/resources and their relations, making the semantics machine-readable. Various standards, guidelines, and data models for semantic systems were developed for what is called the Semantic Web. The Semantic Web goes beyond the simple hyperlinks of the World Wide Web to label shared metadata and specify the kinds of relations. This supports linked data, and the linking of taxonomies to other taxonomies and ontologies and their tagged content or data, which are stored on different servers.


Just as World Wide Web protocols have been adapted within enterprises (“behind the firewall”), so have Semantic Web standards. You don’t have to share your data publicly to reap the benefits of the Semantic Web: open standards to enable the migration of taxonomies and related data between systems, sharing of data with partners, extracting and transforming data from within silos across the enterprise into a standard format, and the ability to link to data on the Web to bring in new content even if not sharing content out on the Web.


Taxonomies, as controlled vocabularies, have always been about concepts, each with a unique understood meaning, not just words or strings of text. So, using taxonomies is using semantics. The Semantic Web standard SKOS (Simple Knowledge Organization System) specifies a data model to make taxonomies and other knowledge organization systems (thesauri, classification systems, etc.) machine-readable and interchangeable on the Web. Semantic Web standards also cover ontologies with RDF-Schema and OWL. By following Semantic Web standards, taxonomies can easily be linked to and extended with ontologies, and then by linking to data stored in a graph database, knowledge graphs can be built.
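As a rough illustration of what “machine-readable” means here, the following plain-Python sketch writes a tiny taxonomy out as SKOS in the Turtle syntax. It is a simplified sketch under stated assumptions, not a full SKOS implementation: the Concept class, the to_turtle function, the example.org namespace, and the concepts and labels themselves are hypothetical, and labels containing quotation marks are not escaped. The SKOS namespace and the properties skos:prefLabel, skos:altLabel, and skos:broader are from the SKOS standard.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
# Hypothetical namespace for this example taxonomy.
EX_PREFIX = "@prefix ex: <http://example.org/taxonomy/> ."


@dataclass
class Concept:
    """A taxonomy concept: one preferred label, optional alternative labels."""
    local_id: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # local ids of broader concepts


def to_turtle(concepts):
    """Serialize concepts as SKOS Turtle (labels are assumed to be English)."""
    lines = [SKOS_PREFIX, EX_PREFIX, ""]
    for concept in concepts:
        statements = ["a skos:Concept", f'skos:prefLabel "{concept.pref_label}"@en']
        statements += [f'skos:altLabel "{alt}"@en' for alt in concept.alt_labels]
        statements += [f"skos:broader ex:{parent}" for parent in concept.broader]
        # One subject, with its statements separated by semicolons.
        lines.append(f"ex:{concept.local_id} " + " ;\n    ".join(statements) + " .")
        lines.append("")
    return "\n".join(lines)


# Hypothetical two-concept taxonomy.
concepts = [
    Concept("dataGovernance", "Data governance"),
    Concept("dataEthics", "Data ethics",
            alt_labels=["Ethics of data"], broader=["dataGovernance"]),
]
turtle = to_turtle(concepts)
print(turtle)
```

Because each concept has its own identifier, separate from its labels, another system can link to ex:dataEthics regardless of which label it displays, which is what makes the taxonomy interchangeable.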


The SEMANTiCS conference

The SEMANTiCS conference is somewhat unique in being semi-academic and semi-industry. It has separate academic track and industry track chairs and additional tutorials and workshops. It’s good to bring academia and industry together in a field like this, where research topics can be applied and partnerships can be developed. The location of the conference varies, and it partners with a local higher education institution for logistical support, with graduate students volunteering to help in exchange for getting access to sessions.


This was the second year that SEMANTiCS combined its conferences with the Language Technology Industry Association, which organized a Language Intelligence track, dealing with technologies for the management of terminology, multilingual content, and machine translation. The conference also includes a one-day track focused on DBpedia, which is not the same first day as the tutorials and workshops. The entire conference lasts three full days, and has a social event one evening, and a dinner on the second evening.  


The conference has industry vendor sponsors, about eight of which were exhibiting, and a few more which did not exhibit. There are also slightly more organizations which are “partners,” including DBpedia, The Alan Turing Institute, and a number of institutes of higher education in Europe which have programs in semantic technologies. Additional organizers include Semantic Web Company, Institut für Angewandte Informatik, and the Vrije Universiteit Amsterdam, representing the three countries where SEMANTiCS has been taking place: Austria, Germany, and the Netherlands.


SEMANTiCS 2023

The 2023 conference was held September 20-22 in Leipzig, Germany, under the leadership of a new chair Sahar Vahdati of Technical University Dresden. There were about 285 participants in person and about one-third as many online. The conference has been hybrid since 2021. There were often six simultaneous sessions. Themed tracks or sessions of multiple speakers included Knowledge Graphs, Reasoning & Recommendation, Natural Language Processing and Large Language Models, Legal & Data Governance, Ontologies Data Management, and Environmental-Social-Governance (ESG). While there was not a life sciences track like last year, there was a themed subject track on cultural heritage. LLMs and ESG were both new topics this year. Poster presentations also covered the range of topics. 


Knowledge graphs are a regular theme at this conference, but this time there was the addition of LLMs. The opening keynote was “Generations of Knowledge Graphs: The Crazy Ideas and the Business” presented by Xin Luna Dong of Meta. She spoke of three generations of knowledge graphs: entity-based knowledge graphs, text-rich knowledge graphs, and dual neural knowledge graphs, using an ontology and LLMs. The second day’s keynote was “Knowledge Graphs in the Age of Large Language Models,” presented by Aidan Hogan of the University of Chile. LLMs and AI topics were also presented in the Knowledge Graphs track, such as in Andreas Blumauer’s talk “Responsible AI and LLMs.” Finally, the moderated closing panel was “Large Language Models and Knowledge Graphs: Status Quo - Risks - Opportunities” with panelists Andreas Blumauer and Jochen Hummel from software vendors and Kristina Podnar, a digital policy consultant, who were not completely in agreement.


In addition to my 3-hour tutorial, “Knowledge Engineering of Taxonomies and Ontologies,” only slightly updated from last year, I also contributed, along with Lutz Krüger, to Andreas Blumauer’s new 3-hour tutorial “The Key to Sustainable Enterprises: ESG, Knowledge Graphs, and Digitalization.” Adopting an ESG program and complying with upcoming ESG directives requires connecting a lot of information and data and aligning it with requirements and disclosure categories, and this is where a knowledge graph can be extremely helpful. Other tutorials and workshops dealt with data spaces, ontology reasoning, healthcare NLP, NLP for knowledge graph construction, and FAIR ontologies.


Past and future

Semantic technologies were very new when the conference was first launched in 2005 by Semantic Web Company, even before launching its product PoolParty Semantic Suite. But it’s never been a vendor product-based conference. The main purpose was and still is to promote the understanding and advancement of semantic technologies. Competitor software vendors sponsor and exhibit, and Semantic Web Company has stepped back from a lead organizational role. The conference is not one where sponsors do business by selling their products or services, but rather one for raising awareness, making and reinforcing partnerships, exchanging ideas, and general networking, including looking for work. It is more of a community conference than anything else, but it is an open, welcoming community, with new people coming every year.


The next SEMANTiCS, celebrating its 20th year, will be September 16 - 18, 2024, in Amsterdam.


Friday, March 31, 2023

Taxonomy and Information Architecture Compared

There is considerable overlap between the fields of information taxonomies and information architecture. Both involve information organization, labeling, search, and findability. In some organizations the job roles and titles are combined. I previously blogged on “Information Architecture and Taxonomies,” observing that “information architecture” in name seemed to be declining while aspects of its practice continued to be strong, since it was an underlying theme in several of the talks at the major taxonomy conference, Taxonomy Boot Camp, in 2013.

Photo of Information Architecture Conference opening: welcome on the screen and a jazz band playing
Information Architecture Conference opening. Photo Marisela Meskus

This week, for the first time, I am attending in person the Information Architecture Conference, being held in New Orleans March 28 - April 1, so it’s been interesting to hear how information architects consider taxonomies.

How Information Architecture and Taxonomy Overlap

The fields of information architecture and taxonomy are related beyond the stated shared practices of information organization, labeling, search, and findability. 

When I give an introduction to taxonomies, I explain that a taxonomy is an intermediary between users and content to connect users to content by means of terms that the users understand and by the display of the terms in hierarchies, facet-filters, or type-ahead suggestions, which enable users to explore and interact with the taxonomy. This is clearly an aspect of information architecture. 

In my own career path, I discovered taxonomy and information architecture at the same time. I had been working as a “controlled vocabulary editor” and had the opportunity to work on an interdisciplinary team for a newly designed information product. The user interface for a school library research database included a hierarchical taxonomy that was designed to fit with that particular user interface.

At the Information Architecture Conference, I asked my session audience for a show of hands of how many had worked with taxonomies, and it seemed to be over 80%. At the conference, I met information architects who specialized in taxonomies, and taxonomists who had an interest in and had done some work in information architecture. Even though I identify as a taxonomist, I already knew a number of speakers at the Information Architecture Conference due to the overlapping communities.

How Information Architecture and Taxonomy Differ

Information architecture is a discipline and a profession that is larger and more established than that of taxonomies. Although taxonomy work is growing, there are still more college courses on information architecture than on taxonomies, more books on information architecture than on taxonomies, and more people with “information architect” than “taxonomist” as a job title (based on LinkedIn searches). 

Listening to sessions at the Information Architecture Conference and having discussions with participants, I began to see a clearer picture of how the fields of information architecture and taxonomies differ.

The Information Architecture Conference brings together a community of professionals who share ideas and experiences. There is no comparable taxonomist community, as taxonomy work, compared to information architecture work, tends to be done by those with different professional backgrounds: information architects, librarians, content managers, metadata architects, indexers, ontologists, etc. It’s telling that there is not just one conference at which I present about taxonomies but multiple. (Knowledge management, content strategy, knowledge graphs, and data science are the fields of conferences at which I have spoken about taxonomies in the past year.) The only conference about taxonomies, Taxonomy Boot Camp, is more of a specialized track within the KM World conference, and aims to provide taxonomy best practices and case studies to managers and directors of content, product, or knowledge management. It is not really a forum for taxonomists to discuss topics of their profession, as the Information Architecture Conference is for information architects.

It seems that information architecture is more of a discipline and a field, whereas taxonomy is more of a tool or system (although a very important one). In addition to information architects in organizations in various industries and consultants, the Information Architecture Conference includes professors and students in the field. By contrast, taxonomy is not a field of study, research, or focus in academia. It is a focus area only in industry and consulting. Information architecture seems to allow more room for theory than does the taxonomy field.

How Information Architecture and Taxonomy Are Related

From a "taxonomic" perspective, which is broader? For information architects, taxonomy is narrower than information architecture. There is no doubt that information architecture is broader in various ways, including content/information organization, design, user experience, and even organization of non-digital information spaces. For example, information architects are concerned not only with taxonomies to support searching and browsing for information, but also with content organization and navigation menu structuring in websites and in software user interfaces. 

Taxonomists, on the other hand, do not consider taxonomies as a sub-field of information architecture, but rather consider the two fields as adjacent and closely related. This is because the taxonomies that information architects create tend to be small, such as term lists for metadata properties or facets or as hierarchies to model menu navigation or site maps. Professional taxonomists tend to work on large dynamic taxonomies or thesauri that are used to tag/index and retrieve content or data in one or more systems, often where the user interface is already prescribed.

The related fields or disciplines are also different. Information architecture has a closer relationship with the fields of design, user experience, sociology, and psychology. Taxonomy has a closer relationship with indexing/tagging, natural language processing, ontologies, Semantic Web technologies, and knowledge management. One related field shared by both information architecture and taxonomy is structured content, which was also a subject of presentations at this year's Information Architecture Conference and is the field of my next conference.


Sunday, November 27, 2022

Taxonomies to Bridge Silos

There is increasing interest in organizations to “break down silos” of content and data. Silos may be different software applications, distinct web or intranet content, or merely different computer drives and folders. The goal is to enable search and retrieval across content that is stored in different content/document management systems and shared folders and the analysis and comparison of data stored in different kinds of database management systems, records management systems, and spreadsheets. This results in better, more complete information to enable more informed decisions and knowledge discovery, along with improved user satisfaction, while also saving time. Breaking down or bridging such silos was a theme of my two most recent conferences.

 

LavaCon: Connecting Content Silos

The 20th annual LavaCon conference on content strategy, held October 23-26 in New Orleans, had the theme this year of “Connecting content silos across the Enterprise.” The conference had a number of presentations tied to the theme, 10 of which had “silos” in their titles. Two presentations I especially enjoyed were by leading content strategy consultants about how to connect silos.

Sarah O’Keefe of Scriptorium, in her presentation “From Silo Busting to CaaStle Building,” with a fairy tale castle metaphor, explained that completely unified content cannot be achieved, because CMSs are tuned to specific content domains, corporate websites accommodate different goals of different groups, content silos have their own delivery pipelines, and silos often match the organizational structure. Her solution was to provide Content as a Service (CaaS), or a “CaaStle in the cloud(s).” Silos are kept, allowing for unique requirements, and perhaps reduced in number, but are connected where needed.

Val Swisher of Content Rules, in her presentation “Creating a Unified (Siloed) Content Experience: The Importance of Terminology and Taxonomy,” explained that siloed content results in different user experiences for each silo. But silos are not going away, because there is no single toolset, particular content has its owners, and certain content may be considered special. Therefore, the user experience should be improved to “ensure that all content looks like it comes from the same company” and to “eliminate the confusion that users experience when they consume content created by various silos.” This is done by standardizing the content, the search, page layout, navigation, content types, terminology, and taxonomy.

At LavaCon, I presented a pre-conference workshop with the title “Using Taxonomies and Tagging to Connect Content Across the Enterprise.” While most of my workshop addressed the general principles and best practices for taxonomy creation, along with the basics of tagging, I did discuss how a centrally managed taxonomy, external from but linked to various content management systems and other applications or repositories of content, can bridge silos. Taxonomy management software positioned as “middleware,” such as PoolParty, connects to these different content applications and repositories, and then the taxonomy is presented to the user in a single user interface.

Taxonomy Boot Camp: Taxonomy Breaking Down Silos

At the annual Taxonomy Boot Camp conference, held November 7-8 in Washington, DC, and co-located with the KM World conference, I spoke in a two-presentation session titled “Taxonomy Breaking Down Silos.” The idea is that taxonomies provide the connections to break down barriers between different systems and teams. I presented on taxonomy linking jointly with Donna Popky, Senior Taxonomy & Information Architecture Specialist, Harvard Business School. I explained the principles of taxonomy project linking, and Donna presented a case study of taxonomy linking using a hub and spoke method to link separate taxonomies managed by different business units with separate content repositories for different purposes at Harvard Business School. So, this was a case of creating a hub taxonomy linked to the various business unit spoke taxonomies.
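To illustrate the hub-and-spoke idea in general terms, here is a minimal Python sketch. It is not the Harvard Business School implementation: the spoke names, terms, and hub identifiers are all hypothetical, and the mapping is assumed to be a simple one-to-one link from each spoke term to a hub concept.

```python
from collections import defaultdict

# Hypothetical mappings: (spoke taxonomy, spoke term) -> hub concept ID.
# Each business unit keeps its own wording; only the link to the hub is shared.
SPOKE_TO_HUB = {
    ("marketing", "Leadership Development"): "hub:leadership",
    ("support", "Leading Teams"): "hub:leadership",
    ("library", "Leadership"): "hub:leadership",
    ("marketing", "Corporate Finance"): "hub:finance",
    ("library", "Finance"): "hub:finance",
}


def invert(spoke_to_hub):
    """Group spoke terms by the hub concept they are linked to."""
    hub_to_spokes = defaultdict(set)
    for spoke_term, hub in spoke_to_hub.items():
        hub_to_spokes[hub].add(spoke_term)
    return hub_to_spokes


def equivalents(spoke, term, spoke_to_hub=SPOKE_TO_HUB):
    """Return the terms in other spokes that share a hub concept with the given term."""
    hub = spoke_to_hub.get((spoke, term))
    if hub is None:
        return []  # the term has not been linked to the hub
    hub_to_spokes = invert(spoke_to_hub)
    return sorted(t for t in hub_to_spokes[hub] if t != (spoke, term))


if __name__ == "__main__":
    for other_spoke, other_term in equivalents("library", "Leadership"):
        print(other_spoke, "->", other_term)
```

The appeal of this design is that each spoke taxonomy only has to be mapped once, to the hub, rather than to every other spoke taxonomy separately.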

The other speaker in the session, Rachael Maddison, Content Infrastructure Architect & Taxonomy Product Manager for Adobe Digital Media Experience and Engagement, presented on taxonomy adoption across corporate silos and not merely content silos. Collaboration plays a role in wider taxonomy adoption, and as Rachael stated: “Mapping or merging can’t happen until there is stakeholder buy-in.”

Over the years, my list of the benefits of taxonomies has grown. Linking data silos, content silos, and corporate silos is an additional benefit. This can be done with a single enterprise taxonomy or with multiple linked taxonomies. In either case, the taxonomy needs to be managed externally from any individual siloed application, in a dedicated taxonomy management system. Taxonomies can then break down corporate silos and connect content and data silos.

Friday, September 30, 2022

Taxonomies and Semantics

How are taxonomies related to “semantics”? I considered this question, as the latest conference I participated in was SEMANTiCS, the European conference of semantic technologies, which took place this year in Vienna, Austria, September 13 - 15. Topics presented and discussed in this conference included ontologies, knowledge graphs, semantic models and reasoning, linked open data, machine learning, natural language processing, and other language technologies. Yet taxonomies were also discussed in a number of presentations. In contrast to a conference dedicated to taxonomies, such as Taxonomy Boot Camp, where taxonomies are the focus, at SEMANTiCS, in the context of semantic technologies, taxonomies are a component or an underlying layer in the application of semantic technologies.


Semantics means “meaning.” Like the words “taxonomy” and “ontology,” it has a traditional meaning that is more academic and, in the case of semantics and ontology, also connected to philosophy, but it also has a modern meaning that deals with information science and knowledge management. For example, “semantic search” means searching for concepts and ideas, not merely matching search strings of text. Thus, a taxonomy or thesaurus supports semantic search by comprising unambiguous concepts of “things, not strings” of text.
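The “things, not strings” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concept IDs, labels, and document tags below are hypothetical, and real semantic search systems do far more than exact label lookup.

```python
# Hypothetical mini-thesaurus: each concept has one ID, one preferred label,
# and any number of alternative labels (synonyms).
CONCEPTS = {
    "c1": {"pref": "Automobiles", "alt": ["Cars", "Motor cars"]},
    "c2": {"pref": "Mercury (planet)", "alt": []},
    "c3": {"pref": "Mercury (element)", "alt": ["Quicksilver"]},
}

# Documents are tagged with concept IDs rather than with raw words.
DOCUMENTS = {
    "doc-a": {"c1"},
    "doc-b": {"c3"},
    "doc-c": {"c2"},
}


def build_label_index(concepts):
    """Map every label, preferred or alternative, to its concept ID."""
    index = {}
    for concept_id, concept in concepts.items():
        for label in [concept["pref"], *concept["alt"]]:
            index[label.lower()] = concept_id
    return index


def concept_search(query, concepts=CONCEPTS, documents=DOCUMENTS):
    """Resolve the query to a concept, then return documents tagged with it."""
    index = build_label_index(concepts)
    concept_id = index.get(query.strip().lower())
    if concept_id is None:
        return []  # no concept carries this label
    return sorted(doc for doc, tags in documents.items() if concept_id in tags)


if __name__ == "__main__":
    # "cars" retrieves a document tagged "Automobiles" even though the
    # query string is not the preferred label.
    print(concept_search("cars"))
```

Because the two “Mercury” concepts have separate IDs, a document about the planet is never retrieved for the element, which is the disambiguation that plain string matching cannot provide.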


Semantics also implies the Semantic Web, with technology that complies with the Semantic Web standards that have been developed by the World Wide Web Consortium (W3C). The Semantic Web, also known as Web 3.0, is not a component of the World Wide Web nor a different web, but rather a kind of extension of the web to include not merely content and simple hyperlinks, but also all kinds of data that are semantically linked (where the links/relationships also have meaning). The Semantic Web allows more complex data, and data stored and organized in graph databases, to be machine-readable. This could be either on the public web or within an organization that follows Semantic Web standards for managing its data and content.


Taxonomies were mentioned in a number of other presentations as a given foundation to ontologies, semantic networks, or knowledge graphs. For example, taxonomies and ontologies were the basis of a knowledge-based recommendation system, described by Andreas Blumauer in his presentation on that subject. In her talk “Real World Case Studies: Five Success Factors to Implementing an Enterprise Data Fabric,” Lulit Tesfaye explained that the components of a data fabric are metadata, taxonomy, ontology, knowledge graph, connections and integrations, and front-end applications.


A session titled Taxonomies included a talk on “Taxonomy and Terminology,” which compared and contrasted taxonomies and terminologies with respect to their kinds of terms and purposes, but also explained the semantic role of taxonomies. The presenter, Klaus Fleischmann, said that terminologies guide content creators, ensuring consistent, correct use of language company-wide, whereas taxonomies provide a semantic layer on top of content and metadata, often for semantic applications. Fleischmann also explained that taxonomies can be extended to ontologies or, in his words, taxonomies “modeled relationships via ontologies.” Also speaking in the Taxonomies session, Nimit Mehta, whose presentation was titled “The Semantic Data Stack - A user story on building a data fabric,” described taxonomies as “A layer between your data and your business applications” and a “governance layer.”


Finally, I presented a taxonomy-related tutorial, although not on taxonomy creation alone, but rather titled “Knowledge Engineering of Taxonomies, Thesauri, and Ontologies,” in which I explained that taxonomies and ontologies are not so much distinct knowledge organization systems, but rather that an ontology is a semantic layer that is applied to and extends a taxonomy, giving it a greater degree of semantics.


I hope to participate in the next SEMANTiCS conference in September 2023 in Leipzig, Germany.

Sunday, July 31, 2022

Taxonomy Challenges Discussed at SLA Conference

When it comes to conferences dealing with the subject of taxonomy creation, implementation, and maintenance, without a doubt Taxonomy Boot Camp and Taxonomy Boot Camp London are by far the best conferences for their content, speakers, and networking opportunities. However, there are other conferences that have sessions on taxonomies. 

The annual conference of the Special Libraries Association (SLA) usually has multiple taxonomy-related sessions. This year, July 31 - August 2 in Charlotte, NC, the first in-person conference in three years, was no exception.

Thanks to the volunteer programming efforts of SLA’s Taxonomy Community (one of over 20 specialized topic groups, formerly called “Divisions”), the annual conference is able to include multiple taxonomy sessions, some of which bring together multiple speakers, either co-presenting a single talk or coming together as a panel. Even sessions not organized by the Taxonomy Community may include taxonomy topics, such as those dealing with knowledge management, information architecture, or research that uses a taxonomy. A Taxonomy Community networking event is also regularly part of the SLA conference.


This year’s conference is hybrid, so some of the taxonomy sessions are in-person, and some are pre-recorded and available on-demand.  Live-streaming was also done for keynotes and some sessions. The following are the in-person taxonomy sessions at the SLA 2022 conference:

  • “The Role of DEI in Taxonomy Development, Maintenance, Search, and Retrieval,” presented by Marisa Hughes. (This presentation on a popular topic was additionally live-streamed and pre-recorded for on-demand viewing.)
  • “Current Challenges and Advanced Taxonomy Topics” panel comprising Marisa Hughes, Heather Kotula, John Bertland, and myself.
  • “Research Sources and Methodologies for Taxonomy Development,” jointly presented by Marisa Hughes and myself.
The following are pre-recorded, on-demand only taxonomy sessions:
  • “There ain’t no Sanity Clause: Taxonomy and Data Analysis” presented by Michele Lamorte
  • “Metadata Governance” presented by John Horodyski


Conference session on diversity, equity, and inclusion in taxonomies

Diversity, Equity & Inclusion (DEI) is a growing area of interest in information management/sharing and content creation. Marisa Hughes, the taxonomist who edits the APA Thesaurus of Psychological Index Terms, explained the challenges of revising the thesaurus terms to reflect DEI, for which she gave the following definitions:

  • Diversity: “The vast range of differences among individuals and groups.”
  • Equity: “The contain of being fair and impartial”
  • Inclusion: “Welcoming and respecting diverse individuals and groups. Diversity in practice.”

She has been reviewing thousands of terms for accuracy, currency, inclusivity, and avoidance of bias, stereotypes, or discrimination. Areas that this DEI review has focused on are:

  • Racial, ethnic, and cultural identity
  • Gender diversity and sexual orientation
  • Age, disability status, and socioeconomic class bias

In the area of disability status, for example, the term should focus on the person and not the disability. Thus, “Hearing impaired” is changed to “Person with hearing loss,” and “Mentally ill” is changed to “Individual with a mental illness.”

Marisa Hughes presenting “The Role of DEI in Taxonomy

Additional challenges include hierarchical relationships, term usage, and change management. If users can see hierarchical relationships, even if not the full hierarchy, these relationships need to be appropriate. For example, certain personal conditions and behaviors should not be narrower terms of the term “Disorders.” Term frequency of usage (also called “literary warrant”) is important, but the larger goal is to have respectful terms. Change management involves taking care that the term changes do not impact search and retrieval. Marisa oversees the large job of reindexing content with new terms and adding change notes or history notes to changed terms.
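As a rough illustration of this kind of change management, the Python sketch below renames a term while keeping the old label searchable and recording a history note. It is not APA's actual workflow or tooling: the `Term` class, the function names, and the date are hypothetical, and the fields are only loosely modeled on the preferred labels, alternative labels, and history notes found in thesaurus standards such as SKOS.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A hypothetical thesaurus term record."""
    pref_label: str
    alt_labels: list = field(default_factory=list)
    history_notes: list = field(default_factory=list)


def change_preferred_label(term, new_label, date, keep_old_as_alt=True):
    """Rename a term and record the change.

    keep_old_as_alt is a judgment call: retaining the old label as a
    non-preferred term preserves retrieval, but it may be dropped if the
    old label is considered too offensive to keep.
    """
    old_label = term.pref_label
    if new_label == old_label:
        return term  # nothing to change
    term.pref_label = new_label
    if new_label in term.alt_labels:
        term.alt_labels.remove(new_label)  # a label cannot be both preferred and alternative
    if keep_old_as_alt and old_label not in term.alt_labels:
        term.alt_labels.append(old_label)
    term.history_notes.append(
        f"{date}: preferred label changed from '{old_label}' to '{new_label}'."
    )
    return term


def matches(term, query):
    """True if the query equals the preferred label or any alternative label."""
    q = query.strip().lower()
    return q == term.pref_label.lower() or q in (a.lower() for a in term.alt_labels)


if __name__ == "__main__":
    term = Term("Hearing impaired")
    change_preferred_label(term, "Person with hearing loss", "2022-01-01")  # hypothetical date
    print(term.pref_label, term.alt_labels, term.history_notes)
```

Content already indexed with the old label would still need to be reindexed separately; the sketch only covers the vocabulary record itself.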


Conference panel on current taxonomy challenges

In this session, the four panelists each gave brief opening talks, then were asked questions by the moderator, Judith Theodori, and then it was opened up for general Q&A and discussion with the audience.


I presented on the themes of challenges which came from 138 taxonomist survey responses to the question "What are the pain points or challenges in your taxonomy work?" The leading trends in the responses were:

  1. Achieving stakeholder understanding and buy-in
  2. Competing interests, expectations, and requests
  3. Organizational challenges
  4. Tools and technology that are inadequate or not integrated

John Bertland, Digital Librarian and Content Specialist at the Presidio Trust, spoke of the taxonomy challenges in his organization, including governance at a time of organizational change, and funding. A specific challenge is expanding and adapting a taxonomy that was originally just for digital asset management to include the content of the intranet.

“Current Taxonomy Challenges” panelists Marisa Hughes, John Bertland, Heather Kotula, and Heather Hedden

Marisa Hughes, Taxonomist at the American Psychological Association, related the challenge of having to quickly come up with all the COVID-related taxonomy terms in time for the usual thesaurus update scheduled in April 2020. This involved a lot of research on literature that was still rather lacking on the subject.


Another challenging project was to determine the role of historical data in the vocabulary of 3500 terms for the period of 1967 to 1973, which involved removing offensive terms. It was a judgement call as to whether to continue to use a potentially offensive term as a non-preferred term (alternative label) or not. Heather Kotula, VP, Marketing and Communications of Access Innovations, Inc., the fourth panelist, also discussed the same subject of excluding pejorative terms, referred to as “semantic censorship.” In the end it was concluded that often pejorative terms are actually not that much in use in the documents being tagged.