Artificial intelligence (AI) is not new, but it is becoming
more ubiquitous, and its applications are growing within other specializations in
information management, knowledge management, and content management, including
taxonomies. Hence the theme for this year’s Taxonomy Boot Camp conference (November
5-6, 2018, Washington, DC) was “Bridging Human Thinking and Machine Learning.”
This was the 14th Taxonomy Boot Camp conference
and its 9th year in Washington, DC. Along with the newer Taxonomy Boot
Camp London, it is the only conference dedicated to taxonomies. As usual, it was held
along with several other co-located conferences of Information Today Inc.,
which overlap or run consecutively. The format, as in past years, involved an
opening keynote, after which the conference broke into two tracks of sessions
on the first day, one more basic and one more advanced, then on the second day a
joint keynote with the KMWorld conference, and a single track for the rest of the
second day. By a show of hands, it appeared that 75% of the Taxonomy Boot Camp
attendees were first-timers, even more than before. There were 235
attendees, including speakers and sponsors.
While the conference had two tracks on the first day, a more
basic and a more advanced track, presentations on machine learning and AI were
in both tracks. These included “Taxonomy & Machine Learning at the Knot,”
“Sandwiches, Categories, Ethics & Machine Learning,” “Taxonomy Skills in the
World of AI” (a panel), “Semantic AI: Fusing Machine Learning with Knowledge
Graphs,” “Semantic Search Enrichment,” “Taxonomies and AI Chat Boxes,”
“Taxonomy in the Age of Amazon Echo,” and “Applying Taxonomy Skills to
Cognitive Computing” (a project involving the IBM Watson data privacy research
product of Thomson Reuters).
In “Semantic
AI: Fusing Machine Learning with Knowledge Graphs,” presenter Andreas
Blumauer of the Semantic Web Company said that increasingly companies are
adopting knowledge graphs as their IT infrastructure, and leading players are
trying to fuse knowledge graphs with machine learning. A knowledge graph has to
be stored in a graph database. There are two types of graph database models:
property graphs and RDF graphs. RDF graphs are more important for knowledge
graphs.
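To make the RDF model a little more concrete, here is a minimal sketch in plain Python. It is not taken from the presentation and does not use any RDF library: it only assumes that an RDF graph can be viewed as a set of subject-predicate-object triples and that a query is a pattern with wildcards. The `ex:` concept names and the `match` helper are hypothetical; `skos:broader` and `skos:prefLabel` are properties from the SKOS vocabulary commonly used for taxonomies.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical example data; the ex: names are illustrative only.
TRIPLES: Set[Triple] = {
    ("ex:Taxonomy", "skos:broader", "ex:KnowledgeOrganizationSystem"),
    ("ex:Thesaurus", "skos:broader", "ex:KnowledgeOrganizationSystem"),
    ("ex:Taxonomy", "skos:prefLabel", "Taxonomy"),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which concepts sit directly under the (hypothetical) top concept?
narrower = [
    subj
    for subj, _, _ in match(
        TRIPLES, p="skos:broader", o="ex:KnowledgeOrganizationSystem"
    )
]
print(narrower)
```

A real deployment would store the triples in a graph database and query them with SPARQL; the sketch only shows the shape of the data.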
Semantic AI core principles include the following.
• It’s about things, not strings.
• It’s more than metadata: it describes the meaning of metadata as an
additional, semantic layer.
• The knowledge graph establishes the semantic layer.
• Knowledge graphs can be seen as an input for machine learning.
• AI isn’t always good at understanding questions, so a taxonomy/ontology is
needed to support it.
• AI should be built upon data quality, data as a service, no black boxes, and
a hybrid approach in which structured data meets text, aiming towards
self-optimizing machines (a vision, as we are not there yet).
Use cases of knowledge graphs include recommendation
engines. A knowledge graph is the basis of a recommendation engine
that provides content while taking users into consideration.
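As a rough illustration of how a knowledge graph could sit behind a recommendation engine, the following Python sketch scores unread content by how many concepts it shares with a user's interests, after expanding both sides one level up the hierarchy. This is an assumption-laden toy, not the approach shown in the talk: the `BROADER` and `TAGS` data, the item names, and the `expand` and `recommend` functions are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical concept hierarchy: narrower concept -> broader concept.
BROADER: Dict[str, str] = {
    "ex:WeddingCake": "ex:Catering",
    "ex:Buffet": "ex:Catering",
}

# Hypothetical content items tagged with concepts from the graph.
TAGS: Dict[str, Set[str]] = {
    "article-1": {"ex:WeddingCake"},
    "article-2": {"ex:Buffet"},
    "article-3": {"ex:Photography"},
}


def expand(concepts: Set[str]) -> Set[str]:
    """Add the broader concept (one level up) for each concept that has one."""
    return set(concepts) | {BROADER[c] for c in concepts if c in BROADER}


def recommend(interests: Set[str], already_read: Set[str]) -> List[str]:
    """Rank unread items by the number of concepts shared with the user."""
    profile = expand(interests)
    scores: Dict[str, int] = {}
    for item, tags in TAGS.items():
        if item in already_read:
            continue
        overlap = len(profile & expand(tags))
        if overlap:
            scores[item] = overlap
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend({"ex:WeddingCake"}, already_read={"article-1"}))
```

Here the user never expressed interest in buffets, but the shared broader concept links the two articles, which is the kind of inference a plain keyword match would miss.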
In “Taxonomy & Machine Learning at the Knot,” the presenters, from the web media
company XO Group, began with a good introduction to machine learning. They
explained the problems it can solve (predicting behavior, automating tedious
steps, and classifying) and its two types, supervised and unsupervised. Common
applications include clustering, recommendations, and classification, and each
of these can involve taxonomies. Specific implementation examples were provided.
As with last year, there was also a lot of talk of
auto-categorization (automated or machine-aided indexing) across various
sessions. Three were dedicated to the subject: “Driving Discovery: Combining
Taxonomy & Textual AI at Sage” (a case study using Expert System
auto-categorization), “Testing for Auto-tagging Success,” and “Classification
Relevance at Associated Press.” AP has an
automated rules-based classification system for Subjects, Geography, and
Organizations. Rules-based auto-classification was chosen over the statistical
method for several reasons: it offers transparency and control; breaking news
and low-frequency terms can be dealt with (no existing training set is needed);
terms can be scoped and disambiguated better, such as incident-type terms
(Violent crime) vs. issue terms (Domestic violence); and semantic rules ensure
that there is not just a passing mention. Entity extraction with disambiguation
rules is used for person names and publicly traded companies.
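The general idea of rules-based classification can be sketched in a few lines of Python. This is not AP's system, whose rules were not shown in detail; it is a hypothetical illustration in which each taxonomy term has evidence patterns, a minimum number of hits (so that a passing mention is not enough), and optional exclusion patterns for disambiguation. The `Rule` class, the patterns, and the thresholds are all assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    term: str                      # taxonomy term to assign
    patterns: List[str]            # regexes that count as evidence
    min_hits: int = 2              # require more than a passing mention
    exclude: List[str] = field(default_factory=list)  # context that blocks the term


# Hypothetical rules, loosely modeled on the incident vs. issue distinction.
RULES = [
    Rule(
        term="Violent crime",
        patterns=[r"\bshooting\b", r"\bstabb\w+", r"\bassault\w*"],
        exclude=[r"\bvideo game\b"],
    ),
    Rule(
        term="Domestic violence",
        patterns=[r"\bdomestic (?:violence|abuse)\b"],
    ),
]


def classify(text: str, rules: List[Rule] = RULES) -> List[str]:
    """Return the terms whose rules fire on the text, in rule order."""
    lowered = text.lower()
    assigned = []
    for rule in rules:
        # Disambiguation: skip the term if a blocking context is present.
        if any(re.search(p, lowered) for p in rule.exclude):
            continue
        hits = sum(len(re.findall(p, lowered)) for p in rule.patterns)
        if hits >= rule.min_hits:
            assigned.append(rule.term)
    return assigned


incident = ("Police said the shooting followed an earlier assault. "
            "A second shooting was reported nearby.")
passing = "The mayor, who once covered a shooting as a reporter, opened the library."
print(classify(incident))
print(classify(passing))
```

Because every decision traces back to a named rule, an editor can see why a term was or was not applied and adjust the rule directly, which is the transparency and control cited above.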
Knowledge graphs are getting more attention both here and at
Taxonomy Boot Camp London. This was, of course, the main topic of
Andreas Blumauer’s talk “Semantic AI: Fusing Machine Learning with Knowledge
Graphs,” and Mike Doane, in the introduction of his talk on “Taxonomy in the
Age of Amazon Echo,” said that
the information industry analysis firm Gartner reports that knowledge graphs
are on the rise and are discussed more than taxonomies. Gartner is tracking
knowledge graphs instead of taxonomies and ontologies.
While the opening keynote did not focus on AI or machine
learning, it was a presentation by a computational linguist, Deborah McGuinness,
a professor of Computer, Cognitive, and Web Sciences at Rensselaer Polytechnic
Institute. Among other things, she spoke of the data life cycle, whereby a
computer-understandable specification of meaning (semantics) supports enhanced
lifespan and impact of data. She went on to include specific ontology case examples.
Nearly all session slides, except the keynotes, are available to download
without any login credentials at: http://www.taxonomybootcamp.com/2018/Presentations.aspx