Presentation
This workshop is being held as part of the Knowledge Graph Conference taking place May 2-5, 2022 in New York City, USA. The workshop is an online event on May 2 from 9:00am to 12:00pm EST (180 minutes in total).
Over the last decade, the international community has taken an interest in the processes of publishing and refining Knowledge Graphs (KGs) through initiatives such as the Web of Data. A multitude of methods and systems have been developed to address issues related to the acquisition, publication and exploitation of KGs. In particular, considerable progress has been made in the construction and enrichment of KGs: ontology matching, data integration, fact prediction and validation. These advances build on techniques developed in the fields of knowledge representation, reasoning and machine learning. As a result, more and more industrial applications are now able to produce and process KGs. However, the KGs produced are complex and evolving, which raises several new issues.
In this workshop, we seek contributions describing methods and use cases that rely on the application of reasoning and machine learning to complex, uncertain and evolving knowledge graphs. These contributions can be applied to different domains such as smart cities, smart health, smart farming, digital humanities and the automation of business processes.
The workshop will cover topics around different types of reasoning and complex data treatment:
- hybridization of symbolic and machine learning methods
- spatial reasoning
- temporal reasoning
- reasoning in constrained environments
- reasoning with uncertainty
- reasoning with incompleteness
- reasoning on large volumes of data
- reasoning under open / closed world assumptions
- distributed reasoning
- reasoning for privacy (data access policy, anonymization, rewriting requests, etc.)
- etc.
Organizing committee
- Nathalie HERNANDEZ, IRIT, UT2J : hernande(at)irit.fr
- Catherine ROUSSEY, INRAE : catherine.roussey(at)inrae.fr
- Fatiha SAIS, LISN, INS2I, Université Paris Saclay : sais(at)lri.fr
Program
2 May 9:00am to 12:00pm EST
- 9:05 - Freddy Lecue, Explaining Deep Neural Networks: The Good, the Bad and the Ugly Slides
- The future of AI lies in enabling people to collaborate with machines to solve complex problems. Like any efficient collaboration, this requires good communication, trust, clarity and understanding. XAI (eXplainable AI) aims at addressing such challenges by combining the best of symbolic AI and traditional Machine Learning. This topic has been studied for years by the different communities of AI, with different definitions, evaluation metrics, motivations and results. This presentation is a snapshot of the work on XAI to date, focusing on Machine Learning and symbolic AI based approaches, while emphasizing the Good, the Bad and the Ugly of computing and expressing explanations for a human.
- Freddy Lecue is the Chief Artificial Intelligence (AI) Scientist at CortAIx (Centre of Research & Technology in Artificial Intelligence eXpertise) at Thales in Montreal, Canada. He is also a research associate at INRIA, in WIMMICS, Sophia Antipolis, France. Before joining the new R&T lab of Thales dedicated to AI, he was AI R&D Lead at Accenture Labs in Ireland from 2016 to 2018. Prior to joining Accenture, he was a research scientist and lead investigator in large-scale reasoning systems at IBM Research from 2011 to 2016, a research fellow at The University of Manchester from 2008 to 2011, and a research engineer at Orange Labs from 2005 to 2008. His research area is at the frontier of intelligent / learning / reasoning systems and the Internet of Things. He has a strong interest in Explainable AI, i.e., AI systems, models and results which can be explained to humans and business experts.
- 9:35 - Pierre Monnin, Discovering alignment relations with Graph Convolutional Networks: a biomedical case study Slides
- Knowledge graphs are freely aggregated, published, and edited in the Web of data, and thus may overlap. Hence, a key task resides in aligning (or matching) their content. This task encompasses the identification, within an aggregated knowledge graph, of nodes that are equivalent, more specific, or weakly related. In this work, we propose to match nodes within a knowledge graph by (i) learning node embeddings with Graph Convolutional Networks such that similar nodes have low distances in the embedding space, and (ii) clustering nodes based on their embeddings, in order to suggest alignment relations between nodes of the same cluster. We conducted experiments with this approach on the real-world application of aligning knowledge in the field of pharmacogenomics, which motivated our study. We particularly investigated the interplay between domain knowledge and GCN models with the two following focuses. First, we applied inference rules associated with domain knowledge, independently or combined, before learning node embeddings, and we measured the improvements in matching results. Second, while our GCN model is agnostic to the exact alignment relations (e.g., equivalence, weak similarity), we observed that distances in the embedding space are coherent with the "strength" of these different relations (e.g., smaller distances for equivalences), letting us consider clustering and distances in the embedding space as a means to suggest alignment relations in our case study.
- Pierre Monnin is a researcher with the Orange Innovation/Data-AI entity of Orange. He holds a Ph.D. from the University of Lorraine, where he worked on extracting, comparing, and mining knowledge in the biomedical domain of pharmacogenomics in the context of the ANR PractiKPharma project. His Ph.D. work was awarded the "2022 best thesis award" from the French association EGC (Extraction et Gestion des Connaissances - Knowledge Extraction and Management). Pierre's research broadly focuses on knowledge extraction, matching, and mining to build and refine knowledge graphs. He is particularly interested in uncertain knowledge management, graph embedding techniques for knowledge graphs, and hybrid approaches that combine symbolic and numeric methods. He was General co-chair of ALGOS 2020 and is Proceedings & Metadata co-chair of ISWC 2022.
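The two-step pipeline in the abstract above — embed nodes so that similar ones are close, then cluster by distance to suggest alignments — can be sketched with toy, hand-picked vectors standing in for GCN output (all node names, vectors, and the threshold below are hypothetical, not from the talk):

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def suggest_alignments(embeddings, threshold):
    """Single-linkage clustering: merge any two nodes whose distance is
    below `threshold`, then suggest alignment pairs within each cluster."""
    nodes = list(embeddings)
    parent = {n: n for n in nodes}   # union-find over nodes

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if euclidean(embeddings[a], embeddings[b]) < threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    pairs = sorted(tuple(sorted((a, b)))
                   for members in clusters.values()
                   for i, a in enumerate(sorted(members))
                   for b in sorted(members)[i + 1:])
    return list(clusters.values()), pairs

# Toy 2-D "GCN" embeddings: equivalent pharmacogenomic nodes end up close.
toy = {
    "kg1:warfarin": (0.0, 0.0),
    "kg2:warfarin": (0.1, 0.0),   # near-duplicate of the node above
    "kg1:CYP2C9":   (5.0, 5.0),
}
clusters, pairs = suggest_alignments(toy, threshold=0.5)
```

A real system would learn the embeddings and calibrate the threshold (or use distances to grade the "strength" of the relation, as the abstract suggests); the clustering step itself is this simple.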
- 10:05 - Elodie Thiéblin, Learning Transformation Rules Between Bibliographical Formats Using Genetic Programming Slides
- We present the preliminary results of a study commissioned by the French National Library (BnF). The BnF is migrating its catalogue data from the Intermarc bibliographic format (similar to UNIMARC) to Intermarc-NG with manually created rules. To keep their data interoperable with applications which can, for now, only deal with Intermarc data, they would like to automatically learn the inverse transformation (Intermarc-NG to Intermarc). Since the catalogue data has not yet been entirely migrated, the study focused on learning transformation rules from Intermarc to Dublin Core, based on a corpus of bibliographic records available in both formats. A proof of concept has been developed using genetic programming, resulting in rules of varying complexity. We argue that this transformation rule learning algorithm could be applied to other structured data formats.
- Elodie Thiéblin is a PhD developer at Logilab. She mainly works on data management and data publication with regard to the Semantic Web in various domains, including archives, education, the web of things, and libraries.
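The idea of evolving transformation rules against a bilingual corpus can be illustrated with a deliberately simplified genetic search: candidate rules map each target field to a source field, fitness counts how many target values a rule reproduces over the corpus, and mutation perturbs one assignment. The field tags, records, and search parameters below are all hypothetical stand-ins for the Intermarc/Dublin Core setting:

```python
import random

# Toy corpus: the same records in a flat source format (hypothetical
# Intermarc-like tags) and a Dublin-Core-like target format.
corpus = [
    ({"200a": "Les Misérables", "700a": "Hugo, Victor"},
     {"dc:title": "Les Misérables", "dc:creator": "Hugo, Victor"}),
    ({"200a": "Candide", "700a": "Voltaire"},
     {"dc:title": "Candide", "dc:creator": "Voltaire"}),
]
source_fields = ["200a", "700a"]
target_fields = ["dc:title", "dc:creator"]

def fitness(rule):
    """Count target values correctly reproduced over the whole corpus."""
    return sum(1
               for src, tgt in corpus
               for t_field, s_field in rule.items()
               if src.get(s_field) == tgt[t_field])

def mutate(rule):
    """Reassign one randomly chosen target field to a random source field."""
    child = dict(rule)
    child[random.choice(target_fields)] = random.choice(source_fields)
    return child

def evolve(generations=200, pop_size=10, seed=0):
    random.seed(seed)
    pop = [{t: random.choice(source_fields) for t in target_fields}
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # keep the best half
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=fitness)

best = evolve()
```

Genetic *programming* proper evolves tree-structured rules (conditionals, concatenations, subfield extraction) rather than flat field maps, which is what allows rules of varying complexity to emerge; the fitness-select-mutate loop is the same.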
- 10:35 - Raghava Mutharaju, Neuro-Symbolic Approaches for Description Logic Reasoning
- Neuro-Symbolic AI brings together the neural and symbolic aspects of AI. Symbolic techniques are transparent, and neural techniques are robust to noise, so combining the two can lead to stronger AI systems. We will discuss two neuro-symbolic techniques for reasoning over description logics such as EL+ and ALC: one makes use of embeddings to predict subclass relations, while the other uses reinforcement learning.
- Raghava Mutharaju is an Assistant Professor in the Computer Science and Engineering department of IIIT-Delhi, India, and leads the Knowledgeable Computing and Reasoning (KRaCR; pronounced "cracker") Lab. He got his PhD in Computer Science and Engineering from Wright State University, Dayton, OH, USA, in 2016. He has worked in industry research labs such as GE Research, IBM Research, Bell Labs, and Xerox Research. His research interest is in the Semantic Web and, more generally, in Knowledge Representation and Reasoning. This includes knowledge graphs, ontology modelling, reasoning, querying, and their applications. He has published at several venues such as ISWC, ESWC, ECAI, and WISE. He has co-organized workshops at ISWC 2020, WWW 2019, WebSci 2017, ISWC 2015 and tutorials at ISWC 2019, IJCAI 2016, AAAI 2015 and ISWC 2014. He is/has been on the Program Committee of several (Semantic) Web conferences such as AAAI, WWW, ISWC, ESWC, CIKM, K-CAP and SEMANTiCS. Homepage
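One common way embeddings are used to predict subclass relations, as mentioned in the abstract above, is geometric: each class is embedded as a region (for instance an n-ball), and C ⊑ D is predicted when the region of C lies inside the region of D. The class names and hand-picked vectors below are purely illustrative, not the talk's actual models:

```python
import math

# Hand-picked 2-D "ball" embeddings: each class is (center, radius).
embeddings = {
    "Animal": ((0.0, 0.0), 3.0),
    "Dog":    ((1.0, 0.0), 1.0),
    "Car":    ((10.0, 0.0), 1.0),
}

def predicts_subclass(c, d):
    """Predict C subClassOf D iff ball(C) is contained in ball(D),
    i.e. dist(centers) + radius(C) <= radius(D)."""
    (cc, cr), (dc, dr) = embeddings[c], embeddings[d]
    return math.dist(cc, dc) + cr <= dr
```

In a trained model the centers and radii are learned so that asserted axioms satisfy the containment constraint; containments that then hold between classes never asserted together become the predicted subclass relations.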
- 11:05 - Nicolas Seydoux, Decentralized reasoning and personal graphs
- A reasoning system takes data and rules in to produce inferences. An additional challenge arises when considering subjective reasoning: in this case, input data varies from one agent to another, necessarily leading to different output inferences. Reasoning at the scale of the Web combines this subjectivity with decentralization: input data must be gathered from a variety of resources, loosely linked with one another. Solid provides a framework to support such reasoning, with a user-centric take on access control. Each user controls who has access to their data using which applications, inverting the traditional paradigm where data is tied to applications. App-to-app integration is also shifted by this principle: if data from an app is written to a storage controlled by the user rather than being kept within the app's domain, then this output data can be reused as the input of another app, without requiring tight coupling between the apps. All of the data is mediated through the user's storage. Based on these principles, this talk will dive into applied use cases of Solid for decentralized reasoning, and outline the opportunities and challenges it represents.
- Nicolas Seydoux is an engineer at [Inrupt](https://inrupt.com/), where he works on developer tools to help build applications based on [Solid](https://solidproject.org/). Solid is a set of specifications built on top of Linked Data aiming at putting users in control of their data. This way, data and apps are decorrelated, fostering interoperability, reuse, and portability. In order to move towards a user-centric Web, we need to make developers familiar with Linked Data concepts, help them build data-driven privacy-aware applications, and power their development with high-quality tools.
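The control inversion described above — apps exchanging data through a user-controlled store with per-app access grants, rather than through direct coupling — can be caricatured in a few lines. This is emphatically *not* the Solid protocol or any Inrupt API; the class, app names, and resource paths are invented for illustration:

```python
class Pod:
    """A toy user-controlled storage with per-app read/write grants."""
    def __init__(self):
        self._data = {}
        self._grants = {}   # app name -> set of allowed access modes

    def grant(self, app, *modes):
        """The *user* decides which app gets which access modes."""
        self._grants.setdefault(app, set()).update(modes)

    def write(self, app, resource, value):
        if "write" not in self._grants.get(app, set()):
            raise PermissionError(f"{app} may not write to this pod")
        self._data[resource] = value

    def read(self, app, resource):
        if "read" not in self._grants.get(app, set()):
            raise PermissionError(f"{app} may not read from this pod")
        return self._data[resource]

pod = Pod()
pod.grant("fitness-app", "write")
pod.grant("health-dashboard", "read")

# The fitness app writes to the user's pod; the dashboard reuses that data
# with no direct coupling between the two apps.
pod.write("fitness-app", "/steps/2022-05-02", 9251)
steps = pod.read("health-dashboard", "/steps/2022-05-02")
```

In Solid itself the store is an HTTP server, resources are Linked Data documents, and grants are expressed as access-control resources, but the decoupling principle is the same: the pod, not either app, is the point of integration.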
- 11:35 - Ghislain Atemizing, INFEROps - A Must Have for a Successful Smart Application Deployment - Lessons learned from real-world applications Slides
- INFEROps (Inference Operations) refers to the operations needed to make inferences work for a particular real-world use case. I claim that those operations should be clearly identified to avoid frustration, or even project failure, when deploying your semantic web application in production. In this talk, we'll identify key activities that you should consider when planning to use reasoning to gain the most value from a KG. We'll go through a food system application where recipes in natural language are converted into a KG using a modular ontology developed in OWL. Attendees will walk away with a better understanding of how to use reasoners and rules at different stages of such a semantic workflow to make the most of the recommendations of the targeted application.
- Ghislain Atemizing helps companies turn data into knowledge through actionable AI programs. Utilizing semantic technologies, Ghislain assists clients in extracting value from their multimodal data lakes and unstructured or semi-structured data for the purpose of solving real-world business problems. At Mondeca, he manages a small team of talented research engineers and developers to tackle your next AI project, whether purely applied research or business-oriented. He holds a research Master's in AI from UPM Madrid and a Ph.D. in computer science from Telecom ParisTech. Homepage
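The "rules at different stages of a semantic workflow" mentioned in the INFEROps abstract boil down, at their core, to forward chaining: apply rules to a triple store until no new facts appear. Below is a minimal sketch on a toy recipe KG; the predicates, classes, and the single rule are invented for illustration and are not the talk's actual ontology:

```python
# A toy recipe KG as a set of (subject, predicate, object) triples.
triples = {
    ("TomatoSoup", "rdf:type", "ex:Recipe"),
    ("TomatoSoup", "ex:hasIngredient", "Tomato"),
    ("Tomato", "rdf:type", "ex:Vegetable"),
}

def vegetarian_rule(kb):
    """If every ingredient of a recipe is a Vegetable,
    infer that the recipe is an ex:VegetarianRecipe."""
    inferred = set()
    recipes = {s for (s, p, o) in kb if p == "rdf:type" and o == "ex:Recipe"}
    for r in recipes:
        ingredients = {o for (s, p, o) in kb
                       if s == r and p == "ex:hasIngredient"}
        if ingredients and all((i, "rdf:type", "ex:Vegetable") in kb
                               for i in ingredients):
            inferred.add((r, "rdf:type", "ex:VegetarianRecipe"))
    return inferred

def saturate(kb, rules):
    """Forward chaining: apply all rules until a fixpoint is reached."""
    kb = set(kb)
    while True:
        new = set().union(*(rule(kb) for rule in rules)) - kb
        if not new:
            return kb
        kb |= new

kb = saturate(triples, [vegetarian_rule])
```

The operational questions INFEROps raises sit around this loop: *when* to run it (at ingestion, at query time, on a schedule), which rules run at which stage, and how to keep the materialized inferences consistent as the KG evolves.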