Associate Professor in Computer Science, IRIT CNRS/Université Paul Sabatier, Toulouse, France

Projects

Currently, I am working on the following projects:

STAC: Strategic Conversation (ERC Advanced Research Grant, 2011-2015)

PI: Nicholas Asher

STAC is a five-year interdisciplinary project that aims to develop a new, formal and robust model of conversation, drawing on ideas from linguistics, philosophy, computer science and economics. The project brings a state-of-the-art linguistic theory of discourse interpretation together with a sophisticated view of agent interaction and strategic decision making, taking advantage of work on game theory. A crucial component of the project's research methodology for advancing our understanding of strategic conversation is to interleave theoretical work and analysis with empirical evaluation and validation, using the dialogue manager of a working dialogue system. We will develop different dialogue systems that will provide corpora for studying strategic conversation. We will annotate data from these corpora and build models of strategic conversational agents based on the data.

Partners: IRIT, Edinburgh University, Heriot-Watt University, INRIA

ASFALDA (ANR project 2012-2015)

PI: Marie-Hélène Candito

The ASFALDA project aims to provide both a French corpus with semantic annotations and automatic tools for shallow semantic analysis, using machine learning techniques to train analyzers on this corpus. The target semantic annotations can be characterized roughly as making explicit “who does what when and where”, in a way that abstracts away from word order and syntactic variation, and from some of the lexical variation found in natural language.

Partners: Alpage, IRIT(MELODI), Lif, LLF, Ant'inno, CEA LIST

In the past I have worked on the following projects:

ANNODIS (ANR project, 2007-2009)

The linguistic study of discourse is a dynamic and growing field, comprising research in descriptive linguistics, formal semantics and pragmatics, and computational linguistics or NLP. Nevertheless, it is far from being a mature branch of linguistics with well-understood paradigms and alternatives running through it. The structuring of discourse, a central theme, is for instance analysed by means of diverse notions: discourse relations, theme or topic, information structure, discourse framing, etc. Each hypothesis or theory, naturally enough, tends to concentrate on one aspect of a very complex phenomenon. Given this state of the art, our idea is to develop an empirical program of discourse annotation at different levels and for different phenomena on a diversified corpus of French texts, in order to study the issue of discourse structure and its effects on interpretation from several points of view in a collective fashion. We understand the task of discourse annotation in terms of two subtasks: the delimitation of segments of discourse and the determination of various sorts of hierarchical and semantic/pragmatic relations between these segments. This task cannot be undertaken on a large scale without the help of tools for discourse annotation that come from NLP. We intend to develop such tools in this project. The goal of this project is thus twofold: (i) build a corpus of discourse-annotated texts; (ii) develop tools and interfaces to aid in discourse annotation, as well as automated tools for discourse analysis. The tools and the corpus will be made available to the linguistic and NLP communities. Our annotated corpus will help us test and generalize the hypotheses and existing approaches that have motivated our previous work in this area, as well as help others test their own ideas. We expect the corpus to have an important impact on the interaction of theoretical, empirical and descriptive strands of research in this area.
We also expect this corpus and these tools to benefit several practical areas of NLP: text summarization, question answering, textual entailment and information extraction, all of which should ideally take discourse structure into account.

PIITHIE (ANR project, 2007-2008)

Plagiat et Impact de l'Information Textuelle recHerchée dans un contexte InterlinguE (Plagiarism and Impact of Textual Information Retrieved in a Cross-lingual Context): The PIITHIE project is part of a growing movement toward controlling disseminated information. Its first goal is the detection of text plagiarism. Natural language processing (NLP) techniques should improve the performance and extend the search capabilities of the tools of Advestigo and Sinequa. The second goal is impact tracking: information publishers are very interested in being able to assess the impact of what they produce. Today this assessment is done by manual study, although automatic methods are possible. The processing required by these two applications is of the same nature; it only needs different parameter settings depending on whether one is looking for an illegal copy of the information or a perfectly legal use whose content may diverge considerably. The main challenges of this project concern: 1. the ability to assess the proximity of two textual contents while taking into account the various rewriting phenomena; 2. the extraction of terms sufficiently representative of a document to retrieve similar documents on the Internet by submitting queries to a standard search engine; 3. the detection of quotations, which must be taken into account for impact assessment and which disrupt plagiarism detection. To handle the full range of phenomena involved (rewriting, paraphrase, imitation, etc.), several types of linguistic analysis will be applied and tested in order to determine their contribution.

Contact Information

Stergos Afantenos,
IRIT Bureau 312,
Université Paul Sabatier,
118 Route de Narbonne,
31062 Toulouse Cedex 04

Tel: +33 (0)5 61 55 77 05
Fax: +33 (0)5 61 55 88 98

If you want to contact me by email, you'll have to figure out the address by yourself :) OK, sorry, here it is: rf.tiri@sonetnafa.sogrets
