Research themes

Proxteam's research activities span several themes, all based on using random walks to exploit the hierarchical small-world (HSW) properties of terrain networks in order to model cognitive and semantic aspects of the relations found in lexical or informational data.

 

Lexical networks metrology

This theme seeks to develop tools for manipulating and measuring terrain networks. It has both theoretical and practical implications: on the one hand, developing such metrological tools deepens our understanding of HSW phenomena; on the other hand, we use these measures and analyses in the various TerrNet applications and research projects.

Modeling

Most terrain networks share similar hierarchical small-world properties, but generating such graphs artificially remains difficult. We propose using random walks on classical Erdős-Rényi networks to create graphs with hierarchical small-world properties. Paper: Random Walks to Small Worlds 2010
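As a rough illustration of the general idea, the sketch below derives a new graph from short random walks on an Erdős-Rényi graph by linking the start and end vertex of each walk. This construction rule, the function names, and all parameter values are assumptions made for the example; the procedure actually used is the one described in the cited paper and may differ.

```python
import random


def erdos_renyi(n, p, rng):
    """Build an undirected G(n, p) graph as {vertex: set of neighbours}."""
    adjacency = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency


def walk_endpoint(adjacency, start, steps, rng):
    """Follow a uniform random walk and return the vertex where it stops."""
    current = start
    for _ in range(steps):
        neighbours = adjacency[current]
        if not neighbours:  # isolated vertex: the walk cannot move
            break
        current = rng.choice(sorted(neighbours))
    return current


def graph_from_walks(base, n_walks, steps, rng):
    """Hypothetical rule: link the start and end vertex of each walk."""
    derived = {v: set() for v in base}
    vertices = sorted(base)
    for _ in range(n_walks):
        start = rng.choice(vertices)
        end = walk_endpoint(base, start, steps, rng)
        if end != start:  # skip walks that return home (no self-loops)
            derived[start].add(end)
            derived[end].add(start)
    return derived


rng = random.Random(0)  # fixed seed so the run is reproducible
base = erdos_renyi(50, 0.1, rng)
derived = graph_from_walks(base, n_walks=200, steps=3, rng=rng)
```

The sketch makes no claim that `derived` has hierarchical small-world properties; checking that would require measuring its clustering, path lengths and degree distribution.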

Graph comparison

Two graphs that model the same terrain reality, for example two synonymy graphs of the same language, often show low agreement between their edges. Two words judged synonymous in one dictionary might not be recorded as synonymous in another, even though both resources are of comparable quality. To recover the notion that the semantics conveyed by such graphs is stable, we develop measures that look beyond edge-to-edge disparities and capture a graph similarity that is sensitive to the topological context of edges. Paper: TextGraph 2011.
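To make the notion of edge-to-edge agreement concrete, here is a minimal sketch of a naive baseline: the Jaccard index of the two edge sets. It is only the kind of strict edge comparison that the context-sensitive measures are meant to go beyond, not one of those measures; the function names and the toy synonym pairs are invented for the example.

```python
def edge_set(pairs):
    """Normalise undirected edges so that (a, b) and (b, a) are equal."""
    return {frozenset(pair) for pair in pairs}


def edge_jaccard(edges_a, edges_b):
    """Share of edges common to both graphs among all edges of either."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Two toy "dictionaries" over the same words (hypothetical data).
dictionary_1 = [("big", "large"), ("large", "huge"), ("big", "huge")]
dictionary_2 = [("large", "big"), ("big", "great"), ("great", "huge")]

agreement = edge_jaccard(dictionary_1, dictionary_2)  # 1 shared edge out of 5
```

Here the two resources share only one edge out of five, although both place the same words in the same neighbourhood, which is the situation the paragraph above describes.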

Clustering

Although terrain networks have relatively few edges, they contain dense zones: communities of vertices that are much more strongly linked to each other than to the rest of the graph. Identifying such communities, or clusters, is fundamental to understanding how entities are organised in a terrain network. In such networks, moreover, a node can belong to several communities at once: these are called overlapping communities. For example, in a social network, an individual can belong to the communities of her colleagues, her family, her hobby, her union, and so on. We therefore develop clustering methods that can not only partition a graph but also find overlapping communities. Paper: FCA 2010

Visualisation/navigation

These metrology studies enable us to model relations between the vertices of a graph. More specifically, we model their semantic proximity in the graph with a random walk-based measure called proxemy. This measure takes advantage of the graph's topology to make explicit the semantic relations between the terrain entities that the graph models. We can thus visualise such data in context, within a representation that reflects the semantic distance measured by proxemy. Using Principal Component Analysis, the multidimensional data is projected onto an optimal 3D space. With this visualisation, one can explore terrain networks intuitively, as illustrated in the Naviprox lexical application. Paper: meaning small worlds 2008
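The idea of a random walk-based proximity can be illustrated with a small sketch. It assumes, for illustration only, that the proximity of a vertex v to a vertex u is the probability that a walker starting at u stands on v after a fixed number of uniform steps, and that every vertex carries a self-loop. The exact definition of proxemy is the one given in the cited paper and may differ; the function names and the toy word graph are hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets, with a self-loop on every vertex (assumption)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, {u}).add(v)
        adjacency.setdefault(v, {v}).add(u)
    return adjacency


def random_walk_distribution(adjacency, start, steps):
    """Probability of standing on each vertex after `steps` uniform steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = {}
        for vertex, prob in dist.items():
            neighbours = adjacency[vertex]
            share = prob / len(neighbours)  # mass spread evenly over neighbours
            for nb in neighbours:
                nxt[nb] = nxt.get(nb, 0.0) + share
        dist = nxt
    return dist


# Toy lexical graph (hypothetical data): a dense zone plus a short tail.
graph = build_adjacency([
    ("peel", "strip"), ("strip", "undress"), ("peel", "undress"),
    ("strip", "bark"), ("bark", "shout"),
])

after_two = random_walk_distribution(graph, "peel", 2)
after_three = random_walk_distribution(graph, "peel", 3)
```

After three steps, the walker starting at "peel" is far more likely to be on "undress", which sits in the same dense zone, than on "shout" at the end of the tail. The rows of such distributions could then serve as the multidimensional coordinates that are reduced to 3D for display.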

Cognitive ergonomy of information access

The aim of this research theme is to improve access to information by providing interfaces adapted to the human brain, just as practical tools were once developed to fit the human hand. In practice, we organise and display search results in a way that makes explicit the various perspectives that the stored information offers on the user's query. Users can thus knowingly refine their query according to their actual interest. Several implementations of this theme are hosted by the Kodex application.

Lexical acquisition modelling

We model how children acquire and reproduce the lexicon of their native language: how, starting from only a few words, they gradually learn the full extent of the adult lexicon. Applications and models of this learning process are demonstrated on the Reflex web page.

Analogical metaphor modelling

Why do we understand what is meant by "undressing an apple"? How can we overcome the semantic tension generated by the pair "undressing" <-> "apple"? Understanding the processes that make such interpretations possible enables us to model analogical metaphor.

A speaker who wants to express an event A (for example [TO PEEL an apple]) can produce either a conventional utterance ("to peel an apple") or a metaphorical one ("to undress an apple"). According to Aristotle (Aristotle, Poetics), a metaphor is built on a conceptual analogy described by a quadruplet of the type < peel:apple::undress:doll >. The conceptual quadruplet < c1:c2::c3:c4 > is analogical if the relation between c1 and c2 is the same as the relation between c3 and c4 (c1 is to c2 what c3 is to c4). Here, < peel:apple::undress:doll > is analogical because "to peel" is to "an apple" what "to undress" is to "a doll".

A practical application of this model is SLAM, an automatic metaphor solver: given a non-conventional utterance, the application is able to recover the conventional solution. For example, given the French "les bras de l'arbre" (the arms of the tree), SLAM finds the conventional "les branches de l'arbre" (the branches of the tree).

Paper: Desalle, Y., Gaume, B., and Duvignau, K. (2009). SLAM : Solution Lexicale Automatique pour Métaphore. In Traitement Automatique des Langues, 50(1):161–182.
