IRIT - UMR 5505

CNRS
INPT
UPS
UT1

XFIRM

XML Flexible Information Retrieval Model

 

 

 Model

XFIRM is a search engine for structured and semi-structured documents, such as XML documents. It uses the structural information of documents to focus on the user's information need. Results are document parts (XML elements) that are both specific and exhaustive with respect to the information need.

XFIRM is based on a query language allowing users to express their need:

  • with simple keywords
  • with keywords and structural constraints.

Search relies on relevance propagation. Using the tree representation of XML documents, the relevance of elements is evaluated in two steps:

  • the relevance of leaf nodes (nodes containing textual information) is evaluated first
  • the relevance scores of leaf nodes are then propagated up the tree and aggregated to evaluate the relevance of inner nodes.
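
The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the scoring function used by XFIRM: the `Node` class, the term-counting `leaf_score`, the decay factor `alpha` and the path naming are all assumptions introduced for the example. Leaf scores are weakened by `alpha` at each edge they travel upward, so that large ancestors do not automatically outrank small, focused elements.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One XML element: a leaf carries text, an inner node carries children."""
    tag: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def leaf_score(text: str, query: list[str]) -> float:
    """Toy leaf relevance (assumption): count occurrences of the query terms."""
    words = text.lower().split()
    return float(sum(words.count(term.lower()) for term in query))


def score_tree(node, query, alpha=0.5, path=None, scores=None):
    """Score every element of the tree rooted at `node`.

    Returns (scores, leaves): `scores` maps an element path to its score and
    `leaves` lists (leaf score, distance in edges) for the leaves below `node`.
    Paths use the child position, e.g. /article/sec[2]/p[1] (assumed naming).
    """
    path = path or f"/{node.tag}"
    scores = {} if scores is None else scores
    if not node.children:
        # Step 1: a leaf node is scored directly from its text.
        s = leaf_score(node.text, query)
        scores[path] = s
        return scores, [(s, 0)]
    leaves = []
    for i, child in enumerate(node.children, start=1):
        _, below = score_tree(child, query, alpha, f"{path}/{child.tag}[{i}]", scores)
        leaves.extend((s, d + 1) for s, d in below)
    # Step 2: an inner node aggregates its leaves, decayed by their distance.
    scores[path] = sum(s * alpha ** d for s, d in leaves)
    return scores, leaves


doc = Node("article", children=[
    Node("title", "XML retrieval"),
    Node("sec", children=[
        Node("p", "XML retrieval with XML trees"),
        Node("p", "unrelated text"),
    ]),
])
scores, _ = score_tree(doc, ["xml", "retrieval"], alpha=0.5)
for element_path, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{value:5.2f}  {element_path}")
```

In this toy document the first paragraph of the section ranks above the section and the whole article, which matches the goal of returning the most focused relevant element.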


 Results

XFIRM was evaluated within the INEX (INitiative for the Evaluation of XML retrieval) framework between 2004 and 2007. The results support the value of the method.

For example, XFIRM ranked first on the VSCAS task of INEX 2005.

  See for example:

  • Karen Sauvagnat, Lobna Hlaoua, Mohand Boughanem. XFIRM at INEX 2005: adhoc and relevance feedback tracks. INitiative for the Evaluation of XML Retrieval (INEX 2005), Dagstuhl, Germany, 28/11/2005-30/11/2005, Vol. 3977, Norbert Fuhr, Mounia Lalmas, Saadia Malik, Gabriella Kazai (Eds.), Springer, LNCS, p. 88-103, November 2005.
    http://www.irit.fr/~Karen.Pinel-Sauvagnat/fichiers/xfirm@inex05.pdf

  • Karen Sauvagnat, Mohand Boughanem. Using a relevance propagation method for Adhoc and Heterogeneous tracks in INEX 2004. INitiative for the Evaluation of XML Retrieval (INEX 2004), Dagstuhl, Germany, 06/12/2004-08/12/2004, Norbert Fuhr, Mounia Lalmas, Saadia Malik, Zoltan Szlavik (Eds.), Springer, LNCS, p. 337-348, December 2004.
    http://www.irit.fr/~Karen.Pinel-Sauvagnat/fichiers/xfirm@inex04.pdf

 Extensions

 Relevance Feedback (RF)

Document structure can also be used to refine the user query. In her PhD thesis, Lobna Hlaoua proposed three approaches to XML Relevance Feedback: content-oriented RF, structure-oriented RF, and content-and-structure-oriented RF. These approaches have been included in the XFIRM system.

Her experiments on the INEX collections showed the benefit of adding relevant structures extracted with her SCA algorithm. She also showed that combining structure and content to reformulate queries can be advantageous.

  Scientific references

  • Lobna Hlaoua, Mohand Boughanem, Karen Pinel-Sauvagnat. Combination of Evidences in Relevance Feedback for XML Retrieval. Conference on Information and Knowledge Management (CIKM 2007), Lisbon, Portugal, 06/11/2007-09/11/2007, ACM Press, p. 893-896, November 2007.

  • Lobna Hlaoua. Reformulation de requêtes par réinjection de pertinence dans les documents semi-structurés. PhD thesis, Université Paul Sabatier, December 2007.
    http://www.irit.fr/SIG_RFI/fichiers/Hlaoua.pdf

 Multimedia Retrieval

In her PhD thesis, Mouna Torjmen investigates the use of XML structure in multimedia retrieval, particularly in context-based image retrieval. She proposes two methods to represent multimedia elements:

  • the first one makes implicit use of the textual and structural context of multimedia elements. It represents a multimedia element through its child, sibling and ancestor elements.
  • the second one makes explicit use of both sources. It is based on an analogy between the XML document tree and an ontology. It defines a measure of the degree to which each textual node participates in the representation of the multimedia element. This measure is derived from similarity measures applied to ontologies, more precisely those based on counting the edges between two concepts.

Both proposals are implemented in the XFIRM system.
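
To make the edge-counting idea of the second method concrete, here is a minimal Python sketch. It is not the measure defined in the thesis: the `1 / (1 + edges)` weighting, the function names and the representation of a node as a tuple of child positions from the root are assumptions chosen for illustration. Each textual node contributes to the multimedia element in proportion to how close it is in the XML tree.

```python
def edge_distance(path_a: tuple, path_b: tuple) -> int:
    """Number of tree edges between two nodes.

    A node is identified by the child positions followed from the root,
    e.g. (1, 0) is the first child of the root's second child (assumption).
    """
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    # Climb from each node up to their lowest common ancestor.
    return (len(path_a) - common) + (len(path_b) - common)


def participation(text_path: tuple, media_path: tuple) -> float:
    """Assumed edge-counting weight: closer textual nodes participate more."""
    return 1.0 / (1.0 + edge_distance(text_path, media_path))


def media_score(media_path: tuple, text_scores: dict) -> float:
    """Score a multimedia element from the scores of the textual nodes."""
    return sum(participation(path, media_path) * score
               for path, score in text_scores.items())


# Hypothetical document: an image at (1, 0) and three scored textual nodes.
image = (1, 0)
text_scores = {
    (0,): 2.0,    # document title, 3 edges from the image
    (1, 1): 3.0,  # caption, sibling of the image, 2 edges away
    (2, 0): 4.0,  # paragraph in another section, 4 edges away
}
score = media_score(image, text_scores)
print(round(score, 2))
```

The caption, although it has a lower textual score than the distant paragraph, contributes the most to the image score because it sits next to the image in the tree.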

  Scientific references

  • Mouna Torjmen, Karen Pinel-Sauvagnat, Mohand Boughanem. XML Multimedia Retrieval: From relevant textual information to relevant multimedia fragments. European Conference on Information Retrieval (ECIR 2009), Toulouse, 06/04/2009-09/04/2009, Springer, p. 150-161, 2009.

  • Mouna Torjmen, Karen Pinel-Sauvagnat, Mohand Boughanem. Towards a structure-based multimedia retrieval model. ACM International Conference on Multimedia Information Retrieval, Vancouver, Canada, 30/10/2008-31/10/2008, ACM, October 2008.
    http://portal.acm.org/citation.cfm?id=1460153

 Contact

Karen Pinel-Sauvagnat
SIG Team, IRIT,
118 route de Narbonne
31062 Toulouse Cedex 9, France

http://www.irit.fr/~Karen.Pinel-Sauvagnat