CoST dataset

This is the main dataset designed during the CoST project:
Publication: Dosso C, Moreno JG, Chevalier A, and Tamine L. 2021. CoST: An annotated Data Collection for Complex Search. CIKM 2021.


Other datasets

Approaches developed during the project might be evaluated using the following benchmarks/collections:

[1] Lucchese, C., Orlando, S., Perego, R., Silvestri, F., and Tolomei, G. Discovering User Tasks from Search Engine Query Logs. In ACM Transactions on Information Systems (ACM TOIS), vol. 31, issue 3 – July 2013, pp. 14:1–14:43.

[2] Sen, Procheta, Ganguly, Debasis and Jones, Gareth J.F. (2018) Tempo-lexical context driven word embedding for cross-session search task extraction. In: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1-6 June 2018, New Orleans, LA, USA.

[3] Michael Völske, Ehsan Fatehifar, Benno Stein, and Matthias Hagen. 2019. Query-Task Mapping. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’19). Association for Computing Machinery, New York, NY, USA, 969–972.