Strategic Conversation
Description 
The STAC dataset is a corpus of strategic chat conversations manually annotated with negotiation-related information, dialogue acts and discourse structures in the framework of Segmented Discourse Representation Theory (SDRT). This dataset was developed within the context of the STAC (Strategic Conversation) project supported by the European Research Council, Grant n. 269427.


This dataset consists of 45 games segmented into Elementary Discourse Units and then annotated using the Glozz tool. The annotations were split into subdocuments to make them easier to work with. The text of each subdocument is associated with two stages of offset annotation in the Glozz XML format:
  • the "units" file contains mentions and anaphoric relations for resources, and dialogue acts,
  • the "discourse" file contains Complex Discourse Units and discourse relations.
You can download the Glozz XML format of the latest version of the *linguistic-only* STAC corpus here.
You can download the Glozz XML format of the latest version of the *situated* STAC corpus here.
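
As a rough illustration of how offset annotation works, the sketch below resolves unit spans against a subdocument's text and then reads a relation between two units. The sample text, the ids, the type labels and the element names (`unit`, `relation`, `positioning`, `singlePosition`, `term`) are assumptions modelled on Glozz-style stand-off XML, not copied from a STAC file, so check them against the downloaded files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical text of one subdocument (not taken from the corpus).
TEXT = "anyone have wood? sorry, none here"

# Hypothetical Glozz-style annotation fragment. Element names, ids and
# type labels are assumptions made for illustration only.
ANNOTATIONS = """
<annotations>
  <unit id="u1">
    <characterisation><type>Segment</type></characterisation>
    <positioning>
      <start><singlePosition index="0"/></start>
      <end><singlePosition index="17"/></end>
    </positioning>
  </unit>
  <unit id="u2">
    <characterisation><type>Segment</type></characterisation>
    <positioning>
      <start><singlePosition index="18"/></start>
      <end><singlePosition index="34"/></end>
    </positioning>
  </unit>
  <relation id="r1">
    <characterisation><type>Question-answer_pair</type></characterisation>
    <positioning>
      <term id="u1"/>
      <term id="u2"/>
    </positioning>
  </relation>
</annotations>
"""


def read_units(root, text):
    """Map each unit id to its type, character span and covered text."""
    units = {}
    for unit in root.iter("unit"):
        start = int(unit.find("positioning/start/singlePosition").get("index"))
        end = int(unit.find("positioning/end/singlePosition").get("index"))
        units[unit.get("id")] = {
            "type": unit.findtext("characterisation/type"),
            "span": (start, end),
            "text": text[start:end],  # offsets index into the raw text
        }
    return units


def read_relations(root):
    """Return (label, source id, target id) for each binary relation."""
    relations = []
    for rel in root.iter("relation"):
        terms = [term.get("id") for term in rel.findall("positioning/term")]
        relations.append((rel.findtext("characterisation/type"), terms[0], terms[1]))
    return relations


root = ET.fromstring(ANNOTATIONS.strip())
units = read_units(root, TEXT)
relations = read_relations(root)

for label, source, target in relations:
    print(f"{label}: {units[source]['text']!r} -> {units[target]['text']!r}")
```

Because the annotations only store offsets and ids, the text file and the annotation file have to be read together; the same pattern would apply to either the "units" or the "discourse" stage.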


The annotations have benefited from several passes: a first one done by annotators hired for the STAC project, and subsequent revisions done by SDRT experts. Thanks to Julie Hall, Helen Joseph and especially Lisa Grabow Peterson for the initial round of annotations.
Data Download
Once the annotations were completed, the data were transformed into table format (Python pandas DataFrames) for easier use.

You can download the latest version of the *linguistic-only* STAC corpus here.
You can download the latest version of the *situated* STAC corpus here.
Corpus Visualizations
You can compare the linguistic-only and situated versions of the corpus using the diagrams here.
More explanations are given in the readme file.
Citing the STAC corpus
If you use the STAC corpus in a scientific publication, we would appreciate citations to the following paper:
  • Asher, N., Hunter, J., Morey, M., Benamara, F. & Afantenos, S. (2016). Discourse structure and dialogue acts in multiparty dialogue: the STAC corpus. In The Tenth International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association, pp. 2721-2727, Portorož.
Contact information
Nicholas Asher -- lastname[at]irit[dot]fr
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.