Description
The STAC dataset is a corpus of strategic chat conversations manually annotated with negotiation-related information, dialogue acts and discourse structures in the framework of Segmented Discourse Representation Theory (SDRT). This dataset was developed within the context of the STAC (Strategic Conversation) project supported by the European Research Council, Grant n. 269427.
This dataset consists of 45 games segmented into Elementary Discourse Units and then annotated using the Glozz tool. The annotations were split into subdocuments to make them easier to work with. The text of each subdocument is associated with two stages of offset annotation in the Glozz XML format:
You can download the Glozz XML format of the latest version of the *linguistic-only* STAC corpus here.
You can download the Glozz XML format of the latest version of the *situated* STAC corpus here.
The annotations have benefited from several passes: a first one by annotators hired for the STAC project, followed by revisions by SDRT experts. Thanks to Julie Hall, Helen Joseph and especially Lisa Grabow Peterson for the initial round of annotations.
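Offset annotation means that the annotation files do not repeat the chat text; each annotated unit points into the subdocument text by character position. The sketch below illustrates the idea with Python's standard library only. The sample text, the unit ids and the element layout (`unit`, `characterisation/type`, `positioning/start/singlePosition`) are assumptions made for illustration, not content taken from the corpus, so check them against the downloaded files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical subdocument text, invented for illustration (not from the corpus).
TEXT = "anyone have wood? sorry, none here"

# Hypothetical annotation document. The element names follow an assumed
# Glozz-style layout; verify them against the real annotation files.
ANNOTATIONS = """\
<annotations>
  <unit id="edu_1">
    <characterisation><type>Segment</type></characterisation>
    <positioning>
      <start><singlePosition index="0"/></start>
      <end><singlePosition index="17"/></end>
    </positioning>
  </unit>
  <unit id="edu_2">
    <characterisation><type>Segment</type></characterisation>
    <positioning>
      <start><singlePosition index="18"/></start>
      <end><singlePosition index="34"/></end>
    </positioning>
  </unit>
</annotations>
"""


def extract_units(text, xml_source):
    """Resolve each annotated unit's character offsets against the text."""
    root = ET.fromstring(xml_source)
    units = []
    for unit in root.iter("unit"):
        start = int(unit.find("positioning/start/singlePosition").get("index"))
        end = int(unit.find("positioning/end/singlePosition").get("index"))
        units.append(
            {
                "id": unit.get("id"),
                "type": unit.findtext("characterisation/type"),
                "start": start,
                "end": end,
                # The end offset is treated as exclusive, as in Python slicing.
                "text": text[start:end],
            }
        )
    return units


for unit in extract_units(TEXT, ANNOTATIONS):
    print(unit["id"], unit["type"], repr(unit["text"]))
```

Keeping the text and the annotations separate in this way lets several annotation stages refer to the same underlying text without modifying it.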
Data Download
Once the annotations were completed, the data were transformed into table format (Python pandas DataFrames) for easier use.
You can download the latest version of the *linguistic-only* STAC corpus here.
You can download the latest version of the *situated* STAC corpus here.
Corpus Visualizations
You can compare the linguistic-only and situated versions of the corpus using the diagrams here.
More explanations are given in the readme file.
Citing the STAC corpus
If you use the STAC corpus in a scientific publication, we would appreciate citations to the following paper:
Contact information
Nicholas Asher -- lastname[at]irit[dot]fr
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.