Paper 5 – TLDKS Journal

Empirical Study of the Model Generalization for Argument Mining in Cross-Domain and Cross-Topic Settings

Authors: Alaa Alhamzeh, Előd Egyed-Zsigmond, Dorra El Mekki, Abderrazzak El Khayari, Jelena Mitrović, Lionel Brunie et al.

Volume 52 (2022)

Abstract

To date, relatively few studies have addressed the generalization of argument models. In this study, we extend our stacking model from argument identification to argument unit classification. Using this model, and for each of the learned tasks, we address three real-world scenarios concerning model robustness across multiple datasets, domains, and topics. First, we compare single-dataset learning (SDL) with multi-dataset learning (MDL). Second, we examine model generalization to a completely unseen dataset in our cross-domain experiments. Third, we study the effect of sample and topic sizes on model performance in our cross-topic experiments. We conclude that, in most cases, the ensemble-learning stacking approach is more stable across the generalization tests than a transfer-learning DistilBERT model. In addition, the argument identification task seems to generalize more easily across shifted domains than argument unit classification. This work aims to fill the gap between computational argumentation and applied machine learning with regard to model generalization.

Keywords

Argument mining, Robustness, Generalization, Multi-dataset learning, Cross-domain, Cross-topic