D02 | Visual Analytics for Linguistic Representations

Prof. Miriam Butt, University of Konstanz

Prof. Daniel Weiskopf, University of Stuttgart

Dr. Christin Beck, University of Konstanz

Within linguistics, analyzing large data sets with a combination of rule-based and stochastic methods is now a standard part of the study of language structure. However, although scatter plots, bar charts, pie charts, and trees (as provided by R, for example) are in routine use, novel visual computation techniques have only just begun to be explored. The overall aim of this project is to evaluate whether visual analytics is indeed a methodology that can yield improved results for linguistic research, and to establish metrics for evaluating visual analytics approaches by conducting linguistically motivated case studies on historical data.

Research Questions

What visual variables and representations are most effective for which problems?

Which metrics for evaluation can be established?

Can visual analytic methods yield improved results within linguistic research?

Can we find linguistic patterns/insights we could not have found without visual analytics?

Can we find patterns/insights more quickly with visual analytics than without?

Fig. 1: Glyph visualization of dative subjects and semantic verb classes in Icelandic.

Publications

  1. C. Beck and M. Köllner, “GHisBERT – Training BERT from scratch for lexical semantic investigations across historical German language stages,” in Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change, N. Tahmasebi, S. Montariol, H. Dubossarsky, A. Kutuzov, S. Hengchen, D. Alfter, F. Periti, and P. Cassotti, Eds. Singapore: Association for Computational Linguistics, Dec. 2023, pp. 33–45. [Online]. Available: https://aclanthology.org/2023.lchange-1.4
  2. D. Hägele et al., “Uncertainty Visualization: Fundamentals and Recent Developments,” it - Information Technology, vol. 64, no. 4–5, 2022, doi: 10.1515/itit-2022-0033.
  3. H. Booth and C. Beck, “Verb-second and Verb-first in the History of Icelandic,” Journal of Historical Syntax, vol. 5, no. 27, 2021, doi: 10.18148/hs/2021.v5i28.112.
  4. R. Sevastjanova, A.-L. Kalouli, C. Beck, H. Schäfer, and M. El-Assady, “Explaining Contextualization in Language Models using Visual Analytics,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Online: Association for Computational Linguistics, Aug. 2021, pp. 464–476. doi: 10.18653/v1/2021.acl-long.39.
  5. C. Beck, “DiaSense at SemEval-2020 Task 1: Modeling Sense Change via Pre-trained BERT Embeddings,” in Proceedings of the Fourteenth Workshop on Semantic Evaluation. Barcelona (online): International Committee for Computational Linguistics, Dec. 2020, pp. 50–58. [Online]. Available: https://www.aclweb.org/anthology/2020.semeval-1.4
  6. C. Schätzle and M. Butt, “Visual Analytics for Historical Linguistics: Opportunities and Challenges,” Journal of Data Mining and Digital Humanities, 2020, doi: 10.46298/jdmdh.6707.
  7. C. Beck, H. Booth, M. El-Assady, and M. Butt, “Representation Problems in Linguistic Annotations: Ambiguity, Variation, Uncertainty, Error and Bias,” in Proceedings of the 14th Linguistic Annotation Workshop. Barcelona, Spain: Association for Computational Linguistics, Dec. 2020, pp. 60–73. [Online]. Available: https://www.aclweb.org/anthology/2020.law-1.6
  8. C. Schätzle, F. L. Dennig, M. Blumenschein, D. A. Keim, and M. Butt, “Visualizing Linguistic Change as Dimension Interactions,” in Proceedings of the International Workshop on Computational Approaches to Historical Language Change. 2019, pp. 272–278. doi: 10.18653/v1/W19-4734.
  9. H. Booth and C. Schätzle, “The Syntactic Encoding of Information Structure in the History of Icelandic,” in Proceedings of the LFG’19 Conference, M. Butt, T. H. King, and I. Toivonen, Eds. CSLI Publications, 2019, pp. 69–89. [Online]. Available: http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2019/lfg2019-booth-schaetzle.pdf
  10. C. Schätzle and H. Booth, “DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change,” in Proceedings of the International Workshop on Computational Approaches to Historical Language Change. Association for Computational Linguistics, 2019, pp. 126–135. doi: 10.18653/v1/W19-4716.
  11. C. Schätzle, “Dative Subjects: Historical Change Visualized,” PhD diss., Universität Konstanz, Konstanz, 2018. [Online]. Available: http://nbn-resolving.de/urn:nbn:de:bsz:352-2-1d917i4avuz1a2
  12. A. Hautli-Janisz, C. Rohrdantz, C. Schätzle, A. Stoffel, M. Butt, and D. A. Keim, “Visual Analytics in Diachronic Linguistic Investigations,” Linguistic Visualizations, 2018.
  13. C. Schätzle, “Genitiv als Stilmittel in der Novelle,” Scalable Reading. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), vol. 47, pp. 125–140, 2017, doi: 10.1007/s41244-017-0043-9.
  14. C. Schätzle, M. Hund, F. L. Dennig, M. Butt, and D. A. Keim, “HistoBankVis: Detecting Language Change via Data Visualization,” in Proceedings of the NoDaLiDa 2017 Workshop Processing Historical Language, G. Bouma and Y. Adesam, Eds. Linköping University Electronic Press, 2017, pp. 32–39. [Online]. Available: https://www.aclweb.org/anthology/W17-0507
  15. H. Booth, C. Schätzle, K. Börjars, and M. Butt, “Dative Subjects and the Rise of Positional Licensing in Icelandic,” in Proceedings of the LFG’17 Conference. 2017, pp. 104–124. [Online]. Available: http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2017/lfg2017-bsbb.pdf
  16. C. Schätzle and D. Sacha, “Visualizing Language Change: Dative Subjects in Icelandic,” in Proceedings of the LREC 2016 Workshop VisLRII: Visualization as Added Value in the Development, Use and Evaluation of Language Resources. 2016, pp. 8–15. [Online]. Available: http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-VisLR%20II_Proceedings.pdf
  17. C. Schulz et al., “Generative Data Models for Validation and Evaluation of Visualization Techniques,” in Proceedings of the Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization (BELIV). ACM, 2016, pp. 112–124. doi: 10.1145/2993901.2993907.
