D02 | Visual Analytics for Linguistic Representations

Prof. Miriam Butt, University of Konstanz

Prof. Daniel Weiskopf, University of Stuttgart

Dr. Christin Beck, University of Konstanz

Farai Grenzdörffer, University of Konstanz

Benazir Mumtaz, University of Konstanz

Mark-Matthias Zymla, University of Konstanz

Within linguistics, analyzing large data sets with a combination of rule-based and stochastic methods is now a standard part of the study of language structure. However, although scatter plots, bar or pie charts, and trees (as provided by R, for example) are in routine use, novel visual computation techniques have only just begun to be explored. The overall aim of this project is to evaluate whether visual analytics is indeed a methodology that can yield improved results for linguistic research, and to establish metrics for evaluating visual analytics approaches by conducting linguistically motivated case studies on historical data.

Research Questions

What visual variables and representations are most effective for which problems?

Which metrics for evaluation can be established?

Can visual analytic methods yield improved results within linguistic research?

Can we find linguistic patterns/insights we could not have found without visual analytics?

Can we find patterns/insights more quickly with visual analytics than without?

Fig. 1: Glyph visualization of dative subjects and semantic verb classes in Icelandic.

Publications

M. Butt, L. Carnesale, and T. Ahmed, “Experiencers vs. agents in Urdu/Hindi nominalized verbs of perception,” in Proceedings of the Lexical Functional Grammar Conference, 2023, pp. 90–113. [Online]. Available: https://lfg-proceedings.org/lfg/index.php/main/article/view/46
BibTeX
@inproceedings{butt2023experiencers, affiliation = {Butt, Miriam, Universität Konstanz}, author = {Butt, Miriam and Carnesale, Lucrezia and Ahmed, Tafseer}, booktitle = {Proceedings of the Lexical Functional Grammar Conference}, pages = {90-113}, title = {Experiencers vs. agents in Urdu/Hindi nominalized verbs of perception}, url = {https://lfg-proceedings.org/lfg/index.php/main/article/view/46}, volume = 28, year = 2023 }
C. Beck and M. Köllner, “GHisBERT – Training BERT from scratch for lexical semantic investigations across historical German language stages,” in Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change, N. Tahmasebi, S. Montariol, H. Dubossarsky, A. Kutuzov, S. Hengchen, D. Alfter, F. Periti, and P. Cassotti, Eds., Singapore: Association for Computational Linguistics, Dec. 2023, pp. 33–45. [Online]. Available: https://aclanthology.org/2023.lchange-1.4
BibTeX
@inproceedings{beck:2023:ghisbert, address = {Singapore}, author = {Beck, Christin and Köllner, Marisa}, booktitle = {Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change}, editor = {Tahmasebi, Nina and Montariol, Syrielle and Dubossarsky, Haim and Kutuzov, Andrey and Hengchen, Simon and Alfter, David and Periti, Francesco and Cassotti, Pierluigi}, month = {12}, pages = {33-45}, publisher = {Association for Computational Linguistics}, title = {GHisBERT – Training BERT from scratch for lexical semantic investigations across historical German language stages}, url = {https://aclanthology.org/2023.lchange-1.4}, year = 2023 }
D. Hägele et al., “Uncertainty Visualization: Fundamentals and Recent Developments,” it - Information Technology, vol. 64, pp. 121–132, 2022, doi: 10.1515/itit-2022-0033.
BibTeX
@article{haegeleIt2022, affiliation = {Hägele, David, Visualisierungsinstitut der Universität Stuttgart. Schulz, Christoph, Visualisierungsinstitut der Universität Stuttgart. Butt, Miriam, Universität Konstanz. Deussen, Oliver, Universität Konstanz. Weiskopf, Daniel, Visualisierungsinstitut der Universität Stuttgart}, author = {Hägele, David and Schulz, Christoph and Beschle, Cedric and Booth, Hannah and Butt, Miriam and Barth, Andrea and Deussen, Oliver and Weiskopf, Daniel}, doi = {10.1515/itit-2022-0033}, journal = {it - Information Technology}, orcid-numbers = {Hägele, David/0000-0002-2679-6882, Schulz, Christoph/0000-0001-5771-3966, Deussen, Oliver/0000-0001-5803-2185, Weiskopf, Daniel/0000-0003-1174-1026}, pages = {121-132}, title = {Uncertainty Visualization: Fundamentals and Recent Developments}, url = {https://doi.org/10.1515/itit-2022-0033}, volume = 64, year = 2022 }
R. Sevastjanova, A.-L. Kalouli, C. Beck, H. Schäfer, and M. El-Assady, “Explaining Contextualization in Language Models using Visual Analytics,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online: Association for Computational Linguistics, Aug. 2021, pp. 464–476. [Online]. Available: https://aclanthology.org/2021.acl-long.39
BibTeX
@inproceedings{sevastjanova-etal-2021-explaining, address = {Online}, author = {Sevastjanova, Rita and Kalouli, Aikaterini-Lida and Beck, Christin and Schäfer, Hanna and El-Assady, Mennatallah}, booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)}, month = {08}, pages = {464-476}, publisher = {Association for Computational Linguistics}, title = {Explaining Contextualization in Language Models using Visual Analytics}, url = {https://aclanthology.org/2021.acl-long.39}, year = 2021 }
H. Booth and C. Beck, “Verb-second and Verb-first in the History of Icelandic,” Journal of Historical Syntax, vol. 5, Art. no. 27, 2021, [Online]. Available: https://ojs.ub.uni-konstanz.de/hs/index.php/hs/article/view/112
BibTeX
@article{booth2021verbsecond, author = {Booth, Hannah and Beck, Christin}, journal = {Journal of Historical Syntax}, number = 27, pages = {1-53}, title = {Verb-second and Verb-first in the History of Icelandic}, url = {https://ojs.ub.uni-konstanz.de/hs/index.php/hs/article/view/112}, volume = 5, year = 2021 }
C. Beck, “DiaSense at SemEval-2020 Task 1: Modeling Sense Change via Pre-trained BERT Embeddings,” in Proceedings of the Fourteenth Workshop on Semantic Evaluation, Barcelona (online): International Committee for Computational Linguistics, Dec. 2020, pp. 50–58. [Online]. Available: https://www.aclweb.org/anthology/2020.semeval-1.4
BibTeX
@inproceedings{beck-2020-diasense, address = {Barcelona (online)}, author = {Beck, Christin}, booktitle = {Proceedings of the Fourteenth Workshop on Semantic Evaluation}, month = {12}, pages = {50-58}, publisher = {International Committee for Computational Linguistics}, title = {DiaSense at SemEval-2020 Task 1: Modeling Sense Change via Pre-trained BERT Embeddings}, url = {https://www.aclweb.org/anthology/2020.semeval-1.4}, year = 2020 }
C. Beck, H. Booth, M. El-Assady, and M. Butt, “Representation Problems in Linguistic Annotations: Ambiguity, Variation, Uncertainty, Error and Bias,” in Proceedings of the 14th Linguistic Annotation Workshop, Barcelona, Spain: Association for Computational Linguistics, Dec. 2020, pp. 60–73. [Online]. Available: https://www.aclweb.org/anthology/2020.law-1.6
BibTeX
@inproceedings{beck-etal-2020-representation, address = {Barcelona, Spain}, affiliation = {Butt, Miriam, Universität Konstanz}, author = {Beck, Christin and Booth, Hannah and El-Assady, Mennatallah and Butt, Miriam}, booktitle = {Proceedings of the 14th Linguistic Annotation Workshop}, month = {12}, pages = {60-73}, publisher = {Association for Computational Linguistics}, title = {Representation Problems in Linguistic Annotations: Ambiguity, Variation, Uncertainty, Error and Bias}, url = {https://www.aclweb.org/anthology/2020.law-1.6}, year = 2020 }
C. Schätzle and M. Butt, “Visual Analytics for Historical Linguistics: Opportunities and Challenges,” Journal of Data Mining and Digital Humanities, 2020, [Online]. Available: https://jdmdh.episciences.org/6968
BibTeX
@article{Schatzle2020Visua-50596, affiliation = {Butt, Miriam, Universität Konstanz}, author = {Schätzle, Christin and Butt, Miriam}, journal = {Journal of Data Mining and Digital Humanities}, title = {Visual Analytics for Historical Linguistics: Opportunities and Challenges}, url = {https://jdmdh.episciences.org/6968}, year = 2020 }
C. Schätzle and H. Booth, “DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change,” in Proceedings of the International Workshop on Computational Approaches to Historical Language Change, Association for Computational Linguistics, 2019, pp. 126–135. [Online]. Available: https://www.aclweb.org/anthology/W19-4716
BibTeX
@inproceedings{schatzle-booth-2019-diahclust, author = {Schätzle, Christin and Booth, Hannah}, booktitle = {Proceedings of the International Workshop on Computational Approaches to Historical Language Change}, pages = {126-135}, publisher = {Association for Computational Linguistics}, title = {DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change}, url = {https://www.aclweb.org/anthology/W19-4716}, year = 2019 }
H. Booth and C. Schätzle, “The Syntactic Encoding of Information Structure in the History of Icelandic,” in Proceedings of the LFG’19 Conference, M. Butt, T. H. King, and I. Toivonen, Eds., CSLI Publications, 2019, pp. 69–89. [Online]. Available: http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2019/lfg2019-booth-schaetzle.pdf
BibTeX
@inproceedings{booth2019syntactic, author = {Booth, Hannah and Schätzle, Christin}, booktitle = {Proceedings of the LFG’19 Conference}, editor = {Butt, Miriam and King, Tracy Holloway and Toivonen, Ida}, pages = {69-89}, publisher = {CSLI Publications}, title = {The Syntactic Encoding of Information Structure in the History of Icelandic}, url = {http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2019/lfg2019-booth-schaetzle.pdf}, year = 2019 }
C. Schätzle, F. L. Dennig, M. Blumenschein, D. A. Keim, and M. Butt, “Visualizing Linguistic Change as Dimension Interactions,” in Proceedings of the International Workshop on Computational Approaches to Historical Language Change, 2019, pp. 272–278. [Online]. Available: https://www.aclweb.org/anthology/W19-4734.pdf
BibTeX
@inproceedings{schatzle2019visualizing, affiliation = {Keim, Daniel A., Universität Konstanz. Butt, Miriam, Universität Konstanz}, author = {Schätzle, Christin and Dennig, Frederik L. and Blumenschein, Michael and Keim, Daniel A. and Butt, Miriam}, booktitle = {Proceedings of the International Workshop on Computational Approaches to Historical Language Change}, pages = {272-278}, title = {Visualizing Linguistic Change as Dimension Interactions}, url = {https://www.aclweb.org/anthology/W19-4734.pdf}, year = 2019 }
C. Schätzle, “Dative Subjects: Historical Change Visualized,” Konstanz, 2018. [Online]. Available: http://nbn-resolving.de/urn:nbn:de:bsz:352-2-1d917i4avuz1a2
BibTeX
@phdthesis{schatzle2018dative, address = {Konstanz}, author = {Schätzle, Christin}, month = {12}, title = {Dative Subjects: Historical Change Visualized}, url = {http://nbn-resolving.de/urn:nbn:de:bsz:352-2-1d917i4avuz1a2}, year = 2018 }
A. Hautli-Janisz, C. Rohrdantz, C. Schätzle, A. Stoffel, M. Butt, and D. A. Keim, “Visual Analytics in Diachronic Linguistic Investigations,” Linguistic Visualizations, 2018.
BibTeX
@article{hautlijaniszvisual, address = {Stanford}, affiliation = {Butt, Miriam, Universität Konstanz. Keim, Daniel A., Universität Konstanz}, author = {Hautli-Janisz, A. and Rohrdantz, Christian and Schätzle, Christin and Stoffel, A. and Butt, Miriam and Keim, Daniel A.}, journal = {Linguistic Visualizations}, note = {accepted}, publisher = {CSLI Publications}, title = {Visual Analytics in Diachronic Linguistic Investigations}, year = 2018 }
H. Booth, C. Schätzle, K. Börjars, and M. Butt, “Dative Subjects and the Rise of Positional Licensing in Icelandic,” in Proceedings of the LFG’17 Conference, 2017, pp. 104–124. [Online]. Available: http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2017/lfg2017-bsbb.pdf
BibTeX
@inproceedings{booth2017dative, affiliation = {Butt, Miriam, Universität Konstanz}, author = {Booth, Hannah and Schätzle, Christin and Börjars, K. and Butt, Miriam}, booktitle = {Proceedings of the LFG’17 Conference}, pages = {104-124}, title = {Dative Subjects and the Rise of Positional Licensing in Icelandic}, url = {http://web.stanford.edu/group/cslipublications/cslipublications/LFG/LFG-2017/lfg2017-bsbb.pdf}, year = 2017 }
C. Schätzle, “Genitiv als Stilmittel in der Novelle,” Scalable Reading. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), vol. 47, pp. 125–140, 2017, doi: 10.1007/s41244-017-0043-9.
BibTeX
@article{schatzle2017genitiv, author = {Schätzle, Christin}, doi = {10.1007/s41244-017-0043-9}, journal = {Scalable Reading. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi)}, pages = {125-140}, title = {Genitiv als Stilmittel in der Novelle}, url = {https://doi.org/10.1007/s41244-017-0043-9}, volume = 47, year = 2017 }
C. Schätzle, M. Hund, F. L. Dennig, M. Butt, and D. A. Keim, “HistoBankVis: Detecting Language Change via Data Visualization,” in Proceedings of the NoDaLiDa 2017 Workshop Processing Historical Language, G. Bouma and Y. Adesam, Eds., Linköping University Electronic Press, 2017, pp. 32–39. [Online]. Available: https://www.aclweb.org/anthology/W17-0507
BibTeX
@inproceedings{conf/histlang/SchatzleHDBK17, affiliation = {Butt, Miriam, Universität Konstanz. Keim, Daniel A., Universität Konstanz}, author = {Schätzle, Christin and Hund, Michael and Dennig, Frederik L. and Butt, Miriam and Keim, Daniel A.}, booktitle = {Proceedings of the NoDaLiDa 2017 Workshop Processing Historical Language}, editor = {Bouma, Gerlof and Adesam, Yvonne}, pages = {32-39}, publisher = {Linköping University Electronic Press}, title = {HistoBankVis: Detecting Language Change via Data Visualization}, url = {https://www.aclweb.org/anthology/W17-0507}, year = 2017 }
C. Schätzle and D. Sacha, “Visualizing Language Change: Dative Subjects in Icelandic,” in Proceedings of the LREC 2016 Workshop VisLRII: Visualization as Added Value in the Development, Use and Evaluation of Language Resources, 2016, pp. 8–15. [Online]. Available: http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-VisLR%20II_Proceedings.pdf
BibTeX
@inproceedings{schatzle2016visualizing, author = {Schätzle, Christin and Sacha, Dominik}, booktitle = {Proceedings of the LREC 2016 Workshop VisLRII: Visualization as Added Value in the Development, Use and Evaluation of Language Resources}, pages = {8-15}, title = {Visualizing Language Change: Dative Subjects in Icelandic}, url = {http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-VisLR%20II_Proceedings.pdf}, year = 2016 }
C. Schulz et al., “Generative Data Models for Validation and Evaluation of Visualization Techniques,” in Proceedings of the Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization (BELIV), ACM, 2016, pp. 112–124. doi: 10.1145/2993901.2993907.
BibTeX
@inproceedings{2016SchulzGenerativeDataModels, affiliation = {Schulz, Christoph, Visualisierungsinstitut der Universität Stuttgart. Frey, Steffen, Visualisierungsinstitut der Universität Stuttgart. Karch, Grzegorz Karol, Visualisierungsinstitut der Universität Stuttgart. Butt, Miriam, Universität Konstanz. Keim, Daniel A., Universität Konstanz. Ertl, Thomas, Visualisierungsinstitut der Universität Stuttgart. Weiskopf, Daniel, Visualisierungsinstitut der Universität Stuttgart}, author = {Schulz, Christoph and Nocaj, Arlind and El-Assady, Mennatallah and Frey, Steffen and Hlawatsch, Marcel and Hund, Michael and Karch, Grzegorz Karol and Netzel, Rudolf and Schätzle, Christin and Butt, Miriam and Keim, Daniel A. and Ertl, Thomas and Brandes, Ulrik and Weiskopf, Daniel}, booktitle = {Proceedings of the Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization (BELIV)}, doi = {10.1145/2993901.2993907}, orcid-numbers = {Schulz, Christoph/0000-0001-5771-3966, Frey, Steffen/0000-0002-1872-6905, Karch, Grzegorz Karol/0000-0002-1801-8642, Ertl, Thomas/0000-0003-4019-2505, Weiskopf, Daniel/0000-0003-1174-1026}, pages = {112-124}, privnote = {uid:2282}, publisher = {ACM}, title = {Generative Data Models for Validation and Evaluation of Visualization Techniques}, url = {http://dx.doi.org/10.1145/2993901.2993907}, year = 2016 }