D03 | Visual Exploration and Analysis of Provenance Data

Prof. Melanie Herschel, Universität Stuttgart

Prof. Ulrik Brandes, University of Konstanz, ETH Zürich

Houssem Ben Lahmar, Universität Stuttgart

Provenance techniques are increasingly deployed to analyze or debug complex data processing applications and to ensure that they are understandable and repeatable. This produces large volumes and a wide variety of provenance data. The long-term goal of this project is to leverage visualization techniques to explore provenance data efficiently and effectively. In the first funding period, we will focus on properly visualizing the full provenance data generated for one run of a data processing pipeline. This involves both quantitatively identifying suitable visualizations for various provenance types and ensuring user-friendly provenance data generation and visualization in existing data processing pipelines.

Research Questions

What are suitable visualization techniques for different settings defined by varying types of provenance and applications?

Which metrics can quantitatively assess provenance data visualization quality?

How can such metrics support tuning processes generating and managing provenance data?

Which types of provenance are best suited to achieve the goals of reproducibility and predictability for selected visual computing processes?

Visualizing and Interacting with Provenance Data


  1. Oppold, S., & Herschel, M. (2018). Provenance for entity resolution. Proceedings of the International Provenance and Annotation Workshop, 226–230.
  2. Ben Lahmar, H., & Herschel, M. (2017). Provenance-based Recommendations for Visual Data Exploration. International Workshop on Theory and Practice of Provenance (TAPP).
  3. Herschel, M., Diestelkämper, R., & Ben Lahmar, H. (2017). A survey on provenance - What for? What form? What from? The International Journal on Very Large Data Bases (VLDB Journal).
  4. Baazizi, M. A., Ben Lahmar, H., Colazzo, D., Ghelli, G., & Sartiani, C. (2017). Schema Inference for Massive JSON Datasets. Conference on Extending Database Technology (EDBT), 222–233.
  5. Herschel, M., & Hlawatsch, M. (2016). Provenance: On and Behind the Screens. In F. Özcan, G. Koutrika, & S. Madden (Eds.), ACM International Conference on the Management of Data (SIGMOD) (pp. 2213–2217). Retrieved from http://dblp.uni-trier.de/db/conf/sigmod/sigmod2016.html#HerschelH16
  6. Schulz, C., Zeyfang, A., van Garderen, M., Ben Lahmar, H., Herschel, M., & Weiskopf, D. (2018). Simultaneous Visual Analysis of Multiple Software Hierarchies. 2018 IEEE Working Conference on Software Visualization (VISSOFT), 87–95. https://doi.org/10.1109/VISSOFT.2018.00017
  7. Diestelkämper, R., Herschel, M., & Jadhav, P. (2017). Provenance in DISC Systems: Reducing Space Overhead at Runtime. International Workshop on Theory and Practice of Provenance (TAPP).
  8. Ben Lahmar, H., Herschel, M., Blumenschein, M., & Keim, D. A. (2018). Provenance-based visual data exploration with EVLIN. Proceedings of the International Conference on Extending Database Technology, 686–689.