Dataset Handle: 20.500.14123/12586
Supplementary Research Materials for the PhD Thesis "Vague, Incomplete, Subjective, and Uncertain Information in Digital History"
Datasets, Software Application, Source Code, and NLP Models
Archiving without access
No downloads available
Chronological data
Date of availability in catalog 2026-02-27
Available from / since 2026-02-27
Language of the resource
English
Related PubData resources
Abstract
This collection contains the research materials accompanying the PhD thesis "Vague, Incomplete, Subjective, and Uncertain Information in Digital History" (Mariani, 2026). The materials support the investigation of VISU (Vague, Incomplete, Subjective, and Uncertain) information in art provenance data and document the computational and infrastructural components developed during the project. The archived materials include: (1) Art Institute of Chicago (AIC) Provenance Dataset: structured provenance events automatically extracted from museum records; (2) NLP Models: spaCy-based models trained on manually annotated AIC provenance texts for sentence boundary detection and span categorisation; (3) PROV-A (Provenance App): a web-based application developed during the PhD for structuring provenance information as Linked Open Data, integrating automated extraction with expert validation; (4) AIC Case Study Data in PROV-A: a curated subset of AIC provenance records processed and supervised using PROV-A. Together, these materials document the end-to-end workflow proposed in the dissertation, from automated extraction and epistemically aware modelling to human-in-the-loop validation and Linked Open Data publication.
Resource type
Dataset
Software
Kinds of Data
Databases
Programs and Applications
Models / Modellings
Annotations
Methods
Modeling
Programming / Script-based data collection
Thematic classification
Provenance research
Keywords
Provenance; Arts; Natural Language Processing; Artificial Intelligence; Linked Open Data; Web Application; Art History
Notes
"AIC Provenance Dataset" and "NLP Models" are published via Zenodo. "PROV-A" (Provenance App) is available at prov-a.github.io. Its code is published via GitHub, as is "AIC Case Study Data in PROV-A".
Faculty / department
More information
Time Period of the Collection of the Data
Time Period of the Creation of the Dataset
2020 - 2025