Seeing History Unseen
Evaluating Vision–Language Models (VLMs) for WCAG-Compliant Alt-Text in Digital Heritage Collections
Case Study: Stadt.Geschichte.Basel Open Research Data Platform
Focus: WCAG-compliant, historically informed alt-text for heterogeneous images
Motivation: Accessibility Gap
Alt-Text in Digital Heritage
Opportunities and Risks of VLMs
Multimodal models can caption images at scale (e.g., GPT-4o mini, Gemini, Llama, Qwen)
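As a concrete illustration of captioning at scale, the snippet below is a minimal sketch of how a draft alt-text could be requested from GPT-4o mini via the OpenAI Python SDK. Only the model name comes from the poster; the prompt wording, image path, and parameters are illustrative assumptions rather than the study's actual pipeline, and any generated caption would still require the historical and WCAG review this poster argues for.

# Minimal sketch (assumed setup): draft alt-text from GPT-4o mini via the OpenAI SDK.
# The prompt text, file path, and token limit are illustrative, not the study's pipeline.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def draft_alt_text(image_path: str) -> str:
    """Return a one-sentence alt-text draft for the image at image_path."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "Write concise, WCAG-oriented alt-text (one sentence, "
                            "no leading 'image of') for this historical image."
                        ),
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
        max_tokens=120,
    )
    return response.choices[0].message.content.strip()

# Hypothetical usage:
# print(draft_alt_text("media/example_object.jpg"))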
Promises
Risks
Research Questions
Corpus and Dataset
Stadt.Geschichte.Basel Open Research Data Platform
≈1,900 media objects (as of Dec 2025)