Multimodal LLMs for Historical Dataset Construction from
Archival Image Scans: German Patents (1877–1918)
Niclas Griesshaber & Jochen Streb
AI vs Perfect Transcriptions: Visual Comparison
1. Character Error Rate
2. Patent Entry Extraction based on Archival Image Scans
3. Variable Extraction based on extracted Patent Entries
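
Item 1 refers to the character error rate (CER). As a minimal sketch, assuming the standard definition (Levenshtein edit distance between the AI transcription and the perfect transcription, divided by the length of the perfect transcription), it could be computed as follows. The function names and example strings are illustrative only, and any text normalization the authors apply before scoring (whitespace, case) is not stated here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a six-character word.
cer = character_error_rate("Patent", "Patemt")
```

Because the denominator is the reference length, a CER above 1.0 is possible when the AI transcription contains many spurious insertions.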
Output: Full Dataset