19 manuscripts in various European languages and scripts.
More information, together with fine-tuning experiments that adapt a general model trained on a large handwriting dataset, can be found in the paper Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition.
There are two directories in the dataset archive: data and runs.
data contains images of text lines and their respective transcriptions. The images come in three crop modes: tight, medium, and wide. The crop mode indicates how much space was left around the baseline during cropping. Transcriptions use the format ID TRANS, where ID corresponds to the name of the respective text line image and TRANS is the transcription.
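The ID TRANS format described above can be read with a few lines of Python. This is only a minimal sketch under assumptions not stated in the dataset description: that the ID contains no whitespace, that it is separated from the transcription by whitespace, and that each line holds exactly one record. The function name parse_transcriptions and the sample IDs are hypothetical.

```python
from typing import Dict, Iterable


def parse_transcriptions(lines: Iterable[str]) -> Dict[str, str]:
    """Map each text line image ID to its transcription.

    Assumes (not confirmed by the dataset description) that the ID has no
    whitespace and is separated from the transcription by whitespace.
    """
    result: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split only once so that spaces inside the transcription survive.
        parts = line.split(maxsplit=1)
        line_id = parts[0]
        text = parts[1] if len(parts) > 1 else ""  # empty transcription
        result[line_id] = text
    return result


# Hypothetical sample records; real IDs in the archive may look different.
sample = [
    "line_001.jpg Lorem ipsum dolor\n",
    "\n",
    "line_002.jpg sit amet\n",
    "line_003.jpg\n",
]
parsed = parse_transcriptions(sample)
```

To read an actual transcription file, pass an open file object instead of the sample list, for example `parse_transcriptions(open(path, encoding="utf-8"))`, where the path and the UTF-8 encoding are likewise assumptions.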
runs contains the partitions used for the fine-tuning runs; see Section 5 of the referenced paper for details.
- Dataset: download