The PERO project creates and publishes a range of text recognition datasets. Links to the sets that are already available are listed below.
Please help us collect handwritten text to improve the automatic transcription of historical documents.
Brno Mobile OCR Dataset
The Brno Mobile OCR Dataset is intended primarily for developing methods that recognize low-quality text. It contains full pages as well as individual text lines with transcripts, affected by varying levels of noise, blur, and similar effects.
Historical Document Classification
We publish the dataset splits that we used to develop our historical document classification system for the ICDAR 2021 Competition on Historical Document Classification.