Project PERO

Project PERO creates text recognition datasets and publishes them. Links to the datasets that are already available are listed below.

Handwriting Adaptation Dataset

Handwriting Adaptation Dataset (HAD).

Handwritten dataset

Please help us collect handwritten text to improve the automatic transcription of historical documents.

The Brno Mobile OCR Dataset focuses primarily on developing methods for recognizing low-quality text. It contains full pages as well as individual lines with transcripts, with varying levels of noise, blur, and similar effects.

Historical Document Classification

We publish the dataset splits that we used to develop our historical document classification system for the ICDAR 2021 Competition on Historical Document Classification.