The Brno Mobile OCR Dataset (B-MOD) is a collection of 2 113 templates (pages of scientific papers). The templates were photographed with 23 different mobile devices under unrestricted conditions, so the resulting photographs vary in blur, illumination, and other factors. In total, the dataset contains 19 725 photographs and more than 500k text lines with precise transcriptions. The template pages are divided into three subsets (training, validation, and testing).

This dataset may be used for non-commercial research purposes only. If you publish material based on this dataset, please cite the following paper:

M. Kišš, M. Hradiš, and O. Kodym, “Brno Mobile OCR Dataset” in 2019 15th IAPR International Conference on Document Analysis and Recognition (ICDAR), IEEE, 2019.

You can download the dataset and evaluate your OCR system below. Our OCR system is available on GitHub. If you have any questions, please contact ikiss@fit.vutbr.cz or ihradis@fit.vutbr.cz.

Download

Samples

Evaluate

You can evaluate your OCR system using the form below. Enter your name or the name of your team to identify your results. Please also enter a short description of your system or a link to one.

Please upload a single text file in which each line corresponds to one transcribed line of the test set, formatted in the same way as the text files for the training and validation lines in the "Cropped lines with transcriptions" ZIP archive. Each line must follow the pattern:

filename transcription

e.g.

6149958838f466bbb508399a83bbeb5c.jpg_rec_l0004.jpg Theorems 1 and 2 show that, in checking for deadlock or
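
The `filename transcription` pattern above can be produced with a few lines of Python. This is a minimal sketch, not the official tooling: it assumes that the file name and the transcription are separated by a single space (as the example line suggests), that file names contain no whitespace, and that the file is UTF-8 encoded. The function names `format_line`, `parse_line`, and `write_submission` are hypothetical.

```python
from pathlib import Path


def format_line(filename: str, transcription: str) -> str:
    """Return one submission line: file name, a single space, then the text."""
    if any(ch.isspace() for ch in filename):
        raise ValueError(f"file name must not contain whitespace: {filename!r}")
    # Join any embedded line breaks so each prediction occupies exactly one line.
    text = " ".join(transcription.splitlines()).strip()
    return f"{filename} {text}"


def parse_line(line: str) -> tuple[str, str]:
    """Split a line at its first space into (filename, transcription)."""
    filename, _, transcription = line.rstrip("\n").partition(" ")
    return filename, transcription


def write_submission(predictions: dict[str, str], path: str) -> None:
    """Write all predictions to one text file (UTF-8 is an assumption)."""
    lines = [format_line(name, text) for name, text in predictions.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Splitting at the first space only, as `parse_line` does, keeps the spaces inside the transcription intact, which also makes it a convenient way to read the training and validation files if they use the same separator.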

Leaderboard

| Name | Description | Date | Easy CER | Easy WER | Medium CER | Medium WER | Hard CER | Hard WER | Overall CER | Overall WER |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline LSTM | CNN_LSTM_CTC | 30.06.2019 | 0.33 | 1.93 | 5.65 | 22.39 | 32.28 | 72.63 | 3.15 | 10.71 |
| Baseline Conv | CNN_CTC | 30.06.2019 | 0.50 | 2.79 | 7.82 | 28.50 | 39.76 | 80.69 | 4.19 | 13.39 |
| Thales of Miletus | Baseline CRNN (Random splitting) | 10.09.2019 | 0.07 | 0.37 | 1.39 | 6.04 | 14.73 | 39.83 | 1.03 | 3.61 |
| Michal Hradis | Original CTC LSTM network from the paper decoded using beam search and dictionary. The dictionary is generated from the train/val dataset splits. Implementation from https://github.com/githubharald/CTCWordBeamSearch. | 10.09.2019 | 1.70 | 6.99 | 5.46 | 16.19 | 33.37 | 60.42 | 4.05 | 11.81 |
| SunBear | CRNN | 12.09.2019 | 0.24 | 1.41 | 4.25 | 17.94 | 27.72 | 68.21 | 2.50 | 8.90 |
| Tesseract | https://github.com/tesseract-ocr/tesseract config = ("-l eng --oem 1 --psm 7") | 12.09.2019 | 12.32 | 24.21 | 45.00 | 71.06 | 79.17 | 100.87 | 24.47 | 40.91 |
| Thales of Miletus - 1 | Baseline CRNN (Random splitting) + WordBeamSearch | 13.09.2019 | 0.06 | 0.37 | 1.38 | 6.00 | 14.64 | 39.50 | 1.03 | 3.58 |
| Attention Conv-LSTM | Fairly standard seq2seq Conv-LSTM model with attention. | 16.09.2019 | 0.70 | 1.23 | 3.97 | 10.66 | 20.19 | 47.82 | 2.42 | 5.84 |
| Sayan | StaquResearch CRNN_CTC | 07.10.2019 | 0.05 | 0.32 | 1.13 | 5.22 | 11.30 | 32.45 | 0.81 | 3.04 |
| Sayan Mandal | StaquResearch customCNN_LSTM_CTC. No augmentation or LM. | 13.11.2019 | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
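
The leaderboard reports character error rate (CER) and word error rate (WER). The exact computation used by the evaluation server is not described here, so the following is only a sketch of the usual definition: the total edit (Levenshtein) distance between predictions and references, divided by the total reference length. It assumes that the reported values are percentages, that words are split on whitespace, and that errors are accumulated over all lines before dividing. All function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance over total reference length, as a percentage."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    if total == 0:
        raise ValueError("references contain no tokens")
    return 100.0 * errors / total


def cer(refs, hyps):
    """Character error rate: every character is one token."""
    return error_rate(refs, hyps, list)


def wer(refs, hyps):
    """Word error rate: tokens are whitespace-separated words (an assumption)."""
    return error_rate(refs, hyps, str.split)
```

Under this definition, insertions count as errors but do not lengthen the reference, so an error rate can exceed 100. That would be consistent with the WER of 100.87 listed for Tesseract on the hard subset.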