Papers submitted to ICDAR 2021
In September 2021, the 16th International Conference on Document Analysis and Recognition (ICDAR) will take place in Lausanne, Switzerland. For the PERO project, we submitted three papers to this conference, and all of them were accepted. The first paper deals with text line detection using a neural network model called ParseNet. The second focuses on switching between different outputs of a neural network-based text recognizer using a Transcription-Style block. The last presents a strategy for making effective use of large amounts of unannotated data from a target domain when training a text recognizer.
The EGO-DOK project
The Military Historical Institute in Prague (VHÚ) has launched the EGO-DOK project, which aims to digitize historical documents. Documents obtained from institutions or private individuals are scanned, and the resulting data are processed using tools developed within the PERO project. The results are then handed over to the owner of the document and are also published in the Digital Study Room of the Ministry of Defense of the Czech Republic, as was done with the military diaries that have already been processed.
One of the organizations that use the services of our PERO-OCR automatic handwriting transcription software is the Military Historical Institute in Prague (VHÚ). The result of this cooperation is a digitized military diary, which has already been imported into the Digital Study Room of the Ministry of Defense of the Czech Republic. "Můj Deňik", as this document is named, dates back to the First World War. The diary was processed using tools developed within the PERO project, and now that it is in the Digital Study Room, its content can be searched and the content of individual pages can be downloaded.