The EGO-DOK project
The Military Historical Institute in Prague (VHÚ) has launched the EGO-DOK project, which aims to digitize historical documents. Documents obtained from institutions or private individuals are scanned, and the scans are then processed using tools developed within the PERO project. The results are handed over to the document's owner and are also published in the Digital Study Room of the Ministry of Defense of the Czech Republic, as was done with the military diaries processed earlier.
One of the organizations that use our PERO-OCR automatic handwriting transcription software is the Military Historical Institute in Prague (VHÚ). The result of this cooperation is a digitized military diary, which has already been imported into the Digital Study Room of the Ministry of Defense of the Czech Republic. The document, titled "Můj Deňik", dates back to the First World War. It was processed using tools developed within the PERO project, and now that it is in the Digital Study Room, its content can be searched and the content of individual pages can be downloaded.
We have published two models for public use in our pero-ocr GitHub repository. The first analyzes the general layout of printed and handwritten pages. The second recognizes European printed text and is specialized for Czech newspapers. Both models and the configuration file are compatible with the "develop" branch of the pero-ocr repository.