Historical document classification
Besides presenting our published papers at the ICDAR 2021 conference, we also took part in its competition on historical document classification. The competition consisted of three tasks: font/script classification, document localization, and dating. Our system took first place in all three tasks, and we won the whole competition. A detailed description of our approach is available on arXiv and was also accepted to DAS 2022. While pre-processing the provided datasets, we also split them into training and validation parts. These splits, together with a brief description, are publicly available.
Paper submitted to DAS 2022
At the end of May 2022, the 15th International Workshop on Document Analysis Systems (DAS) will take place in La Rochelle, France. We will participate in this international workshop with a paper describing the system that we prepared for the ICDAR 2021 Competition on Historical Document Classification. In this paper, we also publish the dataset splits that we created, which are necessary for a fair comparison with other systems.
Papers submitted to ICDAR 2021
In September 2021, the 16th International Conference on Document Analysis and Recognition (ICDAR) will take place in Lausanne, Switzerland. For the PERO project, we submitted three papers to this conference, and all of them were accepted. The first paper deals with text line detection using a neural network model called ParseNet. The second paper focuses on switching between different outputs of a neural network-based text recognizer using a Transcription-Style block. The last paper presents a strategy for making effective use of large amounts of unannotated data from a target domain when training a text recognizer.