Abstract

A pressing problem for archives and museums is producing high-quality digitized text from low-quality or damaged originals. Most companies working in optical character recognition (OCR) focus on high-quality digitized texts, which make up the majority of processed data and the main market for their products. As a result, the recognition of very low-quality digitized texts generally falls outside these companies’ scope of interest. In this paper, we analyze several algorithms for recognizing very low-quality digitized images of text and report the results of testing these algorithms on such texts.

File
guzhavina.pdf (731.65 KB)
Pages
49-61