Robust Extraction of Recognition Primitives for the Preprocessing of Form Documents

Youngtae Chung, Kwanyong Lee, Hyeran Byun, Yillbyung Lee

Dept. of Computer Science, Yonsei University, Seoul, Korea

Filled-in characters in form documents are very often distorted by the formatted lines of the form. Where a character touches, crosses, or overlaps a formatted line, its shape is altered.

This paper proposes a new method for restoring characters from images damaged by lines. Its two stages, character decomposition and character restoration, extract correctly shaped characters from the damaged images.
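The abstract does not give the details of the two stages, so the following is only a toy illustration of the general idea, not the authors' algorithm. It assumes a binary image stored as a list of rows (1 = ink), a horizontal form line whose row indices are already known, and a deliberately naive restoration rule: a pixel on the erased line is refilled only when ink lies directly above and directly below the line. The function name `remove_line_and_restore` and the rule itself are assumptions made for this sketch.

```python
def remove_line_and_restore(image, line_rows):
    """Erase a known horizontal form line, then refill crossing strokes.

    image     -- rectangular list of rows, each a list of 0/1 pixels (1 = ink)
    line_rows -- row indices occupied by the form line (assumed known)
    """
    height = len(image)
    line_rows = set(line_rows)
    result = [row[:] for row in image]  # work on a copy

    # Decomposition (toy version): remove every pixel on the line rows.
    # This also removes the parts of characters that overlap the line.
    for r in line_rows:
        result[r] = [0] * len(image[r])

    # Restoration (toy version): refill a line pixel when the nearest
    # non-line rows above and below both contain ink in the same column.
    for r in sorted(line_rows):
        top = r
        while top in line_rows:
            top -= 1
        bottom = r
        while bottom in line_rows:
            bottom += 1
        if top < 0 or bottom >= height:
            continue  # line touches the image border; nothing to bridge
        for c in range(len(image[r])):
            if image[top][c] == 1 and image[bottom][c] == 1:
                result[r][c] = 1
    return result


# A vertical stroke (column 2) crossed by a horizontal line (row 2).
damaged = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
restored = remove_line_and_restore(damaged, {2})
```

In this example the line is removed while the stroke pixel it crossed is kept. A rule this simple fails for slanted or curved strokes and for characters that merely touch a line, which are among the cases a real restoration stage would have to handle.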

To evaluate the proposed method objectively, we applied two simple recognition modules to CENPARMI handwritten digits and NIST handwritten alphabetic characters. The recognition rates for the original characters and for the characters restored by the proposed method differ by no more than about 1%. These experiments indicate that the proposed method extracts characters that are almost identical to the originals.
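The comparison described above can be sketched as follows. This is a minimal illustration that assumes the recognition rate is the percentage of correctly classified samples and that the same labelled samples are recognized once in original form and once after restoration. The function names and the tiny label lists are hypothetical and are not data from the paper.

```python
def recognition_rate(predicted, actual):
    """Return the percentage of samples whose predicted label is correct."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def rate_gap(pred_original, pred_restored, labels):
    """Difference (in percentage points) between the two recognition rates."""
    return (recognition_rate(pred_original, labels)
            - recognition_rate(pred_restored, labels))


# Hypothetical labels and recognizer outputs, for illustration only.
labels = [0, 1, 2, 3]
pred_original = [0, 1, 2, 3]   # recognizer output on the original characters
pred_restored = [0, 1, 2, 4]   # recognizer output on the restored characters
gap = rate_gap(pred_original, pred_restored, labels)
```

A small gap suggests that restoration preserved the information the recognizer relies on, which is the sense in which the restored characters are described as almost identical to the originals.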


GREC'97 program