Abstract
Numerous techniques have been reported for optical character recognition (OCR). Almost all of them implicitly assume that the language of the document to be processed is known. We attempt to eliminate this assumption by presenting a novel algorithm for automatic written language recognition. Because different languages are often visually distinctive in written form, we take a global approach based on texture analysis, in which each language is regarded as a different texture. In principle, this allows any standard texture recognition algorithm to be applied to the task. Experiments with six languages demonstrate the potential of the proposed global approach.
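To make the idea of treating each language as a texture concrete, the following is a minimal sketch and not the authors' algorithm, since the abstract does not say which texture features or classifier were used. It assumes a gray-level co-occurrence descriptor and a nearest-centroid rule as stand-ins for "any standard texture recognition algorithm". The function names, the labels `script_a` and `script_b`, and the synthetic striped images are all hypothetical placeholders for scanned text blocks.

```python
from math import dist


def cooccurrence_features(image, levels=2):
    """Return the normalized co-occurrence of gray levels for
    horizontally adjacent pixels, flattened into a feature vector.

    `image` is a list of rows, each a list of ints in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        # Pair every pixel with its right-hand neighbour.
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            total += 1
    return [count / total for counts_row in counts for count in counts_row]


def classify(features, centroids):
    """Return the label whose reference vector is nearest (Euclidean)."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))


def stripes(size, vertical):
    """Synthetic binary texture standing in for a block of text (assumption)."""
    return [[(x if vertical else y) % 2 for x in range(size)] for y in range(size)]


# One reference texture per hypothetical script.
centroids = {
    "script_a": cooccurrence_features(stripes(8, vertical=False)),
    "script_b": cooccurrence_features(stripes(8, vertical=True)),
}

# An unseen block of a different size is assigned to the nearest texture.
sample = stripes(12, vertical=True)
predicted = classify(cooccurrence_features(sample), centroids)
print(predicted)
```

Because the descriptor summarizes the whole block rather than individual characters, no segmentation or character recognition is needed before the language decision, which is the sense in which the approach is global.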
