Text extraction from document images is an important research area. Extracting information in the form of text involves detection, localization, tracking, extraction, enhancement, and recognition of the text in a given document image. A large number of techniques have been proposed to address this problem. This paper proposes a novel method that combines three feature extraction techniques, namely Gabor filters, the wavelet transform, and the Hough transform, to detect text objects in document images. The performance of the proposed method is tested on the NIST document image dataset.
Keywords
Document Image Analysis (DIA), Text Extraction, Text Detection, Text Localization, Text Enhancement, Gabor, Wavelet, Hough, Canny
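To make the three feature families named above concrete, the following is a minimal Python sketch of how Gabor, wavelet, and Hough responses could be computed for one image block. It is an illustration under stated assumptions, not the method evaluated in this paper: the function names (`gabor_kernel`, `filter_fft`, `haar_level1`, `hough_lines`, `text_features`), the kernel parameters, the choice of a one-level Haar wavelet, and the simple gradient threshold used in place of a Canny edge detector are all hypothetical choices made for the example.

```python
import numpy as np


def gabor_kernel(size=9, sigma=2.0, theta=0.0, wavelength=4.0):
    """Real part of a 2-D Gabor kernel (parameter values are illustrative)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)


def filter_fft(image, kernel):
    """Circular convolution of image and kernel via the FFT."""
    f = np.fft.fft2(image.astype(float))
    k = np.fft.fft2(kernel, s=image.shape)
    return np.real(np.fft.ifft2(f * k))


def haar_level1(image):
    """One-level orthonormal 2-D Haar decomposition: four sub-bands."""
    rows = image.shape[0] // 2 * 2
    cols = image.shape[1] // 2 * 2
    img = image[:rows, :cols].astype(float)
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # approximation
    lh = (a + b - c - d) / 2.0   # difference between rows
    hl = (a - b + c - d) / 2.0   # difference between columns
    hh = (a - b - c + d) / 2.0   # diagonal difference
    return ll, lh, hl, hh


def hough_lines(edges, n_theta=180):
    """Straight-line Hough accumulator over (rho, theta)."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for t_idx, theta in enumerate(thetas):
        rho = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        # Shift rho by diag so that negative distances index valid rows.
        acc[:, t_idx] = np.bincount(rho + diag, minlength=2 * diag + 1)
    return acc, thetas, diag


def text_features(image):
    """Concatenate Gabor, wavelet, and Hough descriptors for one block."""
    orientations = (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)
    gabor = [np.mean(filter_fft(image, gabor_kernel(theta=t)) ** 2)
             for t in orientations]
    _, lh, hl, hh = haar_level1(image)
    wavelet = [np.mean(lh ** 2), np.mean(hl ** 2), np.mean(hh ** 2)]
    # Assumption: a crude gradient threshold stands in for Canny edges.
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    edges = mag > mag.mean() + mag.std()
    acc, _, _ = hough_lines(edges)
    hough = [acc.max() / max(int(edges.sum()), 1)]
    return np.array(gabor + wavelet + hough)
```

Calling `text_features(block)` on a grayscale block returns an 8-element vector (four Gabor energies, three wavelet detail energies, and one normalized Hough peak). A classifier or threshold would still be needed to label the block as text or non-text, and that step is not shown here.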