The growth of digital content on servers raises storage and retrieval challenges, so researchers in this field work on organizing content for fast retrieval while preserving data security. This paper addresses the retrieval of digital text content stored as documents and files. A user searches for a desired file with a text query, and a list of relevant files is returned. Keywords are extracted from the text content after noisy data is removed during pre-processing. Each pre-processed keyword is identified by a number known as its term ID. From these term IDs, each text document is assigned a hash index, whose entries serve as key numbers in the document index. Because every term is represented by its term ID, comparing document terms with the user query through hash-based searching preserves the privacy of both. Because documents are identified by hash index keys, the text content is stored in encrypted numeric form, and a document is decrypted for a particular user only once it has been selected for reading. Experiments were carried out on real and artificial text datasets covering different topics. The results show that the proposed model of hash indexing and term-based retrieval improves privacy while keeping the retrieved files relevant to the query.
Keywords
Information Retrieval, Text Feature, Text Mining, Text Ontology
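
The retrieval pipeline summarized in the abstract (pre-processing, term identification, hash-based document index, query matching) can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper: the stop-word list, the HMAC-SHA256 keyed hash standing in for the term-ID hash, the secret key, the overlap-count ranking, and all function names are assumptions made for illustration. Encryption and decryption of the stored documents are omitted.

```python
import hashlib
import hmac
import re

# Hypothetical stop-word list; the paper does not specify one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "to", "for"}


def preprocess(text):
    """Lowercase, tokenize, and drop stop words (the 'noisy data')."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_key(term, secret=b"demo-secret"):
    """Map a term to a key number with a keyed hash (assumed stand-in
    for the term-ID hashing described in the abstract)."""
    digest = hmac.new(secret, term.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def build_index(documents):
    """Return {document name: set of key numbers}; no plaintext terms are kept."""
    return {name: {term_key(t) for t in preprocess(text)}
            for name, text in documents.items()}


def search(index, query):
    """Rank documents by the number of key numbers shared with the query."""
    query_keys = {term_key(t) for t in preprocess(query)}
    scores = {name: len(keys & query_keys) for name, keys in index.items()}
    # Keep only documents with at least one match; break ties by name.
    return sorted((n for n, s in scores.items() if s > 0),
                  key=lambda n: (-scores[n], n))


docs = {
    "sports.txt": "The football match ended in a draw",
    "finance.txt": "Stock market prices and the bond market",
}
index = build_index(docs)
print(search(index, "market prices"))
```

In this sketch the index holds only key numbers, so the query and the documents are compared without exposing their terms, which mirrors the privacy argument made in the abstract.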