Time Reduction Mechanism in Information Extraction Using Parse Tree Query Language

K. Venkatesh; Mr. B. Vijaya Bhaskar Reddy

Abstract

Information extraction (IE) is the task of automatically extracting structured information from unstructured and semi-structured machine-readable documents. In this paper, we propose a new paradigm for information extraction. In this extraction framework, the intermediate output of each text processing component is stored, so that only the improved component has to be redeployed over the entire corpus. Extraction is then performed on both the previously processed data from the unchanged components and the updated data generated by the improved component. Performing this kind of incremental extraction can result in a tremendous reduction of processing time. To realize this new information extraction framework, we propose to choose database management systems over file-based storage systems to address the dynamic extraction needs. To demonstrate the feasibility of the incremental extraction approach, experiments are performed to highlight two important aspects of an information extraction system: efficiency and quality of extraction results.
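The incremental idea described above can be illustrated with a minimal sketch. It is not the authors' PTQL implementation: it assumes an in-memory SQLite database as a stand-in for the database management system, and the table layout, the `run_component` helper, and the toy `count_tokens`, `tag_v1`, and `tag_v2` components are all hypothetical names introduced only for illustration.

```python
import sqlite3


def make_store():
    # Hypothetical schema: one stored intermediate result per (document, component).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outputs ("
        " doc_id INTEGER, component TEXT, version INTEGER, result TEXT,"
        " PRIMARY KEY (doc_id, component))"
    )
    return conn


def run_component(conn, corpus, name, version, func, calls):
    """Run func only on documents whose stored output is missing or stale."""
    for doc_id, text in corpus.items():
        row = conn.execute(
            "SELECT version FROM outputs WHERE doc_id = ? AND component = ?",
            (doc_id, name),
        ).fetchone()
        if row is not None and row[0] == version:
            continue  # unchanged component: reuse the stored output
        conn.execute(
            "INSERT OR REPLACE INTO outputs VALUES (?, ?, ?, ?)",
            (doc_id, name, version, func(text)),
        )
        calls[name] = calls.get(name, 0) + 1  # count actual executions
    conn.commit()


def count_tokens(text):
    # Toy component standing in for an expensive processing step.
    return str(len(text.split()))


def tag_v1(text):
    # Toy entity tagger: keeps capitalized words, punctuation included.
    return ",".join(w for w in text.split() if w[0].isupper())


def tag_v2(text):
    # "Improved" toy tagger: also strips trailing punctuation.
    return ",".join(w.strip(".,") for w in text.split() if w[0].isupper())


corpus = {1: "Gene BRCA1 interacts with TP53.", 2: "Aspirin inhibits COX1."}
conn = make_store()
calls = {}

# Initial pass: every component runs on every document.
run_component(conn, corpus, "tokens", 1, count_tokens, calls)
run_component(conn, corpus, "entities", 1, tag_v1, calls)
first_pass = dict(calls)

# The tagger is improved (version 2); rerunning the pipeline only
# re-executes the tagger, while the token counts come from storage.
run_component(conn, corpus, "tokens", 1, count_tokens, calls)
run_component(conn, corpus, "entities", 2, tag_v2, calls)

# Extraction is then a query over the stored outputs.
entities = [
    r[0]
    for r in conn.execute(
        "SELECT result FROM outputs WHERE component = 'entities' ORDER BY doc_id"
    )
]
```

In this sketch the `calls` counter shows that the unchanged component is executed only during the first pass, which is the source of the time saving the abstract refers to; a real system would store parse trees and other linguistic annotations rather than short strings.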

Keywords

Text mining, query languages, information storage and retrieval

Cite this article as

K. V., M. B. V. B. R., "Time Reduction Mechanism in Information Extraction Using Parse Tree Query Language", International Journal of Innovative Research in Computer Science and Technology (IJIRCST), vol. 2, no. 5, pp. 17-21, 2014. Available from:

Corresponding Author

K. Venkatesh

K. Venkatesh received his B.Tech degree from the Department of Computer Science and Engineering at Sree Vidyanikethan Engineering College, A. Rangampet, Tirupathi (affiliated to JNTU Ananthapuramu). He is pursuing an M.Tech in the Department of Computer Science and Engineering at Shri Shirdi Sai Institute of Science and Engineering, Vadiyampeta, Ananthapuramu (affiliated to JNTU Ananthapuramu). His current research interests include “Time Reduction Mechanism in Information Extraction Using PTQL”.
