International Journal of Innovative Research in Engineering and Management
Year: 2022, Volume: 10, Issue: 6
First page: 101, Last page: 109
Online ISSN: 2350-0557
DOI: 10.55524/ijircst.2022.10.6.18
DOI URL: https://doi.org/10.55524/ijircst.2022.10.6.18
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
Zaid Altaf, Ashish Oberoi
The purpose of text summarization is to quickly and accurately extract the most important information from documents. The proposed unsupervised method seeks to synthesise complete and informative summaries of bug reports (software artefacts). The approach employs Rapid Automatic Keyword Extraction (RAKE) and the term frequency-inverse document frequency (TF-IDF) method to identify relevant keywords and phrases. During sentence extraction, fuzzy C-means clustering is used to prioritise sentences whose degree of membership in a cluster exceeds a predefined threshold. Sentence selection is performed by a rule engine: information is extracted using the keywords and sentences chosen by the clustering process, and the rules are developed using domain knowledge. The proposed method produces a logical and well-organized summary of Apache bug reports. The extracted summary is then refined with hierarchical clustering, which removes unnecessary details and reorders the remaining content. The Apache Project Bug Report Corpus (APBRC) and the original Bug Report Corpus are used to evaluate the effectiveness of the proposed method. Performance is measured using precision, recall, pyramid precision, and F-score. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art baseline methods such as BRC and LRCA. It also achieves substantial gains over prior unsupervised methods such as Hurried and Centroid. It extracts the most relevant keyword phrases and sentences from each cluster to offer comprehensive coverage and a coherent summary. The average precision, recall, F-score, and pyramid precision on the APBRC corpus are 78.22%, 82.18%, 80.10%, and 81.66%, respectively.
M. Tech Scholar, Department of Computer Science & Engineering, RIMT University, Mandi Gobindgarh, Punjab, India
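The membership-threshold step described in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenisation, a smoothed TF-IDF weighting, a fuzzifier of m = 2, and centroids initialised from the first sentences; none of these choices, nor the function names, come from the paper. RAKE, the rule engine, and the hierarchical refinement are omitted.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one TF-IDF vector per sentence (whitespace tokens, smoothed IDF)."""
    docs = [s.lower().split() for s in sentences]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        length = max(len(d), 1)  # guard against empty sentences
        vectors.append([
            (tf[w] / length) * (math.log((1 + n) / (1 + df[w])) + 1.0)
            for w in vocab
        ])
    return vectors


def membership_row(point, centers, m):
    """Fuzzy C-means membership of one point in every cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        # The point coincides with a centre: full membership there.
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exp for d_k in dists) for d_j in dists]


def update_centers(points, memberships, m):
    """Recompute each centre as the membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append([
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ])
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two standard FCM updates for a fixed number of rounds."""
    for _ in range(iterations):
        memberships = [membership_row(p, centers, m) for p in points]
        centers = update_centers(points, memberships, m)
    return [membership_row(p, centers, m) for p in points], centers


def select_sentences(sentences, n_clusters=2, threshold=0.6):
    """Keep sentences whose strongest cluster membership reaches the threshold."""
    vectors = tfidf_vectors(sentences)
    initial = [list(v) for v in vectors[:n_clusters]]  # assumed initialisation
    memberships, _ = fuzzy_c_means(vectors, initial)
    return [s for s, row in zip(sentences, memberships) if max(row) >= threshold]
```

Calling `select_sentences(report_sentences, n_clusters=2, threshold=0.6)` on a list of sentences returns, in their original order, those whose highest cluster membership is at least 0.6. The threshold value here is illustrative only; the abstract does not state the value used.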