<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2d1 20170631//EN" "JATS-journalpublishing1.dtd">
<ArticleSet>
  <Article>
    <Journal>
      <PublisherName>IJIRCSTJournal</PublisherName>
      <JournalTitle>International Journal of Innovative Research in Computer Science and Technology</JournalTitle>
      <PISSN/>
      <EISSN/>
      <Volume-Issue>Volume 3 Issue 3</Volume-Issue>
      <PartNumber/>
      <IssueTopic>Computer Science</IssueTopic>
      <IssueLanguage>English</IssueLanguage>
      <Season>May - June 2015</Season>
      <SpecialIssue>N</SpecialIssue>
      <SupplementaryIssue>N</SupplementaryIssue>
      <IssueOA>Y</IssueOA>
      <PubDate>
        <Year>2019</Year>
        <Month>12</Month>
        <Day>02</Day>
      </PubDate>
      <ArticleType>Computer Sciences</ArticleType>
      <ArticleTitle>Detection of Similar Identities in XML Documents</ArticleTitle>
      <SubTitle/>
      <ArticleLanguage>English</ArticleLanguage>
      <ArticleOA>Y</ArticleOA>
      <FirstPage>134</FirstPage>
      <LastPage>138</LastPage>
      <AuthorList>
        <Author>
          <FirstName>Miss Amita Fulsundar</FirstName>
          <AuthorLanguage>English</AuthorLanguage>
          <Affiliation/>
          <CorrespondingAuthor>Y</CorrespondingAuthor>
          <ORCID/>
        </Author>
        <Author>
          <FirstName>Dr. K. V. Metre</FirstName>
          <AuthorLanguage>English</AuthorLanguage>
          <Affiliation/>
          <CorrespondingAuthor>N</CorrespondingAuthor>
          <ORCID/>
        </Author>
      </AuthorList>
      <DOI/>
      <Abstract>Duplicate detection is an important part of data cleaning; it is the process of detecting multiple representations of the same real-world object in data sources. A number of solutions are available for detecting duplicates in XML data. One novel method for XML duplicate detection is XMLDup, which uses a Bayesian network to evaluate the probability that two XML elements are duplicates. In addition, a network pruning strategy is used to improve the efficiency of the Bayesian network evaluation. A DOM tree construction algorithm is proposed for building the tree of the input XML data. It is observed that the DOM tree construction algorithm achieves higher efficiency in the detection of similar identities in XML documents.</Abstract>
      <AbstractLanguage>English</AbstractLanguage>
      <Keywords>Duplicate Detection, XML, DOM, Bayesian network, data cleaning.</Keywords>
      <URLs>
        <Abstract>https://ijircst.org/abstract.php?article_id=215</Abstract>
      </URLs>      
    </Journal>
  </Article>
</ArticleSet>