A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation

PP. 1076-1085
DOI: 10.4236/am.2015.66098

ABSTRACT

Exchanging data in XML format has become increasingly popular and widely applied, because XML documents are simple to maintain and transfer. Accelerating search within such documents therefore contributes to search-engine efficiency. In this paper, we propose a technique for detecting similarity in the structure of XML documents and then clustering the documents with a Delaunay Triangulation method. The technique represents the structure of an XML document as a time series in which each occurrence of a tag corresponds to a given impulse. The Discrete Fourier Transform then serves as a simple way to analyze these signals in the frequency domain and to build similarity matrices from a distance measure, so that the documents can be grouped into clusters. We use Delaunay Triangulation as the clustering method for the d-dimensional points that represent the XML documents. The results show significant efficiency and accuracy compared with common methods.
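The first stages described in the abstract (tag occurrences as impulses, a Discrete Fourier Transform, and a distance between spectra) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the integer tag coding, the fixed transform length `n`, the Euclidean distance on DFT magnitudes, and the function names `tag_series`, `dft_magnitudes`, and `spectral_distance` are all assumptions made here for illustration. The Delaunay clustering step is not covered.

```python
import cmath
import math
import xml.etree.ElementTree as ET


def tag_series(xml_text, codes):
    """Encode an XML document as a series of impulses, one per element.

    Assumption: each distinct tag name gets an integer code in order of
    first appearance, stored in the shared dict `codes` so that several
    documents use the same coding. The paper's encoding may differ.
    """
    root = ET.fromstring(xml_text)
    series = []
    for elem in root.iter():  # elements in document order
        code = codes.setdefault(elem.tag, len(codes) + 1)
        series.append(float(code))
    return series


def dft_magnitudes(series, n):
    """Magnitudes of the n-point DFT; the series is zero-padded or truncated to n."""
    padded = (list(series) + [0.0] * n)[:n]
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(padded)))
        for k in range(n)
    ]


def spectral_distance(a, b, n=16):
    """Euclidean distance between DFT magnitude spectra (an assumed measure)."""
    fa, fb = dft_magnitudes(a, n), dft_magnitudes(b, n)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(fa, fb)))


# Hypothetical documents: doc_a and doc_b share a structure, doc_c does not.
codes = {}
doc_a = "<lib><book><title/><author/></book></lib>"
doc_b = "<lib><book><title>X</title><author>Y</author></book></lib>"
doc_c = "<lib><book><title/></book><book><title/></book><cd/></lib>"
series_a, series_b, series_c = (tag_series(d, codes) for d in (doc_a, doc_b, doc_c))

same_structure = spectral_distance(series_a, series_b)
different_structure = spectral_distance(series_a, series_c)
```

Under these assumptions, documents that differ only in text content produce identical series and a distance of zero, while structurally different documents yield a positive distance. A matrix of such pairwise distances is the kind of input a clustering step would consume.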

Share and Cite:

Shafieian, N. (2015) A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation. Applied Mathematics, 6, 1076-1085. doi: 10.4236/am.2015.66098.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.