TITLE:
Optimizing Query Results Integration Process Using an Extended Fuzzy C-Means Algorithm
AUTHORS:
Naoual Mouhni, Abderrafiaa Elkalay, Mohamed Chakraoui
KEYWORDS:
Clustering, Classification and Association Rules, Database Integration, Data Warehouse and Repository, Heterogeneous Databases, Query Processing
JOURNAL NAME:
Journal of Software Engineering and Applications, Vol.7 No.5, May 6, 2014
ABSTRACT:
Cleaning duplicate data remains a major problem despite the many works that have addressed it, because the volume of data being processed grows exponentially and scalable, fast algorithms are required. The problem depends on the type and quality of the data, and varies with the size of the data set being manipulated. In this paper, we introduce a novel framework based on an extended fuzzy C-means algorithm that uses a topic ontology. This work aims to improve the OLAP querying process over heterogeneous data warehouses that contain big data sets by improving the integration of query results, eliminating redundancies with the extended classification algorithm, and measuring the loss of information.
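
For orientation, the following is a minimal sketch of the standard (non-extended) fuzzy C-means algorithm on which the framework builds. It is not the authors' extended algorithm: the abstract does not specify the topic-ontology extension, so that part is omitted. The function name `fuzzy_c_means`, its parameters, and the sample points are all illustrative assumptions.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50, eps=1e-9):
    """Standard fuzzy C-means (illustrative sketch, not the paper's extension).

    points  : list of equal-length tuples of floats
    centers : initial cluster centers (hypothetical starting guesses)
    m       : fuzzifier, must be > 1
    Returns (centers, memberships), where memberships[i][k] is the degree
    to which points[i] belongs to cluster k.
    """
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
        memberships = []
        for p in points:
            # Clamp distances so a point sitting on a center does not divide by zero.
            dists = [max(math.dist(p, c), eps) for c in centers]
            row = []
            for d_k in dists:
                denom = sum((d_k / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                row.append(1.0 / denom)
            memberships.append(row)
        # Center update: mean of the points weighted by u_ik^m
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    return centers, memberships


# Hypothetical numeric encodings of four query-result records:
# two near-duplicates around (0, 0) and two around (5, 5).
sample_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
final_centers, final_memberships = fuzzy_c_means(
    sample_points, centers=[(0.0, 0.0), (5.0, 5.0)]
)
```

In a deduplication setting, records whose memberships concentrate on the same cluster are candidates for being redundant; how the extended algorithm makes that decision is not described in the abstract.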