TITLE:
Exploring the Big Data Using a Rigorous and Quantitative Causality Analysis
AUTHORS:
X. San Liang
KEYWORDS:
Causality, Big Data, Information Flow, Time Series, Causal Network
JOURNAL NAME:
Journal of Computer and Communications, Vol.4 No.5, May 26, 2016
ABSTRACT:
Causal analysis is a powerful tool for unraveling data complexity and hence
provides clues to achieving, say, better platform design and more efficient
interoperability and service management. Data science will surely benefit from
advances in this field. Here we introduce to this community a recent finding in
physics on causality and the rigorous, quantitative causality analysis that
follows from it. The resulting formula is concise in form, involving only
common statistics, namely sample covariances. A corollary is that causation
implies correlation, but not vice versa, resolving the long-standing
philosophical debate over correlation versus causation. The applicability to
big data analysis is validated with time series purportedly generated by hidden
processes. As a demonstration, a preliminary application to the gross domestic
product (GDP) data of the United States, China, and Japan reveals some subtle
USA-China-Japan relations in certain periods.
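
The abstract does not reproduce the formula itself. As an illustration only, the following minimal sketch assumes the bivariate, linear-model estimator commonly associated with this information-flow framework, in which the flow from one series to another is built from sample covariances of the two series and a forward-difference derivative. The function names, the toy data generator, and its coefficients are hypothetical and are not taken from the paper.

```python
import numpy as np


def information_flow(target, source, dt=1.0):
    """Estimate the information flow rate T_{source -> target}.

    Assumed estimator (a sketch, not quoted from the abstract):
        T_{2->1} = (C11*C12*C2d1 - C12**2 * C1d1) / (C11**2 * C22 - C11 * C12**2)
    where 1 = target, 2 = source, and d1 is the forward-difference
    approximation of d(target)/dt.
    """
    x1 = np.asarray(target, dtype=float)
    x2 = np.asarray(source, dtype=float)

    # Euler forward difference of the target series.
    dx1 = (x1[1:] - x1[:-1]) / dt

    # Sample covariance matrix of (x1, x2, dx1); rows are variables.
    C = np.cov(np.vstack([x1[:-1], x2[:-1], dx1]))
    C11, C12, C22 = C[0, 0], C[0, 1], C[1, 1]
    C1d1, C2d1 = C[0, 2], C[1, 2]

    return (C11 * C12 * C2d1 - C12**2 * C1d1) / (C11**2 * C22 - C11 * C12**2)


def simulate_one_way(n, seed=0):
    """Hypothetical toy data: x2 drives x1, but x1 does not drive x2."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    for k in range(n - 1):
        x1[k + 1] = 0.5 * x1[k] + 0.5 * x2[k] + rng.standard_normal()
        x2[k + 1] = 0.6 * x2[k] + rng.standard_normal()
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate_one_way(20000, seed=0)
    print("T(x2 -> x1):", information_flow(x1, x2))  # expected clearly nonzero
    print("T(x1 -> x2):", information_flow(x2, x1))  # expected close to zero
```

In this assumed form the numerator carries a factor of the cross-covariance C12, so the estimate vanishes when the two series are uncorrelated, while a nonzero correlation does not force a nonzero flow. That is consistent with the corollary stated in the abstract. The toy series are correlated in both directions, yet only the flow from x2 to x1 should come out appreciably different from zero.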