TITLE:
Non-Linear Matrix Completion
AUTHORS:
Fengrui Zhang, Randy C. Paffenroth, David Worth
KEYWORDS:
Matrix Completion, Data Pipeline, Machine Learning
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.12 No.1, February 29, 2024
ABSTRACT: Current methods for predicting missing values in datasets often rely on
simplistic approaches, such as taking the median value of an attribute, which limits their
applicability. Real-world observations can be diverse. Stock prices, for
example, range from prices just after an IPO to values just before a company’s collapse, and
some data points are missing because trading in a stock was suspended. In
this paper, we propose a novel approach using Nonlinear Matrix Completion
(NIMC) and Deep Matrix Completion (DIMC) to predict associations, and we conduct
experiments on financial data relating dates and stocks. Our method leverages various types of stock observations to
capture the latent factors that explain the observed date-stock associations.
Notably, our approach is nonlinear, making it suitable for datasets with nonlinear structure, such as the Russell 3000.
Unlike traditional methods that may suffer from information loss, NIMC and DIMC
maintain nearly complete information, especially when the parameters are
high-dimensional. We compare three methods: Inductive Matrix Completion, a
state-of-the-art linear method, together with Nonlinear Inductive Matrix Completion
and Deep Inductive Matrix Completion. Our findings show that the nonlinear
matrix completion method is particularly effective for handling nonlinear
structured data, as exemplified by the Russell 3000. Additionally, we evaluate
the information loss of the three methods across different dimensionalities.
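
To make the contrast between median imputation and matrix completion concrete, the following is a minimal sketch on a synthetic date-by-stock matrix. It is not the NIMC or DIMC method described in the abstract: it uses a plain linear low-rank completion (iterative truncated SVD) as a stand-in, and the function names, matrix sizes, rank, and missing rate are all assumptions chosen for illustration.

```python
import numpy as np


def median_impute(X):
    """Fill each missing entry (NaN) with the median of its column (one stock)."""
    filled = X.copy()
    col_medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = col_medians[cols]
    return filled


def low_rank_complete(X, rank=2, n_iters=200):
    """Linear low-rank completion by iterative truncated SVD.

    Illustrative baseline only; this is NOT the paper's NIMC/DIMC.
    """
    observed_mask = ~np.isnan(X)
    filled = median_impute(X)  # start from the median-filled matrix
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries fixed; update only the missing ones.
        filled = np.where(observed_mask, X, approx)
    return filled


def make_demo(n_dates=60, n_stocks=20, rank=2, missing=0.2, seed=0):
    """Build a synthetic low-rank date-by-stock matrix with random gaps (hypothetical data)."""
    rng = np.random.default_rng(seed)
    truth = rng.normal(size=(n_dates, rank)) @ rng.normal(size=(rank, n_stocks))
    observed = truth.copy()
    observed[rng.random(truth.shape) < missing] = np.nan
    return truth, observed


def rmse_on_missing(truth, observed, estimate):
    """Root-mean-square error measured only on the entries that were hidden."""
    hidden = np.isnan(observed)
    return float(np.sqrt(np.mean((truth[hidden] - estimate[hidden]) ** 2)))


if __name__ == "__main__":
    truth, observed = make_demo()
    for name, method in [("median", median_impute), ("low-rank", low_rank_complete)]:
        print(name, rmse_on_missing(truth, observed, method(observed)))
```

Because the synthetic matrix is exactly low-rank, the completion step can exploit structure shared across dates and stocks, which a per-column median ignores. Real price data would not satisfy this assumption exactly, which is the gap the nonlinear methods in the abstract aim to address.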