Exploratory Data Analysis
for Large and Complex Data
Bruce Ratner, Ph.D.
Typically, the data analyst approaches a problem directly with an "inflexible" procedure designed specifically for that purpose. For example, the statistical problem of predicting a continuous target variable (e.g., sales or profit) is solved with the "old" standard ordinary least-squares (OLS) regression model. This is in stark contrast to the newer machine learning approach, a "flexible," nonparametric, assumption-free procedure that lets the data define the form of the model itself. The working assumption that today’s large and complex data fit the OLS model, which was formulated in the small-data setting of its day over 200 years ago, is not tenable. A flexible, self-defining model for data of any size offers the potential for building a reliable, highly predictive model, something unimaginable two centuries ago.
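To make the contrast concrete, the following is a minimal sketch, not taken from the article, of how a predefined straight-line OLS form can miss structure that a constructed variable exposes. The data are a hypothetical toy set (y equals x squared), and the helper names ols_fit and r_squared are assumptions introduced only for illustration.

```python
def ols_fit(x, y):
    """Closed-form simple OLS: returns (slope, intercept) for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Hypothetical toy data: the true relationship is y = x**2.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

# The straight-line form imposed on the original variable explains nothing.
slope, intercept = ols_fit(x, y)
r2_original = r_squared(x, y, slope, intercept)

# The same OLS machinery on a constructed variable z = x**2 fits exactly.
z = [v ** 2 for v in x]
slope_z, intercept_z = ols_fit(z, y)
r2_constructed = r_squared(z, y, slope_z, intercept_z)

print(slope, intercept, r2_original)        # 0.0 4.0 0.0
print(slope_z, intercept_z, r2_constructed)  # 1.0 0.0 1.0
```

In this toy case the analyst has to know to construct z by hand; the article's argument is that a flexible method should discover such a variable from the data.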
The purpose of this article is to present the GenIQ Model©, a flexible, any-size data method (with unique scalability) that lets the data alone define the model. Specifically, the GenIQ Model automatically and simultaneously performs a trinity of analysis and modeling tasks: selecting important original variables; finding patterns within the data by constructing new important variables from the original variables (also known as Exploratory Data Analysis); and formulating a mathematical equation (model) based on the best set of original and constructed variables. GenIQ is based on the machine learning genetic paradigm inspired by Darwin’s Principle of Survival of the Fittest. It offers a clear advantage over current statistical methods, whose performance depends on theoretical assumptions, predefined model formulations, and data-type restrictions. Moreover, GenIQ offers both a time advantage and an intelligence advantage over regression-based methods, as the latter require human intervention to perform the trinity of tasks. For an eye-opening preview of the 9-step modeling process of GenIQ, click here. For FAQs about GenIQ, click here.
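The genetic paradigm described above can be sketched in a few lines. What follows is a generic toy genetic-programming search, not the GenIQ implementation (which the article does not disclose); the data, the target equation, and every name and setting (random_tree, graft, evolve, the population size, the 15-node size cap) are assumptions made for illustration. Each candidate model is an expression tree, so the variables it uses, the new variables it constructs, and the final equation are all evolved together.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_tree(rng, n_vars, depth):
    """Grow a random expression: a variable index (leaf) or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_vars)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, n_vars, depth - 1), random_tree(rng, n_vars, depth - 1))


def evaluate(tree, row):
    """Compute the value of an expression tree for one data row."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))


def fitness(tree, X, y):
    """Mean squared error of the tree's predictions (lower is fitter)."""
    return sum((evaluate(tree, r) - t) ** 2 for r, t in zip(X, y)) / len(y)


def size(tree):
    """Number of nodes in the tree."""
    return 1 if isinstance(tree, int) else 1 + size(tree[1]) + size(tree[2])


def variables_used(tree):
    """Indices of the original variables that appear in the tree."""
    if isinstance(tree, int):
        return {tree}
    return variables_used(tree[1]) | variables_used(tree[2])


def random_subtree(tree, rng):
    """Walk down the tree at random and return the subtree reached."""
    while isinstance(tree, tuple) and rng.random() < 0.6:
        tree = tree[rng.choice((1, 2))]
    return tree


def graft(tree, donor, rng):
    """Return a copy of tree with one randomly chosen subtree replaced by donor."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return donor
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, graft(left, donor, rng), right)
    return (op, left, graft(right, donor, rng))


def evolve(X, y, n_vars, pop_size=60, generations=30, seed=0):
    """Keep the fittest quarter each generation; refill by crossover or mutation."""
    rng = random.Random(seed)
    pop = [random_tree(rng, n_vars, 3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, X, y))
        survivors = pop[: pop_size // 4]  # survival of the fittest
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.choice(survivors), rng.choice(survivors)
            if rng.random() < 0.5:
                donor = random_subtree(b, rng)      # crossover: borrow from another parent
            else:
                donor = random_tree(rng, n_vars, 2)  # mutation: fresh random material
            child = graft(a, donor, rng)
            children.append(child if size(child) <= 15 else a)  # cap tree growth
        pop = survivors + children
    return min(pop, key=lambda t: fitness(t, X, y))


# Hypothetical toy data: y depends on x0 and x1 only; x2 is an irrelevant column.
data_rng = random.Random(1)
X = [[data_rng.randint(-3, 3) for _ in range(3)] for _ in range(40)]
y = [r[0] + r[0] * r[1] for r in X]

best = evolve(X, y, n_vars=3)
print("equation:", best)
print("mean squared error:", fitness(best, X, y))
print("variables used:", sorted(variables_used(best)))
```

Because the fittest trees are carried over unchanged, the best error never worsens from one generation to the next. Whether the search recovers the generating equation, or ignores the irrelevant column, depends on the seed and settings chosen here.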
For more information, call 1 800 DM STAT-1, or e-mail email@example.com.