Regression Analysis

Regression analysis is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. It is used to describe and analyze relationships in data and to make predictions, with a model fitted to the data serving as the basis for the forecast. Regression and correlation analyses are counted among multivariate analytical methods and are used in many areas, including science, statistics, finance, and now also online marketing, where they help to analyze and partly predict the costs and turnover of products, campaigns, channels, and advertising media.

General information

Regression is not a new topic. The associated mathematical tools were already used to determine planetary orbits from astronomical observations. The method of least squares was published by Carl Friedrich Gauss in 1809, after Adrien-Marie Legendre had published it in 1805 and other mathematicians had laid the theoretical foundations. This method is considered a precursor of regression analysis. The tools were further developed and first used in biology and geology. Regression procedures remain an active research area that involves many different scientists.

How it works

A regression is based on the idea that a dependent variable is determined by one or more independent variables. If a causal relationship between the variables is assumed, the value of the independent variable affects the value of the dependent variable. For example, to find out how advertising investments affect sales, a regression analysis would examine the relationship between the investments and the sales. If the model captures this relationship well, it can serve as a basis for prediction.[1] Regression analyses have two central objectives. They are supposed to:

  • Quantify relationships and describe them using measured values and their graphical representation.
  • Provide forecasts and predictions.
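
The two objectives can be illustrated with a minimal sketch of a simple regression, assuming NumPy is available. The spend and sales figures, the function name fit_simple_regression, and the spend level of 35 are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical data: advertising spend and sales per period (made-up values).
ad_spend = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
sales = np.array([24.0, 47.0, 63.0, 88.0, 103.0])

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return intercept, slope

intercept, slope = fit_simple_regression(ad_spend, sales)

# Objective 1: the slope quantifies how sales change per unit of spend.
# Objective 2: the fitted line gives a forecast for a spend level not in the data.
predicted_sales = intercept + slope * 35.0
```

The slope is an estimate of association in this sample. Whether it reflects a causal effect of spend on sales depends on assumptions the calculation itself cannot check.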

Overview of various regression analyses:

  • Simple regression: Only one explanatory variable is used to explain the dependent variable.
  • Multiple regression: Several explanatory variables are related to a dependent variable.
  • Linear regression: The dependent variable is modeled as a linear function of one or more explanatory variables. More precisely, the model is linear in its parameters, so transformed explanatory variables can still be part of a linear model.
  • Nonlinear regression: If the relationship between dependent and independent variables cannot be expressed by a model that is linear in its parameters, nonlinear regression is used. These models can be very complex because the relationships between the variables cannot be mapped using simple mathematical methods.
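
The distinction between these types can be sketched in code. The example below assumes NumPy and uses hypothetical values. The helper fit_linear_model is an illustrative name, not part of any library. It shows that adding a column to the design matrix turns a simple regression into a multiple one, and that a curved relationship can still be handled by linear regression as long as the coefficients enter linearly.

```python
import numpy as np

def fit_linear_model(design_matrix, y):
    """Least-squares estimate of the coefficients b in y ≈ design_matrix @ b."""
    coefficients, *_ = np.linalg.lstsq(design_matrix, y, rcond=None)
    return coefficients

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 5.0, 10.0, 17.0])  # hypothetical values, here exactly 1 + x**2

# Simple regression: an intercept column and one explanatory variable.
simple_design = np.column_stack([np.ones_like(x), x])

# Several columns: x**2 is treated as a second explanatory variable.
# The model is curved in x but still linear in its coefficients.
quadratic_design = np.column_stack([np.ones_like(x), x, x ** 2])

simple_coefficients = fit_linear_model(simple_design, y)
quadratic_coefficients = fit_linear_model(quadratic_design, y)
```

A model such as y = a * exp(b * x) is different: b does not enter linearly, so the coefficients generally have to be found by iterative optimization rather than by a single least-squares solve.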

Although different regression methods exist, the structure of these methods is often similar in terms of steps:

  • Preparation of the data: In order to investigate developments and tendencies of variables, the data must be as complete and exact as possible. Rough calculations and plausibility checks are performed to check the data. If records are missing, missing-data techniques can be used; in statistics this is also called imputation. If the data and its relationships are to be displayed graphically, this can be taken into account during the preparation. Some regression models require the data in a specific form, into which it first has to be converted. This is the case, for example, in linear regression, where a linear relationship between the variables is assumed.
  • Fitting of the model: Each regression model includes an error term to account for deviations between the observed and the modeled values. The function used to keep these deviations small is partly determined by the model. In linear regression, for example, a linear function is fitted so that the deviations are as small as possible. Error values and approximations are calculated and integrated into the regression model.
  • Validation of the model used: The next step examines whether the regression model describes the relationship between independent and dependent variables, and how well. Statisticians have different procedures and approaches for checking the validity of a regression analysis. For example, particularly influential data points, which strongly affect the estimated relationship between the variables, are analyzed. In the end, a function should describe this relationship. Whether the function fits has to be established as part of the regression procedure.
  • Forecast of values: If the model adequately describes the relationship, it can be used for prediction purposes. Again, accuracy plays a central role. Any inaccuracies in the forecasts are calculated or estimated. Forecasts within the range of the data are referred to as interpolation. Statements that go beyond the observed data are called extrapolation. Interpolation is less problematic than extrapolation, whose assumptions have to be checked carefully.
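
The four steps above can be sketched as follows, assuming NumPy. The observations are hypothetical, mean imputation stands in for whatever missing-data technique is actually appropriate, and the forecast helper is an illustrative name.

```python
import numpy as np

# Hypothetical observations; np.nan marks a missing record.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, np.nan, 8.2, 9.8, 12.1])

# 1. Preparation: mean imputation, one simple (and crude) missing-data technique.
y_prepared = np.where(np.isnan(y), np.nanmean(y), y)

# 2. Fitting: least-squares line; np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(x, y_prepared, deg=1)

# 3. Validation: residuals and the coefficient of determination (R^2).
fitted = intercept + slope * x
residuals = y_prepared - fitted
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y_prepared - y_prepared.mean()) ** 2)

# 4. Forecast: label predictions outside the observed range as extrapolation.
def forecast(new_x):
    kind = "interpolation" if x.min() <= new_x <= x.max() else "extrapolation"
    return intercept + slope * new_x, kind
```

Calling forecast(3.5) returns a value labeled as interpolation, while forecast(10.0) is labeled as extrapolation and deserves more caution. A fuller validation would also look at individual influential points, which this sketch leaves out.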

The usefulness of a regression analysis depends on how well the model describes the actual data and its possible relationships. A central problem is the choice of model and, with it, the selection of the explanatory variables. Only significant correlations should be investigated. Each regression analysis therefore includes approaches for increasing accuracy, minimizing errors, and excluding statistical outliers that are not relevant to the object under investigation. For these reasons, models are often compared using key figures such as the coefficient of determination or, more generally, an information criterion.
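
As an illustration of such a comparison, the sketch below computes the coefficient of determination and one common information criterion, the Akaike information criterion (AIC), for two candidate models. It assumes NumPy, synthetic data generated from a straight line, and Gaussian errors. The AIC is given only up to an additive constant, which is enough for comparing models on the same data. The function name fit_and_score is hypothetical.

```python
import numpy as np

def fit_and_score(design_matrix, y):
    """Fit by least squares and return (R^2, AIC) for the fitted model."""
    coefficients, *_ = np.linalg.lstsq(design_matrix, y, rcond=None)
    residual_ss = np.sum((y - design_matrix @ coefficients) ** 2)
    total_ss = np.sum((y - y.mean()) ** 2)
    n, k = design_matrix.shape  # n observations, k fitted coefficients
    r_squared = 1.0 - residual_ss / total_ss
    # Gaussian-error AIC up to an additive constant; lower values are preferred.
    aic = n * np.log(residual_ss / n) + 2 * k
    return r_squared, aic

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)  # synthetic, truly linear

line = np.column_stack([np.ones_like(x), x])
cubic = np.column_stack([x ** p for p in range(4)])

r2_line, aic_line = fit_and_score(line, y)
r2_cubic, aic_cubic = fit_and_score(cubic, y)
```

Because the line is a special case of the cubic, the cubic model can never have a lower coefficient of determination on the same data. The AIC adds a penalty for each extra coefficient, which is why it is often the more useful figure when models of different size are compared.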

Relevance to online marketing

Regression analyses are used in online marketing, for example, to understand the customer journey using web analytics data or to support multi-channel marketing with reliable data. In practice, such analyses are complex and require specialist expertise. But the results can be very clear and tangible, depending on the model. For example, if attribution modeling is used to assess multiple channels such as direct sales, display ads, affiliates, social media, email, or referrals, regression analyses can show which of these channels have a good balance between investments and sales. At the corporate level and with partners who can carry out such analyses, the results are likely to be very helpful and could significantly increase the ROI of individual digital assets.[2]
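
A channel comparison of this kind could be sketched as a multiple regression of sales on spend per channel. The example assumes NumPy. The channel names, spend figures, and sales figures are invented for illustration and do not come from any real campaign.

```python
import numpy as np

# Hypothetical weekly spend per channel (one column each) and total sales.
channels = ["display", "email", "social"]
spend = np.array([
    [10.0, 5.0, 8.0],
    [12.0, 4.0, 9.0],
    [8.0, 6.0, 7.0],
    [15.0, 5.0, 12.0],
    [11.0, 7.0, 6.0],
    [9.0, 3.0, 10.0],
])
sales = np.array([60.0, 62.0, 55.0, 80.0, 66.0, 54.0])

# Multiple regression: an intercept column plus one column per channel.
design = np.column_stack([np.ones(len(sales)), spend])
coefficients, *_ = np.linalg.lstsq(design, sales, rcond=None)

# The intercept estimates sales at zero spend; each remaining coefficient
# estimates the change in sales per additional unit of spend in that channel.
baseline_sales = coefficients[0]
sales_per_unit_spend = dict(zip(channels, coefficients[1:]))
```

With only six observations the estimates are far too uncertain for real decisions. A practical analysis would need much more data and the validation steps described earlier.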


  1. TECHNIQUE #9: Regression Analysis. Retrieved on November 15, 2016.
  2. How To Use Regression Analysis To Estimate Incremental Revenue Opportunities. Retrieved on November 15, 2016.