Regression (LR and MLR) and differences, not for the Economy. A professional analyst should be able to answer these three questions.

For a regression analysis to support inference that can be justified and trusted, the estimator it produces must be the best linear unbiased estimator, abbreviated BLUE. For that to hold, the data being processed must meet certain requirements. In statistics these requirements are checked with the so-called classical assumption tests. When the classical assumptions are met, the estimated coefficients are justified, accurate estimators of the parameters. The assumptions include:

  1. No multicollinearity: the independent variables in the regression model must not be collinear with one another. Multicollinearity is the condition in which there is a perfect or near-perfect linear relationship among the independent variables.
  2. Homoscedasticity: the variance of the error term must be the same (constant) across all observations. A regression model that deviates from this condition is heteroscedastic.
  3. Homogeneity: the sample data should be drawn from populations with the same range or variance.
  4. No autocorrelation (serial correlation), that is, no correlation between sample observations ordered in time, as in time series data. This means there is no influence among the observations in the model across a time lag. A deviation occurs when the current value of a variable affects values in future periods.
  5. The independent variables must take fixed values in repeated sampling, meaning that they are not correlated with the error term in any observation.
  6. The errors must be normally distributed, that is, the disturbance term follows a normal distribution. This supports the validity, stationarity, and reliability of the data in the available variables.
  7. Linearity: the linear specification of the regression model must be correct. Before accepting that a linear regression model is the best model, it is necessary to test for linearity first.

All of these classical assumptions must be met in order to build a regression model that can be justified. Testing the assumptions is therefore intended to ensure that the parameter estimators are unbiased, efficient, and consistent, so that the regression equation can be trusted.
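
As a rough illustration of how one of these assumptions might be checked, the sketch below fits a simple linear regression by least squares and compares the spread of the residuals in the lower and upper halves of the data. It is loosely inspired by the Goldfeld-Quandt test, but it reuses the residuals of a single fit instead of fitting each half separately, so the ratio is only an informal indicator. The data, the error patterns, and all function names are made up for the example.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed y minus fitted y for every observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def variance_ratio(x, resid):
    """Sum of squared residuals in the upper half of x over the lower half."""
    ordered = [r for _, r in sorted(zip(x, resid))]
    half = len(ordered) // 2
    ss_low = sum(r * r for r in ordered[:half])
    ss_high = sum(r * r for r in ordered[-half:])
    return ss_high / ss_low


def spread_ratio_for(x, y):
    """Fit the line, then compare residual spread between the two halves."""
    intercept, slope = fit_simple_ols(x, y)
    return variance_ratio(x, residuals(x, y, intercept, slope))


# Hypothetical data: y = 2 + 3x plus an error term.
x = list(range(1, 11))
# Error whose size grows with x (heteroscedastic by construction).
growing_error = [0.1 * xi * (1 if xi % 2 else -1) for xi in x]
# Error with the same size everywhere (homoscedastic by construction).
constant_error = [0.5 * (1 if xi % 2 else -1) for xi in x]

y_hetero = [2 + 3 * xi + e for xi, e in zip(x, growing_error)]
y_homo = [2 + 3 * xi + e for xi, e in zip(x, constant_error)]

ratio_hetero = spread_ratio_for(x, y_hetero)
ratio_homo = spread_ratio_for(x, y_homo)

print("spread ratio, growing error: ", round(ratio_hetero, 2))
print("spread ratio, constant error:", round(ratio_homo, 2))
```

A ratio far from 1 suggests that the error variance is not constant. A formal test would also need a reference distribution and a significance level, which this sketch leaves out.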

But what is the problem?

In statistics, the classical assumptions are treated as fulfilled simply because the aim is to find a causal relationship in which the independent variables affect the dependent variable. In economics, however, this certainly cannot be taken for granted, because economic variables each have their own behavior, which makes it possible for them to violate these assumptions.
It follows that the assumptions taken as correct in statistics need to be re-examined, which means reprocessing the existing data: adding or removing data, combining data, converting data into a particular form (differencing or integrating), and so on. This can also be called manipulating the data with the intention of transforming the regression model so that it is expected to meet the classical assumptions. For example, a regression model that exhibits multicollinearity violates the assumption of no collinearity, so a way must be found to correct that deviation. There are several ways to address multicollinearity, among others:

  1. Check theoretically whether the independent variables are related to one another. This means looking for theoretical support in the literature when selecting independent variables.
  2. Combine cross-section data with time series data, which is referred to as pooling the data.
  3. Remove one of the variables from the model.
  4. Transform the existing variables in the model.
  5. Add new data, namely by increasing the number of observations.
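
To make the remedies above more concrete, here is a minimal sketch of how multicollinearity between two predictors could be measured with the variance inflation factor (VIF), and how transforming one variable (remedy 4) might reduce it. With exactly two predictors, the R-squared from regressing one on the other equals their squared Pearson correlation, so VIF = 1 / (1 - r^2). The income and consumption figures are invented, and the choice of the consumption-to-income ratio as the transformation is only an assumption made for illustration.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: consumption moves almost in step with income.
income = [10, 12, 15, 18, 20, 24, 27, 30]
consumption = [8.1, 9.5, 12.2, 14.3, 16.1, 19.0, 21.8, 23.9]

vif_raw = vif_two_predictors(income, consumption)

# Remedy 4 (transform a variable): use the consumption share of income
# in place of the consumption level.
consumption_share = [c / i for c, i in zip(consumption, income)]
vif_transformed = vif_two_predictors(income, consumption_share)

print("VIF with raw consumption: ", round(vif_raw, 1))
print("VIF with consumption share:", round(vif_transformed, 2))
```

A common rule of thumb treats a VIF above about 10 as a sign of serious multicollinearity, although the threshold is a convention and not a strict rule. Note that the transformed variable also changes what the regression coefficient means, which is the kind of side effect this article goes on to question.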

The same applies to the other classical assumptions: the problems arise from specific causes. Take the assumption that a regression model must not contain autocorrelation. Autocorrelation is in fact caused by several things, among others:

  1. Inertia. This occurs especially in time series data, because changes in economic conditions usually do not take effect immediately. For example, when the BI rate changes, other banks need at least three months to adjust their own rates.
  2. Specification bias. When a regression model omits, for whatever reason, one or more relevant variables, the omission may cause autocorrelation. Such a model is said to suffer from specification bias. A variable whose omission results in autocorrelation must therefore remain in the model, so that the model is unbiased.
  3. Wrong functional form. Autocorrelation arises from errors in choosing the function, which produce autocorrelation in the error term. For example, a model that should be expressed as a nonlinear function is instead expressed as a linear function.
  4. Time lag. This is related to the first cause, inertia. If the dependent variable turns out to be influenced not only by the independent variables but also by its own value in the previous period, autocorrelation can result. For example, the amount of exports is influenced not only by inflation in the current period but also by exports in the previous period.
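
Whatever the cause, autocorrelation is usually looked for in the residuals of the fitted model. The sketch below computes the Durbin-Watson statistic, a standard first check for first-order autocorrelation: values near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. Both residual series are invented for the example. The first one drifts slowly, as residuals might under inertia, and the second one changes sign irregularly.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(r * r for r in resid)
    return num / den


# Hypothetical residuals that drift slowly over time (persistent errors).
persistent_resid = [1.0, 1.2, 0.9, 0.7, 0.2, -0.3, -0.8, -1.1, -0.9, -0.9]

# Hypothetical residuals with no obvious pattern over time.
irregular_resid = [0.5, 0.3, -0.4, -0.2, 0.6, -0.5, -0.3, 0.4, 0.2, -0.6]

dw_persistent = durbin_watson(persistent_resid)
dw_irregular = durbin_watson(irregular_resid)

print("Durbin-Watson, persistent residuals:", round(dw_persistent, 2))
print("Durbin-Watson, irregular residuals: ", round(dw_irregular, 2))
```

In practice the statistic is compared with tabulated lower and upper bounds that depend on the sample size and the number of regressors. The test is also known to be unreliable when a lagged dependent variable is among the regressors, which matters for the time lag case in point 4.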


In conclusion, several problems remain in examining the economy and in the use of methods of analysis. Broadly speaking, these issues include:

  • The risk of economic uncertainty is still felt by the public, such as entrepreneurs, investors, and other business people. The government also regards the risk of economic uncertainty as a barrier to achieving economic goals, and it must solve economic problems either with unilateral decrees and regulations or democratically through the legislative process.
  • Information, in the form of data on economic factors available from the real world, is still not used optimally to support the achievement of economic goals, because the methods that exist today have not been able to make optimum use of it. For example, multiple regression analysis is considered in theory to be no longer effective when a model includes more than seven independent variables.
  • Theoretically, the relations in economic theory ignore the effect of random variables, that is, they are purely deterministic. In economic theory, influences outside the variables included in the analysis are considered constant, so the analysis aims only at general conclusions.
  • Regression analysis, the method recommended by economists and econometricians, is still not used to its full potential. This is due to the conditions (the classical assumptions) that must be met. According to experts, these requirements may not be satisfiable, yet they cannot be ignored either.

Adjusting the data to fulfill the classical assumptions of regression analysis is a form of simplification in applying modern economics, which is an empirical science. It turns out that this ignores something important and fundamental: the adjustments change the real prices or values that were observed. The ability of regression analysis to remain fully empirical is therefore doubtful.

Based on this description of the problem, the next point is the questions that arise:

  • Is there a tool or method of analysis that can eliminate, or at least minimize, the risk of uncertainty in the economy?
  • Is there a tool or method of analysis that can optimize the use of information, in the form of data on these economic factors, to support the achievement of economic goals?
  • Is there a new tool or analytical method that can address the shortcomings of existing methods of analysis?

The analyst should be a problem solver, not a troublemaker.




Comment by Paul Tulloch on May 7, 2017 at 6:11pm

Another point I would add is doing a regression from a sample of a population, which again takes knowledge and training.

Comment by Dalila Benachenhou on November 19, 2016 at 4:10am

When so many predictive models exist, one is obliged to find the best model for the given dataset.

Of course, you can force linear regression to model non-linear relationships, maybe through piecewise linear regression, but why, when other more appropriate models exist?

Comment by Thomas Lincoln on November 18, 2016 at 11:23am

For those interested in understanding why finance and economic models don't scale well and are not linear, please read this article by a PhD model maker.

Comment by Dalila Benachenhou on November 17, 2016 at 4:29pm

To Thomas Lincoln, I do like your comment.

To Jeffry: If you cannot find a transform that makes your residuals normally distributed, use a non-linear approach or a non-parametric approach.

Most social science problems cannot be easily solved using linear models or parametric approaches.

If you have missing variables, and you cannot get a viable model, then rethink the problem.  You probably have the wrong assumptions about the potential predictors needed to predict the response variable.

In many cases, a matrix scatterplot can help you see whether a linear relationship between the predictors and the response variable is possible. For instance, if the scatterplot of a predictor against the response variable looks like a widening and rounding tunnel, you can just apply a log function to the predictor to get a linear relation between the predictor and the response variable.

Comment by Thomas Lincoln on November 14, 2016 at 5:16pm

Economics is the study of the behavior of humans handling money. That makes serial or auto correlation impossible to ignore. Human decisions are flawed, not linear, and more behavioral than scientific. Likewise, economic activity does not scale linearly indefinitely. Herding of local behaviors within a business span of control may cause global economic growth to decline. Outsourcing as much as possible offshore makes sense for a single company; if all companies outsource offshore, there will not be any jobs to support the economy, because employees are the consumers as well, and without employees there are no customers. . . something that economics in business conveniently ignores. The problem is that aggregate economic data series do not capture the information needed to realize that the elites have the power to kill the economy, without realizing that they are doing it together when each thought it was practicing proper corporatism. I have to stop now. . .

© 2018   Data Science Central   Powered by