<p>Comments - 10 types of regressions. Which one to use? - Data Science Central</p>
<p>Alan Dunham, 2016-06-02:</p>
<p>Nice thumbnail outline. FYI, the term 'jackknife' was also used by Bottenberg and Ward, <span style="text-decoration: underline;">Applied Multiple Linear Regression</span>, in the '60s and '70s, but in the context of segmenting. As mentioned by <a href="http://www.datasciencecentral.com/profile/KalyanaramanK" class="fn url">Kalyanaraman</a> in this thread, econometrics offers other approaches to addressing multicollinearity, autocorrelation in time series data, solving simultaneous equation systems, heteroskedasticity, and over- and under-identification.</p>
<p>Jamie Lawson, 2016-01-10:</p>
<p>I'm puzzled why there isn't more attention here to the underlying model. If you have strong reason to believe that the underlying model is linear, then linear regression is fine. If you have strong reason to believe it's sigmoidal, then linear regression is an unlikely candidate. What it usually boils down to, in my experience, is defining the model and defining the norm. Answers to those two questions pretty much define the problem you are solving, and given that, there is a (usually) unique solution. It is frustrating to me when I see people typing away at the keyboard without a solid description of the problem they are solving. Once you have that problem definition, the specific method of solution is often pretty clear.</p>
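<p>The point about matching the model to the data can be sketched numerically. Everything below (the logistic generating curve, the noise level, the crude grid search) is an illustrative assumption, not anything from this thread: when the process is sigmoidal, a least-squares line fits far worse than a curve of the right shape under the same L2 norm.</p>

```python
import numpy as np

# Sigmoidal data with small Gaussian noise (assumed for illustration)
rng = np.random.default_rng(0)
x = np.linspace(-6, 6, 200)
y = 1 / (1 + np.exp(-x)) + rng.normal(0, 0.02, x.size)

# Model 1: straight line, L2 norm (ordinary least squares)
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
sse_linear = np.sum((A @ coef - y) ** 2)

# Model 2: logistic curve 1/(1+exp(-k*x)), same L2 norm,
# with k chosen by a crude grid search
ks = np.linspace(0.1, 3, 30)
sse_sigmoid = min(np.sum((1 / (1 + np.exp(-k * x)) - y) ** 2) for k in ks)

print(sse_linear, sse_sigmoid)
```

<p>Same data, same norm, different model: the sum of squared errors drops by more than an order of magnitude once the model class matches the generating process.</p>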
<p>imry kissos, 2015-04-07:</p>
<p>Using Python's diabetes dataset (scikit-learn) I created a visualization to show the support vector positions in SVR:</p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2773307767?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2773307767?profile=original" width="541" class="align-full"/></a></p>
<p>I also created a visualization of different regression methods on the same dataset, using non-optimized hyperparameters:</p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2773307825?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2773307825?profile=original" width="593" class="align-full"/></a></p>
<p>imry kissos, 2015-04-07:</p>
<p>Another type of regression that I find very useful is <strong>Support Vector Regression</strong>, proposed by Vapnik, which comes in two flavors:</p>
<p>SVR (Python: sklearn.svm.SVR): the regression depends only on support vectors from the training data. The cost function for building the model ignores any training data epsilon-close to the model prediction.</p>
<p>NuSVR (Python: sklearn.svm.NuSVR): lets you limit the number of support vectors used by the SVR.</p>
<p>As in support vector classification, in SVR different kernels can be used in order to build more complex models using the kernel trick.</p>
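<p>A minimal sketch of the two flavors on the diabetes dataset mentioned in this thread; the kernel, C, epsilon, and nu values below are arbitrary choices for illustration, not tuned hyperparameters:</p>

```python
from sklearn.datasets import load_diabetes
from sklearn.svm import SVR, NuSVR

X, y = load_diabetes(return_X_y=True)

# epsilon-SVR: training points within epsilon of the prediction incur
# no loss, so only points outside the epsilon-tube become support vectors
svr = SVR(kernel="rbf", C=100.0, epsilon=5.0).fit(X, y)

# Nu-SVR: nu (in (0, 1]) lower-bounds the fraction of support vectors
# and upper-bounds the fraction of margin errors
nusvr = NuSVR(kernel="rbf", C=100.0, nu=0.5).fit(X, y)

# Indices of the support vectors each model ended up using
print(len(svr.support_), len(nusvr.support_))
```

<p>Raising epsilon shrinks the SVR support set, while nu controls it directly in NuSVR.</p>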
<p>J.T. Radman, 2014-08-01:</p>
<p>What are folks' thoughts on MARS (Multivariate Adaptive Regression Splines) as a regression technique? R: earth. Python: py-earth. Salford Systems owns the commercial MARS implementation.</p>
<p><a href="http://www.slideshare.net/salfordsystems/evolution-of-regression-ols-to-gps-to-mars" target="_blank">http://www.slideshare.net/salfordsystems/evolution-of-regression-ols-to-gps-to-mars</a></p>
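<p>For readers without earth/py-earth installed, here is a hedged sketch (plain NumPy, a single assumed knot at t = 0.5, made-up piecewise data) of the hinge-function basis MARS builds its models from; real MARS selects knots and terms adaptively:</p>

```python
import numpy as np

# Piecewise-linear data with a kink at x = 0.5 (assumed for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.where(x < 0.5, 2 * x, 1 - 1.5 * (x - 0.5)) + rng.normal(0, 0.05, 200)

# MARS basis: an intercept plus the hinge pair max(0, x-t), max(0, t-x);
# ordinary least squares on these columns gives a piecewise-linear fit
t = 0.5
B = np.column_stack([np.ones_like(x),
                     np.maximum(0, x - t),
                     np.maximum(0, t - x)])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
pred = B @ coef
print(coef)
```

<p>With the knot placed at the true kink, the hinge basis recovers both slopes; MARS automates that knot search over the data.</p>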
<p>Iga Korneta, 2014-07-30:</p>
<p>I'd love to see a case study showing how different methods provide different results.</p>
<p>Vincent Granville, 2014-07-24:</p>
<p>About R implementations, here is a comment by Alan Parker (see also Amy's comment below):</p>
<p><em>The CRAN task view “Robust statistical methods” gives a long list of regression methods, including many that Vincent mentions. Here are some that are not mentioned there:</em><br/><br/><em>Regression in unusual spaces. This subject is old. It is usually addressed under the title “Compositional data” (see the Wikipedia entry). The late John Aitchison founded this area of statistics. Googling his name + “compositional data” gives access to a number of his articles. The R package “compositions” deals with it comprehensively. Another package treats the problem using robust statistics: “robCompositions”.</em><br/><br/><em>Bayesian regression. I find Bayesian stuff conceptually hard, so I am using John Kruschke’s friendly book “Doing Bayesian Data Analysis”. Chapter 16 is on linear regression. He provides a free R package to carry out all the analyses in the book. The CRAN view “Bayesian” has many other suggestions. Package BMA does linear regression, but packages for Bayesian versions of many other types of regression are also mentioned.</em></p>
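<p>The Bayesian pointers above are R-centric (Kruschke's book, BMA). As a rough Python counterpart, and purely as an illustrative sketch with an assumed Gaussian prior and known noise variance, the conjugate posterior for linear regression has a closed form:</p>

```python
import numpy as np

# Synthetic linear data (true weights and noise level are assumptions)
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_w = np.array([1.0, 2.0])
sigma, tau = 0.5, 10.0          # noise sd; prior sd on the weights
y = X @ true_w + rng.normal(0, sigma, n)

# With prior w ~ N(0, tau^2 I) and noise N(0, sigma^2), the posterior
# mean is the ridge-like estimate (X'X + lambda I)^{-1} X'y,
# lambda = sigma^2 / tau^2, with a matching closed-form covariance
lam = sigma**2 / tau**2
A = X.T @ X + lam * np.eye(2)
post_mean = np.linalg.solve(A, X.T @ y)
post_cov = sigma**2 * np.linalg.inv(A)
print(post_mean)
```

<p>With a weak prior (large tau), the posterior mean approaches the ordinary least-squares fit; the covariance quantifies what OLS alone does not.</p>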
<p>Kalyanaraman K, 2014-07-24:</p>
Yes. ARIMA is one among the models I considered.
<p>Mirko Krivanek, 2014-07-23:</p>
<p>I think what Kalyanaraman has in mind is auto-regressive models for time series, like ARIMA processes and Box & Jenkins types of tools to estimate the parameters. A simple form is x(t) = a * x(t-1) + b * x(t-2) + error, where t is the time, a, b are the "regression" coefficients, and a, b are positive numbers satisfying a + b < 1 (otherwise the time series is non-stationary and can explode).</p>
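<p>The AR(2) form above can be simulated and its coefficients recovered by ordinary least squares on lagged copies of the series; a = 0.5 and b = 0.3 are arbitrary stationary choices (a + b < 1) for illustration:</p>

```python
import numpy as np

# Simulate x(t) = a*x(t-1) + b*x(t-2) + error
rng = np.random.default_rng(3)
a, b, n = 0.5, 0.3, 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = a * x[t - 1] + b * x[t - 2] + rng.normal(0, 1.0)

# Regress x(t) on x(t-1) and x(t-2): the "regression" view of AR
X = np.column_stack([x[1:-1], x[:-2]])
coef, *_ = np.linalg.lstsq(X, x[2:], rcond=None)
print(coef)
```

<p>Box-Jenkins style tools (e.g. in statsmodels) add model identification and diagnostics on top of this basic estimation step.</p>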
<p>Kalyanaraman K, 2014-07-23:</p>
Hi Vincent<br />
I was thinking about the class of regressions where the data vary over time, say time series. You may know that econometric methods contain a lot of alternative versions of regression, depending upon the type of violation of the basic assumptions of the linear model. You are right when you say jackknife and transformations may address some of these issues, but not all. Thus there are regressions with appropriate transformations to control heteroscedasticity; regressions with AR(1) disturbances; regressions with distributed lags or a geometric lag structure of explanatory variables; regressions with lagged explained variables leading to partial adjustment and adaptive expectation models; regressions with stochastic regressors; and regressions with errors in measurement leading to regression with instrumental variables. Above all, there is the problem of co-integrated models in regression. I was just adding to your list.<br />
Kalyanaraman
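<p>One of the fixes in Kalyanaraman's list, handling heteroscedasticity with weighted least squares, can be sketched directly; the variance model (noise standard deviation proportional to x) and the true coefficients below are assumptions for illustration:</p>

```python
import numpy as np

# Heteroscedastic data: noise sd grows with x, violating the
# constant-variance assumption of ordinary least squares
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1, 10, n)
y = 3.0 + 2.0 * x + rng.normal(0, 0.5 * x)

X = np.column_stack([np.ones(n), x])
w = 1.0 / x**2                  # inverse-variance weights

# WLS solves (X' W X) beta = X' W y
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta)
```

<p>OLS would still be unbiased here, but WLS with the correct inverse-variance weights is the efficient estimator and gives valid standard errors.</p>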