
Here is an interesting comparison table, with comments, covering the following statistical packages: R, MATLAB, SAS, STATA and SPSS. I wish Statistica had been included. The table tells you which statistical methods are available in each package. The list of statistical methods is itself impressive. Note that Jackknife (a resampling method in the table below) has nothing to do with Jackknife regression (a new technique not implemented in any package yet, though Dr. Granville has promised to provide the source code).
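To make the distinction concrete, here is a minimal sketch of the classic jackknife as a resampling method: it recomputes a statistic on each leave-one-out subsample and uses the spread of those values to estimate a standard error. The function name `jackknife_se` and the sample values are made up for illustration and do not come from any of the packages in the table.

```python
import math
import statistics

def jackknife_se(data, estimator=statistics.mean):
    """Leave-one-out jackknife estimate of the standard error of `estimator`.

    `data` is assumed to be a list of numbers (hypothetical helper, not a
    function from any package discussed here).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the estimator n times, each time leaving one observation out.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)

# Illustrative data only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8]
se_mean = jackknife_se(sample)
se_median = jackknife_se(sample, estimator=statistics.median)
```

For the sample mean, the jackknife standard error coincides with the usual s / sqrt(n), which makes a convenient sanity check; for other statistics, such as the median, the jackknife can behave poorly and the bootstrap is often preferred.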

Also, statistical libraries are available in most programming languages, for instance Pandas in Python. Here are five interesting articles:

The table (below) and additional information about the various packages can be found here.

| Type of Statistical Analysis | R | MATLAB | SAS | STATA | SPSS |
|---|---|---|---|---|---|
| Nonparametric Tests | Yes | Yes | Yes | Yes | Yes |
| T-test | Yes | Yes | Yes | Yes | Yes |
| ANOVA & MANOVA | Yes | Yes | Yes | Yes | Yes |
| ANCOVA & MANCOVA | Yes | Yes | Yes | Yes | Yes |
| Linear Regression | Yes | Yes | Yes | Yes | Yes |
| Generalized Least Squares | Yes | Yes | Yes | Yes | Yes |
| Ridge Regression | Yes | Yes | Yes | | |
| Lasso | Yes | Yes | Yes | | |
| Generalized Linear Models | Yes | Yes | Yes | Yes | Yes |
| Mixed Effects Models | Yes | Yes | Yes | Yes | Yes |
| Logistic Regression | Yes | Yes | Yes | Yes | Yes |
| Nonlinear Regression | Yes | Yes | Yes | | |
| Discriminant Analysis | Yes | Yes | Yes | Yes | Yes |
| Nearest Neighbor | Yes | Yes | Yes | | Yes |
| Factor & Principal Components Analysis | Yes | Yes | Yes | Yes | Yes |
| Copula Models | Yes | Yes | Experimental | | |
| Cross-Validation | Yes | Yes | Yes | | |
| Bayesian Statistics | Yes | Yes | Limited | | |
| Monte Carlo, Classic Methods | Yes | Yes | Yes | Yes | Limited |
| Markov Chain Monte Carlo | Yes | Yes | Yes | | |
| Bootstrap & Jackknife | Yes | Yes | Yes | Yes | |
| EM Algorithm | Yes | Yes | Yes | | |
| Missing Data Imputation | Yes | Yes | Yes | Yes | Yes |
| Outlier Diagnostics | Yes | Yes | Yes | Yes | Yes |
| Robust Estimation | Yes | Yes | Yes | Yes | |
| Longitudinal (Panel) Data | Yes | Yes | Yes | Yes | Limited |
| Survival Analysis | Yes | Yes | Yes | Yes | Yes |
| Path Analysis | Yes | Yes | Yes | | |
| Propensity Score Matching | Yes | Yes | Limited | Limited | |
| Stratified Samples (Survey Data) | Yes | Yes | Yes | Yes | Yes |
| Experimental Design | Yes | Yes | | | |
| Quality Control | Yes | Yes | | Yes | Yes |
| Reliability Theory | Yes | Yes | Yes | Yes | Yes |
| Univariate Time Series | Yes | Yes | Yes | Yes | Limited |
| Multivariate Time Series | Yes | Yes | Yes | Yes | |
| Markov Chains | Yes | Yes | | | |
| Hidden Markov Models | Yes | Yes | | | |
| Stochastic Volatility Models | Yes | Yes | Limited | Limited | Limited |
| Diffusions | Yes | Yes | | | |
| Counting Processes | Yes | Yes | Yes | | |
| Filtering | Yes | Yes | Limited | Limited | |
| Instrumental Variables | Yes | Yes | Yes | Yes | |
| Simultaneous Equations | Yes | Yes | Yes | Yes | |
| Splines | Yes | Yes | Yes | Yes | |
| Nonparametric Smoothing Methods | Yes | Yes | Yes | Yes | |
| Extreme Value Theory | Yes | Yes | | | |
| Variance Stabilization | Yes | Yes | | | |
| Cluster Analysis | Yes | Yes | Yes | Yes | Yes |
| Neural Networks | Yes | Yes | Yes | | Limited |
| Classification & Regression Trees | Yes | Yes | Yes | | Limited |
| Boosting Classification & Regression Trees | Yes | Yes | | | |
| Random Forests | Yes | Yes | | | |
| Support Vector Machines | Yes | Yes | Yes | | |
| Signal Processing | Yes | Yes | | | |
| Wavelet Analysis | Yes | Yes | Yes | | |
| ROC Curves | Yes | Yes | Yes | Yes | Yes |
| Optimization | Yes | Yes | Yes | Limited | |


Comment by Sergey on August 26, 2014 at 8:20pm

Dear Dr. Granville, the Stata / SPSS implementation of non-linear regression is quite inflexible. The SPSS implementation is especially bad, allowing one to type only simple, non-recursive functions in a pop-up window. If you got a project about implementing a non-linear regression for a complex functional form, you would use R, Matlab or a similar programming language. Following the general vibe of responses, I changed the “Non-linear Regression / SPSS” field to “Limited” to avoid potential misinterpretations of the table. However, the truth is: the SPSS implementation of non-linear regression is unsatisfactory for most industry-level research.

Lasso is available in SPSS only as part of categorical regression, which does not cover linear regression and generalized linear models. So in 90% of real-life situations lasso is not there… Regarding AMOS, it is not part of the standard SPSS license and IBM is charging extra money for it. But I can see how one can make an argument that buying SPSS and AMOS together is still cheaper than buying the standard SAS portfolio. So I have updated the web-site accordingly. Thank you for your comments.

On a different note, I have updated the table recently to include recent advances, like attempts of the SAS Institute to address Boosting and Random Forests…. Any further feedback is appreciated.

Comment by Jeremy Benson on August 23, 2014 at 6:28pm

R does everything and is free.  I have used Matlab, SAS and SPSS.  Matlab and SAS are very good, but the biggest problem is that I can't install a version on my home computer freely.  Also, if I want something beyond the license that my company has purchased, then I have to go through a process to build a business case to get that "package".  If I want that for R, I just go to CRAN and download it.  The costs for R have mainly been books to learn R.  However, there are so many free resources on R that you can learn to do it without buying anything.

Comment by Chandrasekhara S. ("C.S.") Ganti on August 18, 2014 at 2:37am

Hello Vince G., 

Yes, it is a well-taken response, since it was a comparison of various software, as remarked.

Yes, it is a tricky comparison, I must admit. I am not a heavy IT / CS person, but I use all software to my advantage, in the proper context and for good applications proven over the long haul. On another note, old-style statisticians are die-hard. Thanks to all the FRS, ASA and ISI pioneers, from Fisher to Box, from H. Cramér to Sir C.R. Rao, from Shewhart to Deming: their contributions are invaluable, Big Data notwithstanding.

Speaking of Gartner, I religiously follow their reports and have seen their latest comparison for the BASEL compliance areas. I am looking forward to your take on that quadrant posting of various software vendors.

Thanks for your time,

**I merely observed. In fact, I have not used SAS for a bit. Thx.

Comment by Vincent Granville on August 17, 2014 at 6:56pm

Hi Chandrasekhara - I've been using SAS for many years and it satisfied my needs. It does not matter the number of functions a package offers, only whether it offers what you are likely to use, and if it does it well. Here's an article on how to select a statistical package.

On a higher level, producing software reviews and comparisons is very tricky. Your reviews get outdated very quickly, and easily invite heavy criticism. Gartner sometimes provides interesting comparisons. I'm glad we have members here filling the gaps found in these reports.

Comment by Mark Samuel Tuttle on August 17, 2014 at 3:42pm

Mathematica should be included here. Yes, I know it's proprietary, but one reason I pay for it (and use it) is the high degree of integration: a kind of "almost everything is a first-class object" notion. This makes using the statistical stuff they have easier and more productive. Regardless, I like seeing attempts at package evaluations. A future evaluation should include some notion of scalability. I find myself doing ever bigger problems on my laptop; one reason is that my current laptop is much more powerful than past versions.

Comment by Chandrasekhara S. ("C.S.") Ganti on August 17, 2014 at 12:50pm

Please let us know whether SAS really lacks Experimental Design and Quality Control. Is JMP not a part of SAS? Please confirm. Thanks.

Comment by Chandrasekhara S. ("C.S.") Ganti on August 17, 2014 at 12:46pm

The above table implies (?) that SAS does not have nonparametric tests. Please confirm whether the "Yes" entries are shifted one column to the right.

Comment by Vincent Granville on August 15, 2014 at 10:10am

A reader said that this table has many errors about SPSS: Just starting at the top, contrary to what is indicated, SPSS has - and has had for years - ridge regression, lasso, nonlinear regression, path analysis (Amos) and more.
