
Excel is often poorly regarded as a platform for regression analysis. The regression add-in in its Analysis ToolPak has not changed since it was introduced in 1995, and it was a flawed design even back then. (See **this link** for a discussion.) That’s unfortunate, because an Excel file can be a very good place in which to build regression models, compare and refine them, create…

Added by Robert Nau on July 21, 2019 at 7:00am — No Comments

Emerging applications like machine learning (ML), big data analytics, and artificial intelligence (AI) have created the need for many companies to hire a highly skilled and experienced workforce. Demand for data scientists, ML engineers, and data engineers is booming and will only increase in the coming years. The January report from Indeed, one of the top job sites, showed a 29% increase in demand for data scientists year over year and a 344% increase since 2013.

**Salaries and…**

Added by Chris Kachris on May 17, 2019 at 4:30am — No Comments

Machine learning applications require powerful and scalable computing systems that can sustain the high computational complexity of these applications. Companies working in the domain of machine learning have to allocate a significant part of their budget to the OpEx of machine learning applications, whether they run in the cloud or on-prem.

Typical machine learning application…

Added by Chris Kachris on May 14, 2019 at 11:30pm — No Comments

Yes, I know, this has been tried a few times and no one listens... At least not yet. Despite several studies showing otherwise, teams still punt more than they should. Admittedly, some of these studies have been less than rigorous, and oftentimes assumptions are made that warrant scrutiny (assuming a 50% success rate on all 4th-down attempts, for example). But I don't think it is the lack of scientific rigor that keeps change at bay. I think the failure to adopt a novel strategy has a lot…

Added by Ray Hall on August 30, 2018 at 9:30am — No Comments

There are many good and sophisticated feature selection algorithms available in R. Feature selection refers to the machine learning case where we have a set of predictor variables for a given dependent variable, but we don’t know a priori which predictors are most important and whether the model can be improved by eliminating some of them. In linear regression, many students are taught to fit a data set to find the best model using so-called “least squares”. In most…

Added by Blaine Bateman on April 30, 2018 at 7:30am — No Comments

The concepts of p-value and level of significance are vital components of hypothesis testing and of advanced methods like regression. However, they can be a little tricky to understand, especially for beginners, and a good understanding of these concepts goes a long way toward understanding advanced concepts in statistics and econometrics. Here, we try to explain the concepts in an easy, logical manner. Hope this helps.

**P-value**

In hypothesis testing, we set…

Added by Smrati Sharma on November 29, 2017 at 4:30pm — 2 Comments
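As a rough illustration of how a p-value is compared against a level of significance, here is a minimal sketch for a two-sided z-test. The function names, the sample figures (sample mean 103, hypothesized mean 100, known standard deviation 15, n = 100), and the choice of a z-test are assumptions made for this example, not taken from the post.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


def reject_null(z: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis when the p-value is below alpha."""
    return two_sided_p_value(z) < alpha


# Hypothetical data: sample mean 103, null mean 100, known sd 15, n = 100.
z = (103 - 100) / (15 / math.sqrt(100))
p = two_sided_p_value(z)
print(f"z = {z:.2f}, p-value = {p:.4f}")
print("Reject at the 5% level:", reject_null(z, alpha=0.05))
print("Reject at the 1% level:", reject_null(z, alpha=0.01))
```

The same test statistic can lead to different decisions depending on the level of significance chosen before the test is run.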

Some research fields routinely categorize continuous predictor variables, and they are mostly the same fields that use ANOVA. They often do it through median splits: values above the median form the "high" group and values below the median form the "low" group. However, this does not seem to be a good idea, and some of the reasons are listed below:

- The median tends to vary from sample to sample. This makes the categories in various samples have various…

Added by Chirag Shivalker on October 24, 2017 at 10:00pm — No Comments
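The point about the median varying from sample to sample can be sketched with simulated data. Everything below (the normal population, the sample size of 40, the seed, and the `median_split` helper) is a hypothetical setup chosen for illustration.

```python
import random
import statistics


def median_split(values):
    """Return the sample median and a 'high'/'low' label for each value."""
    cutoff = statistics.median(values)
    labels = ["high" if v > cutoff else "low" for v in values]
    return cutoff, labels


rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Draw several samples from the same population and record each cutoff.
cutoffs = []
for _ in range(5):
    sample = rng.sample(population, 40)
    cutoff, _labels = median_split(sample)
    cutoffs.append(cutoff)

print("Median cutoffs across samples:", [round(c, 2) for c in cutoffs])
```

Because each sample produces its own cutoff, a value labeled "high" in one sample can be labeled "low" in another.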

The last few blog posts of this series discussed regression models at length. Fernando has built a multivariate regression model. The model takes the following shape:

price = -55089.98 + 87.34engineSize +…

Added by Pradeep Menon on August 30, 2017 at 4:30am — No Comments

Added by Goran S. Milovanović on April 14, 2017 at 11:00pm — No Comments

Linear Regression is one of the most widely used statistical models. If Y is a continuous variable, i.e. it can take decimal values, and is expected to have a linear relation with the X variables, this relation can be modeled with linear regression. It is usually the **first** model to fit if we are planning to develop a model for forecasting Y or trying to build a hypothesis about the relation of the Xs to Y.

The…

Added by Jishnu Bhattacharya on February 1, 2017 at 8:30pm — No Comments
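For the single-predictor case, the linear relation between X and Y can be fitted with the closed-form least-squares formulas. This is a minimal sketch on made-up numbers; the function name and the data are assumptions for illustration only.

```python
def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations that roughly follow y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")

# Forecast Y for a new X value using the fitted line.
print("Forecast at x = 6:", round(intercept + slope * 6.0, 2))
```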

*This article was posted by Arpan Gupta (Indian Institute of Technology).*

Let’s learn from a precise demo on fitting logistic regression on the Titanic data set for machine learning.

**Description**: On April 15, 1912, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew.…

Added by Emmanuelle Rieuf on January 16, 2017 at 11:00am — No Comments

Regressions are widely used to **estimate relations between variables or predict future values for a certain dataset**.

If you want to know how much of variable "x" interferes with…

Added by Renata Ghisloti Duarte Souza Gra on December 27, 2016 at 10:00am — No Comments

**INTRODUCTION TO THE RESEARCH QUESTION**

The research was conducted to find out what price maximises profit without sacrificing either demand for the product (because the price is too high) or the margins on the product (because the price is too low).

The goal is to experiment with different price levels for the same product in one market place and country to see how sales volumes change with prices and which volume level of…

Added by Bernard Antwi Adabankah on October 29, 2016 at 10:30pm — 3 Comments

TensorFlow is an open source machine learning (ML) library from Google. It has become particularly popular because of its support for Deep Learning. Apart from that, it is highly scalable and can run on Android. The documentation is well maintained, and several tutorials are available for different expertise levels. To learn more about downloading and installing TensorFlow, visit the official website.

To scratch the surface of this incredible ML library,…

Added by Aqib Saeed on July 7, 2016 at 12:00pm — No Comments

**[Previous Post]**

Single regression on Exxon's stock

**[Introduction of Multi-regression]**

Let's recall our last job. We conducted a single regression on Exxon Mobil's stock against the WTI crude oil spot price. The result was fantastic: the model accounts for 25% of the variation in the stock's movement. Put another way, that figure is the R-squared. The problem is "are you happy with the…

Added by Gregory Choi on May 20, 2016 at 9:05am — 1 Comment
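To make the R-squared idea concrete, here is a minimal sketch of how the share of explained variation is computed from observed values and model predictions. The numbers are invented for illustration and are not the Exxon or WTI data from the post; the function name is also an assumption.

```python
def r_squared(ys, predictions):
    """Share of the variation in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    # Total variation of the observations around their mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Variation left unexplained by the model.
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return 1 - ss_residual / ss_total


# Hypothetical observed values and fitted values from some regression.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]

print(f"R-squared = {r_squared(observed, fitted):.2f}")
```

An R-squared of 0.25, as quoted in the post, would mean the residual variation is three quarters of the total variation.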

Regression is the first technique you’ll learn in most analytics books. It is a very useful and simple form of supervised learning used to predict a quantitative response.

*Originally published on Ideatory…*

Added by Sudhanshu Ahuja on March 28, 2016 at 8:00pm — No Comments

UPDATE: Mar 20, 2016 - Added my new follow-up course on Deep Learning, which covers ways to speed up and improve vanilla backpropagation: momentum and Nesterov momentum, adaptive learning rate algorithms like AdaGrad and RMSProp, utilizing the GPU on AWS EC2, and stochastic batch gradient descent. We look at TensorFlow and Theano starting from the basics - variables, functions, expressions, and simple optimizations - from there, building a neural network seems simple! …

Added by LazyProgrammer.me on January 23, 2016 at 8:30pm — 2 Comments

This is the 2nd part of the series. Read the first part here: Logistic Regression Vs Decision Trees Vs SVM: Part I

In this part we’ll discuss how to choose between Logistic Regression, Decision Trees and Support Vector Machines. The most correct answer as mentioned in the…

Added by Aatash Shah on November 19, 2015 at 1:00am — No Comments

Classification is one of the major problems that we solve while working on standard business problems across industries. In this article we’ll discuss three of the major techniques used for classification: Logistic Regression, Decision Trees and Support Vector Machines [SVM].

All of the algorithms listed above are used in classification [SVM and Decision Trees are also used for regression, but we are not discussing that today!]. Time and again I have seen people asking which…

Added by Aatash Shah on November 19, 2015 at 12:31am — 1 Comment

Life scientists collect similar types of data on a daily basis. Statistical analysis of this data is often performed using SAS programming techniques, but programming for each dataset is a time-consuming job. The objective of this paper is to show how SAS programs are created for systematic analysis of raw data to develop a linear regression model for prediction, then to show how PROC SQL can be used to replace several data steps in the code, and finally to show how SAS macros are created on these…

Added by Venu Perla PhD on October 10, 2015 at 9:00am — No Comments

- Linear and logistic regression in Excel and R: try this free add-in (RegressIt)
- How to make ML engineers 5x more efficient
- How to save over $200k on your next machine learning project
- Another Analysis of Punting on 4th Down
- Simple automated feature selection using lm() in R
- p-value and level of significance explained
- When to Categorize Continuous Predictor in a Regression Model?

- Step-by-step video courses for Deep Learning and Machine Learning
- How to Lie with Visualizations: Statistics, Causation vs Correlation, and Intuition!
- Logistic Regression Vs Decision Trees Vs SVM: Part I
- Random Forests Algorithm
- How to forecast using Regression Analysis in R
- How to make ML engineers 5x more efficient
- Logistic Regression vs Decision Trees vs SVM: Part II


© 2020 Data Science Central ® Powered by


