Single regression on Exxon's stock
[Introduction of Multi-regression]
Let's recall our last exercise. We ran a single regression of Exxon Mobil's stock on the WTI crude oil spot price. The result was fantastic: the model accounts for 25% of the variation in the stock's movement. Put another way, the R-squared is 0.25. The problem is "are you happy with the…
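As a rough sketch of how a single regression and its R-squared can be read off in R, here is a minimal example on simulated data. The variable names `oil_return` and `stock_return` are hypothetical, and the numbers are randomly generated, not real Exxon or WTI prices.

```r
# Simulated data only: these values are NOT real Exxon or WTI figures.
set.seed(42)
oil_return   <- rnorm(250)                     # hypothetical oil price changes
stock_return <- 0.5 * oil_return + rnorm(250)  # hypothetical stock price changes

# Single (simple) regression: one response, one predictor
fit <- lm(stock_return ~ oil_return)

# R-squared: share of the variance in stock_return explained by oil_return
r_squared <- summary(fit)$r.squared
print(r_squared)

# With a single predictor, R-squared equals the squared Pearson correlation
print(all.equal(r_squared, cor(stock_return, oil_return)^2))
```

An R-squared of 0.25 would mean that the predictor explains a quarter of the variance in the response, leaving the rest to other factors.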
As part of the Data Science tutorial series, my previous post covered basic data types in R. I have kept the tutorial very simple so that beginners in R programming can take off immediately.
Please find the online R editor at the end of the post so that you can execute the code on the page itself.
In this section we learn about control structures and loops used…
Added by dataperspective on May 18, 2016 at 8:30pm — No Comments
Contributed by the n…Continue
Added by NYC Data Science Academy on April 12, 2016 at 3:00pm — No Comments
Contributed by Bin Lin. He took the NYC Data Science Academy 12-week full-time Data Science Bootcamp program from Jan 11th to Apr 1st, 2016. The post was based on his…Continue
Added by NYC Data Science Academy on April 12, 2016 at 1:30pm — No Comments
Machine Learning? Data Mining?
Well, there is a small difference between machine learning and data mining, although in practice I don't see much difference between them.
See the Stack Exchange debate on the difference between machine learning and data mining.
In the end, it is about training the machine to…
Added by Gregory Choi on April 7, 2016 at 4:30pm — No Comments
[The goal of this page]
Whenever I read R introductions, the books were filled with nothing but instructions. The goal of R is to solve our real-life problems, which is why I want to keep this page short. In reality, though, we need to understand some key concepts that will help you tackle real-life problems. Here are the basic data structures and data manipulation methods.
Still, I believe the best way to learn the R programming language is to tackle the real-life…
Have you ever wondered how to segment your customers? Customer segmentation is a really useful technique for grouping similar customers together and understanding what works for each group. You can then tailor your offering and marketing messages to the specific segments. If you do it right, you should be able to see a healthy increase in sales. After all, companies like Amazon target their customers on an individual level, so you should at least be targeting them on a segment level.…Continue
Regression is the first technique you’ll learn in most analytics books. It is a very useful and simple form of supervised learning used to predict a quantitative response.
Originally published on Ideatory…
Added by Sudhanshu Ahuja on March 28, 2016 at 8:00pm — No Comments
Recently, I came across an interesting book on statistics that narrates the Ugly Duckling story and relates it to today's world of data, or rather big data analytics. The story originally comes from the famous storyteller Hans Christian Andersen.
The story goes like this...
The duckling was a big ugly grey bird, so ugly that even a dog would not bite him. The poor duckling…
Added by Manish Bhoge on January 31, 2016 at 12:00pm — No Comments
How do we compute basic statistics (Mean, Median, SD, Var, Cor, Cov) using the R language?
The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) Framework.
In statistics, Mean, Median, Standard Deviation, Variance, Correlation, or…Continue
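For orientation, the statistics named in the title each map onto a base R function. The sketch below uses two made-up vectors, `x` and `y`, which are illustrative only and do not come from the original post.

```r
# Illustrative vectors (made-up values, not from the post)
x <- c(2, 4, 4, 5, 7, 9)
y <- c(1, 3, 2, 5, 6, 8)

stats <- list(
  mean   = mean(x),    # arithmetic mean
  median = median(x),  # middle value of the sorted data
  sd     = sd(x),      # sample standard deviation (n - 1 denominator)
  var    = var(x),     # sample variance (n - 1 denominator)
  cor    = cor(x, y),  # Pearson correlation (the default method)
  cov    = cov(x, y)   # sample covariance
)
print(stats)
```

Note that `sd()` and `var()` in base R use the sample (n - 1) denominator, so `sd(x)^2` matches `var(x)`.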
Added by Kumar Chinnakali on January 30, 2016 at 2:30am — No Comments
ggplot2 is the most elegant and aesthetically pleasing graphics framework available in R. It has a nicely planned structure to it. This tutorial focusses on exposing this underlying structure, which you can use to make any ggplot. But the way you make plots in ggplot2 is very different from base graphics, which makes the learning curve steep. So leave what you know about base graphics behind and follow along. You are just 5…Continue
Many Machine Learning articles and papers describe the wonders of the Support Vector Machine (SVM) algorithm. Nevertheless, when using it on real data trying to obtain a high accuracy classification, I stumbled upon several issues.
I will try to describe the steps I took to make the algorithm work in practice.
This model was implemented…
Added by Renata Ghisloti Duarte Souza Gra on December 18, 2015 at 5:00pm — No Comments
Learning any new skill is hard. There are too many possibilities, and the goal seems massive and intimidating.
Enter the Pareto Principle.
The Pareto Principle, also known as the 80/20 rule, suggests that 80 percent of results come from 20 percent of efforts. It can be applied to everything from business to language, even learning how to use R.
With just a…Continue
Added by Divya Parmar on December 3, 2015 at 8:42am — No Comments
Building off my last post, I want to use the same healthcare data to demonstrate the use of R packages. Packages in R are stored in libraries and are often pre-installed, but reaching the next level of skill requires knowing when to use new packages and what they contain. With that, let’s get to our example.
Added by Divya Parmar on November 10, 2015 at 12:36pm — No Comments
Wouldn’t it be great if you knew exactly what the hiring manager will ask you at your next R and Data science interview?
Well, frankly, we can’t do just that, but we can give you the next best thing, which is a list of the 16 most commonly asked interview questions and the answers you should give.
We’ve gathered these questions from interviewers and people who have been on an R or Data Science interview.
Please note that…
Added by T Miterany on November 4, 2015 at 9:30pm — No Comments
The Internet Governance (IG) Barometer methodology presents a quantitative summary of the main developments in the Internet Governance arena based on computational text and data-mining approaches. The IG Barometer is based on the statistical modeling of large collections of textual documents: thus, it essentially presents an advanced discourse processing system. These collections - called text corpora - are obtained by querying various online media sources with…Continue
Added by Goran S. Milovanović on October 20, 2015 at 11:00pm — No Comments
I have written about R in the past, and it is one of the hottest tools for data analysis today. To further demonstrate the power of R, I found click-through rate data…Continue
DataJoy is an unbelievably fantastic way for a working data scientist to have their favorite tools at hand. I am a minimalist when it comes to being mobile, whether working on the road, traveling for leisure, or sometimes both. I do not like to keep files on my laptop and I do not, for the most part, like to worry about keeping applications updated on my laptop. I have tried as much as possible to push my life into the cloud. Yes, I travel with a Chromebook. Yes, I use…Continue
Added by Dr. William Tribbey on September 19, 2015 at 10:00am — No Comments
Statistical analysis and data mining were the top skills that got people hired in 2014 based on LinkedIn analysis of 330 million LinkedIn member profiles. We live in an increasingly data-driven world, and businesses are aggressively hiring experts in data storage, retrieval, and analysis. Across the globe, statistics and data analysis skills were highly valued. In the US, India, and France, those skills are in particularly high demand.