All Blog Posts Tagged 'R' (93)

What are the Big Guys Using?

Summary: The largest companies, which use the most data science resources, are moving rapidly toward more integrated advanced analytic platforms. The features they demand are evolving to promote speed, simplicity, quality, and manageability. This has some interesting implications for open-source R and Python, which are widely taught in schools but significantly less necessary on these more sophisticated platforms.

 …

Added by William Vorhies on December 20, 2016 at 8:38am — 6 Comments

Who Made the News? Text Analysis using R, in 7 steps

This post covers the following tasks in R:

  • cleaning the texts,
  • sorting and aggregating by publisher name,
  • creating word clouds and word…

Added by Ann Rajaram on November 26, 2016 at 3:30am — 5 Comments

R for SQListas (1): Welcome to the Tidyverse

R for SQListas, what's that about?

This is the 2-part blog version of a talk I gave at the DOAG Conference this week. I've also uploaded the slides (no ppt; just a pretty R presentation ;-) ) to the articles section, but if you'd like a little text, I encourage you to read on. That is, if you're in the target group for this…

Added by Sigrid Keydana on November 17, 2016 at 11:30pm — No Comments

R, Python or SAS: Which one should you learn first?

Python, R and SAS are the three most popular languages in data science. If you are new to the world of data science and aren’t experienced in any of these languages, it makes sense to be unsure whether to learn R, SAS or Python.

Don’t fret: by the time you’re done reading this article, you will know without a doubt which language is the right one for you.

Overview…

Added by Aatash Shah on November 1, 2016 at 9:30pm — 8 Comments

Don’t run afoul of Scoping Rules in R!

Whether you are a veteran programmer with experience dating back to Fortran or a new college grad versed in all the latest technologies, if you use R, you will eventually have to worry about scoping!

Sure, we all start out ignoring scoping when we first begin using a new language. So what if all your variables and functions are global - you are the only one using them, right?!?! Unless you give up on R, you will eventually grow beyond your own system - either having to share your code with…

Added by Connie Brett, Ph.D. on September 8, 2016 at 12:30pm — No Comments

[Data Mining] Association Rules in R (diapers and beer)

[Introduction to Association Rules]



Sometimes an anecdote helps you understand a new concept. But this story is real. About 15 years ago, a salesperson at Walmart tried to boost sales in his store. His idea was simple: he bundled products together and discounted the bundles. (This has since become common practice in marketing.) For example, he bundled bread with jam so that customers could easily find them together. Moreover,…

Added by Gregory Choi on August 22, 2016 at 7:30am — 5 Comments

Map the Life Expectancy in United States with data from Wikipedia with R

The original post was published at DataScience+

Recently, I became interested in grabbing data from web pages, such as Wikipedia, and visualizing it with R. As in my previous post, I use the rvest package to get the data from the web page and…

Added by Klodian on August 5, 2016 at 10:30pm — No Comments

Characteristics of Good Visual Analytics and Data Discovery Tools

Visual Analytics and Data Discovery allow analysis of big data sets to find insights and valuable information. This is much more than just classical Business Intelligence (BI). See this article for more details and motivation: "Using Visual Analytics to Make Better Decisions: the Death Pill Example". Let's take a look at important characteristics to choose the right tool for…

Added by Kai Waehner on July 27, 2016 at 10:00pm — No Comments

Expand Machine Learning Tools (Part2): Toree Scala and Python in Jupyter Notebook

Apache Spark, a data analytics favorite, is emerging as a reference standard for Big Data and as a “fast and general engine for large-scale data processing”. In our previous post, we detailed how to expand ML tools using a PySpark kernel and leverage the …

Added by Marc Borowczak on June 9, 2016 at 10:30am — No Comments

Picking an Analytic Platform



Summary:
Picking an analytic platform when first starting out in data science almost always means working with what we’re most comfortable with. But as organizations grow larger, there is a need for standardization and for selecting one or a few analytic tools.

 

Added by William Vorhies on May 31, 2016 at 7:00am — 1 Comment

San Francisco Police Department Crime Incidents: Part 1-Time Series Analysis

Introduction



The City and County of San Francisco launched an official open data portal called SF OpenData in 2009 as a product of its official open data program, DataSF. The portal contains hundreds of city datasets for use by developers, analysts, residents and more. Under the category of Public Safety, the portal contains the list of SFPD Incidents since Jan 1, 2003.

In this post I have done an exploratory time-series analysis on the crime incidents dataset to see…

Added by Vimal Natarajan on May 30, 2016 at 7:42am — No Comments

Multi-Regression in R (Exxon Mobil stock price ~ WTI, Gas, and S&P500)

[Previous Post]

Single regression on Exxon's stock



[Introduction of Multi-regression]



Let's recall our last exercise. We ran a single regression of Exxon Mobil's stock on the WTI crude oil spot price. The result was fantastic: the model accounts for 25% of the variation in the stock's movement. Put another way, that figure is the R-squared. The problem is, "are you happy with the…

Added by Gregory Choi on May 20, 2016 at 9:05am — 1 Comment

Control Structures Loops in R

As part of the Data Science tutorial series, my previous post covered basic data types in R. I have kept the tutorial very simple so that beginners in R programming can take off immediately.

Please find the online R editor at the end of the post so that you can execute the code on the page itself.

In this section we learn about control structures loops used…

Added by dataperspective on May 18, 2016 at 8:30pm — No Comments

Investigating Airport Connectedness

Contributed by the n…

Added by NYC Data Science Academy on April 12, 2016 at 3:00pm — No Comments

Food Price Flow Visualization with Shiny App

Contributed by Bin Lin. He took the NYC Data Science Academy 12-week full-time Data Science Bootcamp program from Jan 11th to Apr 1st, 2016. The post was based on his…

Added by NYC Data Science Academy on April 12, 2016 at 1:30pm — No Comments

Introduction to Machine Learning / Data Mining

Machine Learning? Data Mining?



Well, there is a slight difference between machine learning and data mining, although personally I don't see any real difference between them.

See the Stackexchange debate on the difference between machine learning and data mining.



In the end, it is about training the machine to…

Added by Gregory Choi on April 7, 2016 at 4:30pm — No Comments

R tutorial (R programming basic 101)

[The goal of this page]

Whenever I read R introductions, the books were filled with nothing but instructions. The goal of R is to solve our real-life problems. That's why I want to keep this page minimal. In reality, though, we need to understand some key concepts that will help you tackle real-life problems. Here are the basic data structures and data manipulation methods.



Still, I believe the best way to learn the R programming language is to tackle the real life…

Added by Gregory Choi on April 6, 2016 at 8:53am — 4 Comments

Find Marketing Clusters in 20 minutes in R

Have you ever wondered how to segment your customers? Customer segmentation is a really useful technique for grouping similar customers together and understanding what works for them. You can then tailor your offering and marketing messages to the specific segments. If you do it right, you should be able to see a healthy increase in sales. After all, companies like Amazon target their customers on an individual level, so you should at least be targeting them on a segment level.…

Added by Sudhanshu Ahuja on April 2, 2016 at 10:34pm — 6 Comments

How to forecast using Regression Analysis in R

Regression is the first technique you’ll learn in most analytics books. It is a very useful and simple form of supervised learning used to predict a quantitative response.



Originally published on Ideatory…

Added by Sudhanshu Ahuja on March 28, 2016 at 8:00pm — No Comments

Big Data Analytics: From Ugly Duckling to Beautiful Swan

Recently, I came across an interesting book on statistics that narrates the Ugly Duckling story and relates it to today's DATA, or rather BIG DATA ANALYTICS, world. The story is originally by the famous storyteller Hans Christian Andersen.

The story goes like this...

The duckling was a big ugly grey bird, so ugly that even a dog would not bite him. The poor duckling…

Added by Manish Bhoge on January 31, 2016 at 12:00pm — No Comments

© 2019 Data Science Central ®