Deep Learning is gaining more and more traction. It focuses on one area of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist.
Deep Learning is the modern buzzword for artificial neural networks, one of many concepts…
Added by Goran S. Milovanović on April 14, 2017 at 11:00pm — No Comments
Last Sunday at Trivadis Tech Event, I talked about R for Hackers. It was the first session slot on Sunday morning, it was a crazy, nerdy topic, and yet there were, like, 30 people attending! An emphatic thank you to everyone who came!
R, a crazy, nerdy topic - why is that, you'll be asking? What's so nerdy about using R?
Well, it was about R. But it was neither an introduction ("how to get things done quickly with R"), nor was it even about data science. True, you…
Added by Sigrid Keydana on March 24, 2017 at 2:30am — No Comments
Illegal, Unreported and Unregulated (IUU) fishing is becoming a major issue around the world. In general, IUU fishing is a broad term encapsulating many different scenarios (i.e. illegal: breaking laws; unreported: not reporting catch, which may not be illegal; unregulated: fishing in ways or places where there are no laws). For the purposes of this blog,…
Added by Grant Humphries on March 9, 2017 at 2:30am — No Comments
As we all know, CRISP-DM, which stands for Cross Industry Standard Process for Data Mining, is a process model that outlines the most common approach to tackling data-driven problems. Per the poll conducted by KDnuggets in 2014, this was and “is” one of the most popular and most widely used methodologies. This method of gleaning insights from data is very dear to industry experts and data miners.
As the title suggests, I will align some of the most useful R packages with this most popular and…
Added by John Mount on February 4, 2017 at 3:30pm — No Comments
R is a free programming language for data analysis, statistical modeling and visualization. It is one of the most popular tools in the predictive modeling world, and its popularity is growing day by day. In the 2016 data science salary survey conducted by O'Reilly, R was ranked second in the category of programming languages for data science (SQL ranked first). In another popular KDnuggets analytics software poll, R took the top rank with 49% of the vote. These polls answer the question about…
Added by Deepanshu Bhalla on January 1, 2017 at 9:30am — No Comments
Regressions are widely used to estimate relations between variables or predict future values for a certain dataset.
If you want to know how much of variable "x" interferes with…
Added by Renata Ghisloti Duarte Souza Gra on December 27, 2016 at 10:00am — No Comments
Summary: The largest companies utilizing the most data science resources are moving rapidly toward more integrated advanced analytic platforms. The features they are demanding are evolving to promote speed, simplicity, quality, and manageability. This has some interesting implications for open source R and Python, which are widely taught in schools but significantly less necessary with these more sophisticated platforms.
This post covers the following tasks using R programming:
This is the 2-part blog version of a talk I've given at DOAG Conference this week. I've also uploaded the slides (no ppt; just pretty R presentation ;-) ) to the articles section, but if you'd like a little text I'm encouraging you to read on. That is, if you're in the target group for this…
Added by Sigrid Keydana on November 17, 2016 at 11:30pm — No Comments
Python, R and SAS are the three most popular languages in data science. If you are new to the world of data science and aren’t experienced in any of these languages, it makes sense to be unsure of whether to learn R, SAS or Python.
Don’t fret: by the time you’re done reading this article, you will know without a doubt which language is the right one for you.
Whether you are a veteran programmer with experience dating back to Fortran, or a new college grad with all the latest technologies, if you use R, eventually you will have to worry about scoping!
Sure, we all start out ignoring scoping when we first begin using a new language. So what if all your variables and functions are global - you are the only one using them, right?!?! Unless you give up on R, you will eventually grow beyond your own system - either having to share your code with…
Added by Connie Brett, Ph.D. on September 8, 2016 at 12:30pm — No Comments
[Introduction of Association Rules]
Sometimes an anecdotal story helps you understand a new concept. But this story is real. About 15 years ago, at Walmart, a sales guy made an effort to boost sales in his store. His idea was simple: he bundled products together and applied discounts to the bundles. (This has since become common practice in marketing.) For example, he bundled bread with jam so that customers could easily find them together. Moreover,…
Original post is published at DataScience+
Recently, I became interested in grabbing data from webpages, such as Wikipedia, and visualizing it with R. As I did in my previous post, I use the rvest package to get the data from a webpage and…
Added by Klodian on August 5, 2016 at 10:30pm — No Comments
Visual Analytics and Data Discovery allow analysis of big data sets to find insights and valuable information. This is much more than just classical Business Intelligence (BI). See this article for more details and motivation: "Using Visual Analytics to Make Better Decisions: the Death Pill Example". Let's take a look at important characteristics to choose the right tool for…
Added by Kai Waehner on July 27, 2016 at 10:00pm — No Comments
Apache Spark, a data analytics favorite, is becoming a reference standard for Big Data and a “fast and general engine for large-scale data processing”. In our previous post, we detailed how to expand ML tools using a PySpark kernel and leverage the …
Added by Marc Borowczak on June 9, 2016 at 10:30am — No Comments
Summary: Picking an analytic platform when first starting out in data science almost always means working with what we’re most comfortable with. But as organizations grow larger, there is a need for standardization and for selecting one or a few analytic tools.
The City and County of San Francisco launched an official open data portal called SF OpenData in 2009 as a product of its official open data program, DataSF. The portal contains hundreds of city datasets for use by developers, analysts, residents and more. Under the category of Public Safety, the portal contains the list of SFPD Incidents since Jan 1, 2003.
In this post I have done an exploratory time-series analysis on the crime incidents dataset to see…
Added by Vimal Natarajan on May 30, 2016 at 7:42am — No Comments
Single regression on Exxon's stock
[Introduction of Multi-regression]
Let's recall our last job. We ran a single regression on Exxon Mobil's stock against the WTI crude oil spot price. The result was fantastic: the model accounts for 25% of the variation in stock movement. Put another way, the R-squared is 25%. The problem is "are you happy with the…