Social media platforms such as Twitter and Facebook enable everyone to voice their opinions about topics, companies, and products online.
These comments are a great source for companies to analyze their customers’ opinions about their brand or product. However, with billions of Tweets and posts daily, this can take a lot of time.
Unless, of course, you use R. With just a few lines of R code and the help of machine learning, we’re able to build mood monitoring tools quickly,…Continue
Added by Daniel Schmeh on September 7, 2017 at 9:30am — No Comments
Principal Component Analysis (PCA) is a technique used to find the core components that underlie different variables. It is very useful whenever doubts arise about the true origin of three or more variables. There are two main methods for performing a PCA: naive or less naive. In the naive method, you first check some conditions in your data which will determine the essentials of the analysis. In the less-naive method, you set those yourself,…Continue
Added by Pablo Bernabeu on September 6, 2017 at 1:30pm — No Comments
PostgreSQL is a commonly used database system for storing and managing large amounts of data effectively.
Here, you will see how to:
1) create a PostgreSQL database using the Linux terminal
2) connect the PostgreSQL database to R using the “RPostgreSQL” library
In this example, we are going to create a simple database containing a table of dates, cities, and average temperature in degrees (Celsius).
We will name…Continue
Added by Michael Grogan on August 7, 2017 at 7:30am — No Comments
DVC is an open source tool that could help with achieving code simplicity, readability and faster model development. The idea is to track file and data dependencies during model development in order to facilitate reproducibility and track data file versions. However, DVC is a language-agnostic tool and can be used with any programming language. Here we will describe how we can…Continue
Added by Marija Zoldin on August 4, 2017 at 12:30am — No Comments
Added by Sandipan Dey on July 31, 2017 at 4:00am — No Comments
First of all, we will see what R clustering is, then the applications of clustering, clustering by similarity aggregation, use of the R amap package, implementation of hierarchical clustering in R, and examples of R clustering in various fields.
2. Introduction to Clustering in…Continue
Added by Sheetal Sharma on July 19, 2017 at 9:00pm — No Comments
Visualization apps may be privately consulted as well as published online. There are two main platforms: R Shiny and Tableau. Shiny has a free starter license…Continue
Added by Pablo Bernabeu on July 15, 2017 at 4:00am — No Comments
The R language is the world's most widely used programming language for statistical analysis, predictive modeling and data science. Its popularity is claimed in many recent surveys and studies. The R programming language is getting more powerful day by day as the number of supported packages grows. Some big IT companies such as Microsoft and IBM have also started developing packages for R and offering enterprise versions of R.
Added by Deepanshu Bhalla on June 12, 2017 at 12:30am — No Comments
This article explains how to select important variables using the Boruta package in R. Variable selection is an important step in a predictive modeling project. It is also called 'feature selection'. Every private and public agency has started tracking data and collecting information on various attributes. This results in access to too many predictors for a predictive model. But not every variable is important for predicting a particular outcome. Hence it is essential to…Continue
About two months ago we launched a new SaaS product, the Keyword Hero. It’s the only solution to “decrypt” the organic keywords in Google Analytics that users searched for in order to get to one’s website. We do so by buying lots of data from sources such as plugins and matching the data with our customers’ sessions in Google Analytics (side note: the entire algorithm was coded in R before we refactored it in Python to allow scalability and operability with AWS).
Added by Daniel Schmeh on May 29, 2017 at 10:00am — No Comments
Summary: Someone had to say it. In my opinion, R is not the best way to learn data science and not the best way to practice it either. More and more large employers agree.
This is a tutorial to show how to implement dashboards in R, using the new "flexdashboard" package.
This new library leverages these libraries and allows us to create some stunning dashboards, using interactive graphs and text. What I loved the most was the “storyboard” feature that allows me to present content in Tableau-style frames. Please note that for this you need to create RMarkdown (.Rmd) files and insert the code using the…Continue
Deep Learning is gaining more and more traction. It basically focuses on one section of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist.
Deep Learning is the modern buzzword for artificial neural networks, one of many concepts…Continue
Added by Goran S. Milovanović on April 14, 2017 at 11:00pm — No Comments
Last Sunday at Trivadis Tech Event, I talked about R for Hackers. It was the first session slot on Sunday morning, it was a crazy, nerdy topic, and yet there were, like, 30 people attending! An emphatic thank you to everyone who came!
R, a crazy, nerdy topic - why that, you'll be asking? What's so nerdy about using R?
Well, it was about R. But it was neither an introduction ("how to get things done quickly with R"), nor was it even about data science. True, you…
Added by Sigrid Keydana on March 24, 2017 at 2:30am — No Comments
Illegal, Unreported and Unregulated (IUU) fishing is becoming a major issue around the world. In general, IUU fishing is a broad term encapsulating many different scenarios (i.e. illegal: breaking laws; unreported: not reporting catch, which may not be illegal; unregulated: fishing in ways or places where there are no laws). For the purposes of this blog,…Continue
Added by Grant Humphries on March 9, 2017 at 2:30am — No Comments
As we all know, CRISP-DM, which stands for Cross Industry Standard Process for Data Mining, is a process model that outlines the most common approach to tackling data-driven problems. Per the poll conducted by KDnuggets in 2014, this was and “is” one of the most popular and widely used methodologies. This method of gleaning insights from data is very dear to industry experts and data miners.
As the title suggests, I will align some of the most useful R packages with this most popular and…Continue
Added by John Mount on February 4, 2017 at 3:30pm — No Comments
R is a free programming language for data analysis, statistical modeling and visualization. It is one of the most popular tools in the predictive modeling world. Its popularity is growing day by day. In the 2016 data science salary survey conducted by O'Reilly, R was ranked second in the category of programming languages for data science (SQL ranked first). In another popular KDnuggets analytics software poll, R took the top rank with 49% of the vote. These polls answer the question about…Continue
Added by Deepanshu Bhalla on January 1, 2017 at 9:30am — No Comments
Regressions are widely used to estimate relations between variables or predict future values for a certain dataset.
If you want to know how much of variable "x" interferes with…Continue
Added by Renata Ghisloti Duarte Souza Gra on December 27, 2016 at 10:00am — No Comments