What is R? R Explained in less than Two Minutes, to Absolutely Anyone

If you're looking at ways you can harness the power of Big Data analytics in your business, but are not necessarily a techie person yourself, it can be a confusing field at first.

For this reason I'm publishing a series of short posts explaining some of the key concepts and technologies behind Big Data and data analytics, aimed at an audience which is not primarily composed of IT specialists or data scientists.

I firmly believe that any business can benefit from the new wave of analytics applications and services which can crunch through as much data as you can throw at them, in order to come out with surprising and valuable insights to drive growth.

These projects usually require a mix of skills, and communication between people with different skillsets (e.g. data science and marketing) is essential. So in this post I'll give an overview of R, the programming language favored by many statisticians.

R is a computer programming language which is particularly well suited to handling and sorting the large datasets associated with Big Data projects.

Like Python, which I covered previously, the software environment used to create code in R is open source, meaning it is free to download, anyone can use it, and there is a plethora of guidance and advice available on how to use it most effectively. However, commercial distributions are also available, which often offer additional proprietary functionality or support packages.

Named after the first initials of the two men who first developed the language at the University of Auckland, Robert Gentleman and Ross Ihaka, R has become very popular in recent years and is continuing to become more so, due to the explosion in analytic activities being carried out by businesses.

R's strengths as a statistical programming language stem from the fact that it was designed from the ground up to facilitate matrix arithmetic: carrying out complex, often automated calculations on data which is held in a grid of rows and columns. R is very good for creating programs which can carry out calculations on these datasets, even when the datasets are growing at an ever-increasing rate, and which can produce real-time visualizations based on this data.
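To make "calculations on a grid of rows and columns" a little more concrete, here is a minimal sketch. It is written in plain Python purely for illustration; R has operations of this kind built in (for example `colMeans` and the `%*%` matrix product). The column names, the numbers and the weights are all made up for the example.

```python
# A small grid of data: each row is one customer, each column one measurement.
# The column names and all the numbers are invented for illustration only.
columns = ["visits", "orders", "spend"]
data = [
    [10, 2, 40.0],
    [4, 1, 15.0],
    [7, 3, 65.0],
]


def column_means(rows):
    """Average each column of a grid held as a list of rows."""
    n = len(rows)
    # zip(*rows) turns the rows into columns so each can be summed.
    return [sum(col) / n for col in zip(*rows)]


def weighted_scores(rows, weights):
    """Matrix-vector product: one weighted total per row."""
    return [sum(value * w for value, w in zip(row, weights)) for row in rows]


means = column_means(data)
# Hypothetical weights: how much each measurement counts towards a score.
scores = weighted_scores(data, [1.0, 5.0, 0.5])

for name, mean in zip(columns, means):
    print(f"{name}: {mean:.2f}")
print("scores per customer:", scores)
```

The point is that a single operation is applied to a whole column or a whole grid at once, rather than to one number at a time, which is the style of work a language designed around matrix arithmetic makes easy.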

The ability to produce these visualizations is another core strength of R. Its designers realized that visualization was key to understanding the complex datasets being explored, so they built the ability to translate data into charts, graphs and complex multi-dimensional matrices, as well as many user-defined methods of visualization, into its core.

Online, R is more widespread than you might expect, although you won't see it, as the analysis it powers sits behind pretty graphical interfaces. Organizations such as Google, Facebook and Twitter are reported to use R for analyzing their data, even though the services themselves are largely built with other languages. In fact, R is often cited as one of the most widely used programming languages for data science. APIs exist for almost all of these services, allowing applications written in R to access data from these outside sources and include it in their own analytics routines.

Thanks to this huge user base, just about every function that you might need for data analysis is available, often through open source extensions (known as packages) made available by the community. R can also call code written in other languages such as C++ or Java, so resources coded in those languages can be reused. Because the R environment is available for all major operating systems, R code can easily be ported between Unix, Windows and Mac environments.

Python is probably R's biggest rival, but as both are non-commercial entities (as are most languages, computer or otherwise!) it's not necessarily a rivalry in the traditional sense. However, coders will often argue vociferously for their favorite of the two. Python, having more in common with traditional, longer established programming languages, is often cited as being easier to learn, particularly for someone with prior experience of other high-level programming languages. The R environment, on the other hand, is likely to be more familiar to someone with an academic background in statistics. It's worth noting that Python tends to have a wider range of uses outside the world of statistics and analytics, whereas R is used almost exclusively for those purposes.

With a reported two million users worldwide, and thousands of deployed applications created using it, R is undoubtedly one of the backbone technologies of the Big Data revolution. If you are thinking of getting involved with the techie end of data analysis, then a thorough grounding in the language should be considered an essential element of your toolbox. If you want to learn more, or have a go at creating your own code in R to see what it can do, there are plenty of great resources online, such as those at Coursera, Code School and RStudio.

Comment by Vijay Singh on April 28, 2019 at 9:06pm

Nice article!

I found something which I was browsing the web to learn about coding algorithms, It was very difficult to visit different sites. I found one site name https://hackr.io 

I found great stuff for the best programming sites were all located on a single page. I thought to share with you all

I hope it will helps you
This might be useful to your readers: https://hackr.io/tutorials/learn-r

Comment by Mahesh Yadav on February 12, 2017 at 3:01am

Nice article. Did not know Google and Facebook runs R when online users make some request.

Comment by Misty Manson on June 28, 2016 at 8:44am

Right On! Delete Fourth Eye.

Comment by Miroljub Zivkovic on December 26, 2015 at 1:57pm

Great Article!

Comment by Mukesh N. Shende on November 24, 2015 at 7:57pm

Graph is amazing!!! Thanks.

Comment by David Dávila on November 17, 2015 at 11:10am

That graph is amazing, excelent info!

Comment by Sione Palu on November 16, 2015 at 12:25pm

R language was created here at University of Auckland, New Zealand by Ross Ihaka and Robert Gentleman. Now it has gone global.

Comment by Parker LAU on November 14, 2015 at 5:42am

How could Python and R be rival? They served completely different purpose. Thanks for the graph, it is very informatic.

Comment by Justin Veenstra on November 13, 2015 at 1:58pm

Of course, R is also named R because it was based on S (really a GNU replacement.) I guess it might have been named T if they had had different first initials...
