R is now used by over 50% of data miners. R, Python, and SQL were the most popular programming languages. Python, Lisp/Clojure, and Unix tools showed the highest growth in 2012, while Java and MATLAB slightly declined in popularity.
Sometimes the high-level visual GUI of your favorite data mining tool is not enough, and you need to code an algorithm or, more frequently, a data wrangling or cleaning process.
The latest KDnuggets Poll asked: "What programming/statistics languages you used for analytics / data mining in the past 12 months?"
On average, KDnuggets readers used 2.5 languages. R, Python, and SQL were the most popular, and the highest growth was in Lisp/Clojure(*), Python, and Unix tools. R is now used by over 50% of data miners. However, Hadoop-based languages were used by only about 7% of voters.
Compared with the 2011 KDnuggets Poll: What languages you used for data mining / data..., the languages with the highest growth were:
- Lisp/Clojure, 525% increase, to 4.4% in 2012 from 0.7% in 2011 (*) (the 2011 figure is for Lisp only, so the results are not fully comparable)
- Python, 49% increase, to 36.5%, from 24.6%
- Unix shell/awk/sed, 44% increase, to 14.5%, from 10.4%
- R, 16% increase, to 52.5%, from 45.1%.
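The growth figures above are relative increases: the change in share divided by the 2011 share. As a rough sketch, the arithmetic can be reproduced in Python from the rounded shares quoted in the list. The function name `relative_increase` and the `shares` dictionary are illustrative names, not part of the poll. Because the inputs are rounded, the output may not match the published growth percentages exactly.

```python
def relative_increase(old_share, new_share):
    """Relative change, in percent, between two usage shares (both in percent)."""
    return (new_share - old_share) / old_share * 100.0


# Shares of voters (percent) as quoted in the list above: (2011, 2012).
# These are rounded values, so the computed growth is only approximate.
shares = {
    "Lisp/Clojure": (0.7, 4.4),
    "Python": (24.6, 36.5),
    "Unix shell/awk/sed": (10.4, 14.5),
    "R": (45.1, 52.5),
}

growth = {lang: relative_increase(old, new) for lang, (old, new) in shares.items()}

# Print languages from highest to lowest relative growth.
for lang, pct in sorted(growth.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: {pct:.0f}% increase")
```

Note that a language can show large relative growth from a small base (as Lisp/Clojure does here) while still having a small absolute share.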
The languages with a declining number of users were Java (down 12%) and MATLAB (down 10%).
The most popular language used along with R was Python (and vice versa).
Here are the results: http://www.kdnuggets.com/2012/08/poll-analytics-data-mining-program...
Related article: Rexer Analytics' 5th Annual Data Miner Survey Summary