*Originally posted by Vaishnavi Agrawal.*

Did you know that Python’s usage in data science applications rose 51% in 2015? Did you know that YouTube is built largely on Python, with over a million lines of code? Some commentators predict that Python may soon overtake R and become the most popular language in the data science industry. R is a language dedicated to statistics and data science, whereas Python is a general-purpose programming language. In spite of this, Python is widely chosen for data science work. There was considerable discussion in tech forums when Google released its deep learning framework, TensorFlow, with Python as the primary interface. Many hailed Python as the future of high-end computing. Facebook developers also make heavy use of the language in their production environment.

**Reasons for choosing Python in data science**

- Advantage of scalability – Python is highly scalable and is faster than tools such as Stata and MATLAB. The flexibility it allows in how code is designed is the reason Python scales better than R. It is also the language most commonly used for quick application development.
- Powerful packages – A data scientist can choose from many packages when developing applications. SciPy is used for scientific computing, NumPy for numerical computing, and Pandas for data manipulation. StatsModels and scikit-learn are also widely used in data science. These packages are updated regularly, so a problem in any area is likely to be resolved quickly.
- Easy to learn – The main reason for Python's popularity is that it is easy to learn and easy to develop in. Compared to R, its syntax is very easy to follow. It is one of very few languages that are both simple and efficient.
- Visualization and graphics – Pandas plotting, Seaborn, and ggplot are all built on Matplotlib, which acts as a common foundation for them. ggplot is based on the ‘Grammar of Graphics’ and is an important data visualization library. Seaborn is used to build informative statistical graphics, and Altair is based on the Vega-Lite visualization grammar. Bokeh provides interactive graphics, and Pygal builds charts as SVG.
- Huge online community – Because it is a general-purpose programming language, Python has a very large online community that provides strong support for the development of the language. Developers can find highly relevant solutions to their coding problems from this engaged community. Stack Overflow and Codementor are forums where coders can get such help.
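
As a small illustration of the readability point above, here is a minimal sketch that groups values and averages them using only Python's standard library. The sample records and the `average_by_group` helper are hypothetical, invented for this example. In practice a data scientist would more likely reach for Pandas for this kind of task.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (department, salary) pairs invented for illustration.
records = [
    ("analytics", 72000),
    ("analytics", 68000),
    ("engineering", 91000),
    ("engineering", 87000),
    ("engineering", 95000),
]


def average_by_group(rows):
    """Return the mean value for each group label."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {label: mean(values) for label, values in groups.items()}


averages = average_by_group(records)

# Print one line per group, in alphabetical order of the label.
for label, avg in sorted(averages.items()):
    print(f"{label}: {avg:.0f}")
```

The whole task fits in a short function and a dictionary comprehension, which is the kind of compact, readable code the list above refers to.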

**Will Python remain relevant in the future?**

Some say that Python is not suitable for powering large-scale core infrastructure. Others complain that its code documentation is not up to the mark. In spite of these issues, it is one of the ten most widely used programming languages, because its benefits far outweigh its shortcomings. Bank of America is successfully using the language to build interfaces, crunch data, and develop new products. Data scientists who have to deploy their applications on the web prefer Python. Though R is the most used data science language, Python is quickly catching up and may well overtake it in the future.

© 2019 Data Science Central
