Emerging applications like machine learning (ML), big data analytics, and artificial intelligence (AI) have created the need for many companies to hire a highly skilled and experienced workforce. Demand for data scientists, ML engineers, and data engineers is booming and is expected to keep increasing in the coming years. The January report from Indeed, one of the top job sites, showed a 29% increase in demand for data scientists year over year and a 344% increase since 2013.

**Salaries and OpEx**

At the same time, the personnel cost of these skilled employees is increasing rapidly. According to Indeed, the average salary for a machine learning engineer is about $145,000 per year. According to Glassdoor, data scientist, with a median salary of $110,000, is now the hottest job in America [1].

As the demand for data scientists and machine learning engineers grows, you can expect these numbers to rise as well, posing a significant challenge for many companies. It is therefore very important for companies to give these engineers the best available tools to help them be more efficient.

**ML tasks**

Training machine learning models is one of the typical tasks of ML engineers, and it is one that consumes a significant amount of their time. In a typical machine learning application, practitioners must apply the appropriate data pre-processing, feature engineering, feature extraction, and feature selection methods to make the dataset amenable to machine learning. Following those preprocessing steps, practitioners must then perform algorithm selection and hyperparameter optimization to maximize the predictive performance of their final machine learning model.

In many cases, automated ML (AutoML) tools can be used to automate the end-to-end process of applying machine learning to real-world problems, including the automatic search for the best hyperparameters. AutoML, although very useful, often takes many hours to complete.

Because training time largely determines how quickly engineers can find the optimum solution, it is crucial to run these tasks as fast as possible so that these highly skilled engineers can be efficient and productive.

**The rise of specialized accelerators**

Typical processors provide high flexibility but lack performance. According to David Patterson, domain-specific accelerators, like FPGAs, are the only path left to keep increasing the performance of computing systems for applications like machine learning.

Specialized accelerators like FPGAs can provide up to 20x speedup compared to typical processors while being more energy-efficient and cost-efficient than GPUs and CPUs. That is why cloud providers like AWS, Alibaba, and Huawei have started deploying FPGAs in their data centers and making them available to the public.

**Use case: Training on logistic regression — 15x more models**

In this use case, we show how ML engineers can be more efficient and productive on a typical real-world example using FPGA-based accelerators in the cloud. Training logistic regression on the large MNIST dataset (libsvm format) takes around 18.7 minutes on a typical 16-core AWS instance that costs $1.15/hour. By contrast, the ML engineer can train the same model in just 1.2 minutes (**15x faster**) using an FPGA-accelerated instance (f1.4x) that costs $3.3/hour. That means the ML engineer can train 15x more models in the same amount of time without any changes to their code.
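The benchmark figures above can be turned into a quick back-of-the-envelope check. The sketch below is only an illustration: it uses the two training times and hourly prices quoted in this use case, and the 8-hour day of back-to-back training runs is a hypothetical assumption added here to show what "15x more models" looks like in practice. The FPGA price is the instance price only.

```python
# Training times quoted in the use case, expressed in whole seconds.
CPU_SECONDS = 1122    # 18.7 minutes on the 16-core CPU instance
FPGA_SECONDS = 72     # 1.2 minutes on the f1.4x instance

# Hourly instance prices quoted in the use case (USD).
CPU_PRICE_PER_HOUR = 1.15
FPGA_PRICE_PER_HOUR = 3.3   # instance price only

# Hypothetical assumption: one 8-hour day spent on back-to-back training runs.
WORKDAY_SECONDS = 8 * 3600


def cost_per_run(seconds, price_per_hour):
    """Instance cost (USD) of a single training run."""
    return seconds / 3600 * price_per_hour


speedup = CPU_SECONDS / FPGA_SECONDS
cpu_cost = cost_per_run(CPU_SECONDS, CPU_PRICE_PER_HOUR)
fpga_cost = cost_per_run(FPGA_SECONDS, FPGA_PRICE_PER_HOUR)

# Number of complete training runs that fit in the hypothetical workday.
cpu_runs = WORKDAY_SECONDS // CPU_SECONDS
fpga_runs = WORKDAY_SECONDS // FPGA_SECONDS

print(f"speedup: {speedup:.1f}x")
print(f"cost per run: CPU ${cpu_cost:.3f}, FPGA ${fpga_cost:.3f}")
print(f"runs per 8-hour day: CPU {cpu_runs}, FPGA {fpga_runs}")
```

Under these assumptions, a single engineer goes from a couple of dozen training runs per day to several hundred, and each individual run also costs less in instance time despite the higher hourly price.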

**Cost saving**

The use of FPGA-based accelerators also has several benefits in terms of cost. On a yearly basis, using 10 servers (r5d.4x: $1.15/hour) for training costs around $24k (assuming 8 hours per day for 262 working days). Using the f1 instances, the cost drops to $8.9k ($3.3/hour for the f1 instances and $3/hour for the InAccel accelerated ML suite). That means more than 2.5x cost savings.
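The article does not spell out how the yearly figures are computed. One reading that roughly reproduces them, sketched below, is that the same training workload is moved to f1 instances and, being about 15x faster, is billed for about 1/15 of the instance-hours at the combined hourly rate. That interpretation is an assumption, not something the article states; all prices and the 8 hours × 262 days schedule are taken from the paragraph above.

```python
# Schedule and prices quoted in the article.
HOURS_PER_DAY = 8
WORKING_DAYS = 262
CPU_SERVERS = 10
CPU_PRICE_PER_HOUR = 1.15      # r5d.4x
FPGA_PRICE_PER_HOUR = 3.3      # f1 instance
SUITE_PRICE_PER_HOUR = 3.0     # InAccel accelerated ML suite

# Assumption: the whole workload speeds up like the benchmark (about 15x),
# so the accelerated setup needs 1/15 of the instance-hours.
SPEEDUP = 15.0

cpu_hours = CPU_SERVERS * HOURS_PER_DAY * WORKING_DAYS
cpu_annual = cpu_hours * CPU_PRICE_PER_HOUR

fpga_hours = cpu_hours / SPEEDUP
fpga_annual = fpga_hours * (FPGA_PRICE_PER_HOUR + SUITE_PRICE_PER_HOUR)

savings_factor = cpu_annual / fpga_annual

print(f"CPU:  {cpu_hours} instance-hours, ${cpu_annual:,.0f} per year")
print(f"FPGA: {fpga_hours:,.0f} instance-hours, ${fpga_annual:,.0f} per year")
print(f"cost reduction: {savings_factor:.2f}x")
```

With these inputs the CPU setup comes to about $24.1k per year and the accelerated setup to about $8.8k, close to the $24k and $8.9k quoted above and consistent with the "more than 2.5x" savings.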

At the same time, ML engineers can be 15x more productive at training, as they can test 15x more models in the same amount of time. Assuming a salary of $145k per year and that training takes 33% of an engineer's time, a rough estimate (15 × 0.33 ≈ 5) puts the overall productivity gain at about 5x. A group of 5 ML engineers costs the company $725k per year. By using the hardware accelerators, the ML team can be more productive, and by this estimate the company can save around $580k per year in salaries.
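The salary figures follow from simple multiplication, and the sketch below reproduces them. It mirrors the article's own back-of-the-envelope model (training speedup times the share of time spent on training) rather than a rigorous productivity model, and the variable names are illustrative.

```python
# Inputs quoted in the article.
SALARY_PER_ENGINEER = 145_000   # USD per year
TEAM_SIZE = 5
TRAINING_SHARE = 0.33           # fraction of an engineer's time spent on training
TRAINING_SPEEDUP = 15

team_cost = SALARY_PER_ENGINEER * TEAM_SIZE

# The article's rough estimate: 15 x 0.33 = 4.95, rounded to 5x.
productivity_gain = round(TRAINING_SPEEDUP * TRAINING_SHARE)

# If the team is 5x as productive, the same output needs 1/5 of the salary
# budget, so the remaining 4/5 is counted as savings.
salary_savings = team_cost * (1 - 1 / productivity_gain)

print(f"team cost: ${team_cost:,}")
print(f"estimated productivity gain: {productivity_gain}x")
print(f"estimated salary savings: ${salary_savings:,.0f} per year")
```

The result depends heavily on the 33% training share and on treating the speedup as a direct multiplier, so it is best read as an upper-end estimate.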


© 2020 Data Science Central ® Powered by
