
Speed up your Machine Learning applications without changing your code

Emerging cloud applications like machine learning, AI and big data analytics require high-performance computing systems that can sustain the increased amount of data processing without consuming excessive power. To this end, many cloud operators have started adopting heterogeneous infrastructures, deploying hardware accelerators such as FPGAs to increase the performance of computationally intensive tasks. However, most hardware accelerators are hard to program, as they rely on less widely used languages and tools like OpenCL, VHDL and HLS.

According to a 2016 survey from Databricks, 91% of data scientists care most about the performance of their applications and 76% care about ease of programming. Therefore, the most efficient way for data scientists to use hardware accelerators like FPGAs to speed up their applications is through a library of IP cores that accelerate the most computationally intensive part of the algorithm. Ideally, what most data scientists want is better performance, lower TCO and no need to change their code.

Accelerate your ML applications without changing your code

InAccel, a world leader in application acceleration, has released a new version of its Accelerated ML suite, which allows data scientists to speed up their machine learning applications without changing their code. The suite is available on AWS and accelerates Apache Spark MLlib applications in the cloud with zero code changes. The platform is fully scalable and supports the main new features of Apache Spark, such as pipelines and DataFrames. For data scientists who prefer to work with general-purpose programming languages like C/C++, Java, Python and Scala, InAccel offers the required APIs on AWS, making the use of FPGAs in the cloud as simple as calling a function.

Currently, InAccel offers two widely used algorithms for machine learning training: Logistic Regression BGD and K-means clustering. Both algorithms were evaluated using the MNIST dataset (24 GBytes). The performance evaluation over 100 iterations showed that the InAccel Accelerated ML suite for Apache Spark can achieve over 3x speedup for the machine learning stage and up to 2.5x overall (including Spark initialization, data extraction, etc.). The accelerators can be used both in the cloud (AWS or Nimbix) and on-premises with the new Alveo FPGA card from Xilinx.
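To make it concrete why logistic regression training is a good candidate for offloading, here is a minimal pure-Python sketch of batch gradient descent (BGD). It is not InAccel's implementation or Spark MLlib's API; the function names, the learning rate and the toy data are assumptions chosen for illustration. The inner loop, which touches every sample on every iteration, is the kind of computationally intensive kernel an accelerator would target.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_bgd(features, labels, iterations=100, learning_rate=0.1):
    """Binary logistic regression trained with batch gradient descent.

    Hypothetical helper for illustration; hyperparameters are assumptions.
    """
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        # Hot loop: one full pass over the dataset per iteration.
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Update with the gradient averaged over the whole batch.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    # Classify as 1 when the predicted probability is at least 0.5.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


if __name__ == "__main__":
    # Toy, linearly separable 1-D data (illustrative only).
    xs = [[-2.0], [-1.0], [1.0], [2.0]]
    ys = [0, 0, 1, 1]
    w, b = train_logistic_bgd(xs, ys)
    print([predict(w, b, x) for x in xs])
```

The cost of each iteration grows with the number of samples times the number of features, so on a multi-gigabyte dataset the gradient pass dominates the training time.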

While data loading and data extraction run much faster on r5.x12, thanks to its 48 cores, the machine learning stage is the most computationally intensive part, and there the FPGA-accelerated cores can achieve up to 3x speedup compared to the multi-core instance.
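The gap between the 3x speedup on the machine learning stage and the 2.5x overall figure can be reasoned about with Amdahl's law. The sketch below is a back-of-the-envelope calculation, not a measurement from the article: it treats the reported numbers as exact and assumes the non-accelerated stages take the same time on both platforms, which is a simplification given that data loading is reported to be faster on the multi-core instance. The function names are hypothetical.

```python
def overall_speedup(accel_fraction, accel_speedup):
    """Amdahl's law: overall speedup when only a fraction of the
    baseline runtime is accelerated by a given factor."""
    return 1.0 / ((1.0 - accel_fraction) + accel_fraction / accel_speedup)


def implied_accel_fraction(overall, accel_speedup):
    """Invert Amdahl's law: fraction of baseline runtime that must have
    been accelerated to observe the given overall speedup."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / accel_speedup)


if __name__ == "__main__":
    # Figures quoted in the article, treated as exact (an assumption).
    fraction = implied_accel_fraction(overall=2.5, accel_speedup=3.0)
    print(f"implied ML share of baseline runtime: {fraction:.2f}")
    print(f"check: overall speedup = {overall_speedup(fraction, 3.0):.2f}")
```

Under these assumptions, machine learning would account for roughly 90% of the baseline runtime, which is consistent with the article's description of it as the most computationally intensive part.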


Tags: cloud, computing, learning, machine, ml, spark



Comment by Chris Kachris on February 24, 2019 at 10:27pm

The dataset is the larger MNIST 8M dataset in SVM format (8 million samples, 24 GB).

You can find the dataset at the following link:

Comment by Thomas Loock on February 24, 2019 at 8:52am

The MNIST dataset is less than 20 Megabytes, not 24 GBytes.

Which dataset are you talking about?

