Big data and AI are joined at the hip: the best AI applications require massive amounts of constantly updated training data to build state-of-the-art models. A growing number of Spark users want to integrate Spark with distributed machine learning frameworks built for state-of-the-art training.
Here’s the problem: big data frameworks like Spark and distributed deep learning frameworks don’t play well together, because big data jobs and deep learning jobs are executed in very different ways.
In this latest Data Science Central webinar, we’ll share how Project Hydrogen, a Spark Project Improvement Proposal led by Databricks, is positioned as a potential solution to this dilemma.
We will cover:
Barrier execution mode for distributed DL training
Fast data exchange between Spark and DL frameworks, and
Xiangrui Meng, Software Engineer – Databricks
Bill Vorhies, Editorial Director – Data Science Central