
Apache Spark on Azure HDInsight - My Talk @ Tampa Bay Data Science Group

Last week Microsoft announced that Apache Spark on Azure HDInsight (Microsoft’s managed Hadoop and Spark cloud service) is now generally available. Last night I spoke to the Tampa Bay Data Science Group about Apache Spark on Azure HDInsight and the associated offerings.

Spark for Azure HDInsight offers customers an enterprise-ready Spark solution that is fully managed, secured, and highly available, with compelling, interactive experiences that make it simpler to use.


The slides from my presentation, along with references to the codebase and related links, are available below.

Spark with Azure HDInsight - Tampa Bay Data Science

Apache Spark is an open-source processing framework for running large-scale data analytics applications. Built on an in-memory compute engine, Spark enables high-performance querying on big data. It uses a parallel data processing framework that persists data in memory and spills to disk if needed. This lets Spark deliver speeds of up to 100x those of disk-based processing for some workloads, and gives it a common execution model for tasks such as extract, transform, load (ETL), batch processing, interactive queries, and others on data in a Hadoop Distributed File System (HDFS). Azure makes Apache Spark easy and cost-effective to deploy, with no hardware to buy, no software to configure, a full notebook experience for authoring compelling narratives, and integration with partner business intelligence tools.
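To make the in-memory point concrete, here is a minimal sketch in plain Python (standard library only) of why persisting an intermediate result helps when several queries reuse it. This is not Spark code and does not use the Spark API: `ToyDataset`, its methods, and the sample records are hypothetical names invented for this illustration, and the sketch ignores distribution, partitioning, and spilling to disk entirely.

```python
class ToyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument function that yields records
        self._persist = False     # whether to keep results in memory after first use
        self._cache = None        # materialized records, once persisted
        self.runs = 0             # how many times this dataset was recomputed

    def map(self, fn):
        # Lazy: nothing runs until an action such as count() or sum() is called.
        return ToyDataset(lambda: (fn(x) for x in self._records()))

    def filter(self, pred):
        return ToyDataset(lambda: (x for x in self._records() if pred(x)))

    def persist(self):
        self._persist = True
        return self

    def _records(self):
        if self._cache is not None:
            return self._cache            # served from memory, no recomputation
        self.runs += 1
        data = list(self._compute())
        if self._persist:
            self._cache = data
        return data

    def count(self):
        return len(self._records())

    def sum(self):
        return sum(self._records())


raw_lines = ["3", "14", "15", "92", "65"]   # made-up sample records

# Without persist(): every query re-runs the parsing step.
unpersisted = ToyDataset(lambda: raw_lines).map(int)
unpersisted.filter(lambda n: n % 2 == 0).count()
unpersisted.sum()
unpersisted.filter(lambda n: n > 10).count()

# With persist(): parsing runs once and three queries share the result.
parsed = ToyDataset(lambda: raw_lines).map(int).persist()
even_count = parsed.filter(lambda n: n % 2 == 0).count()
total = parsed.sum()
large_count = parsed.filter(lambda n: n > 10).count()

print(unpersisted.runs, parsed.runs)
```

The only difference between the two runs is the `persist()` call: the unpersisted dataset re-parses its input for each of the three queries, while the persisted one parses once and answers the rest from memory. That reuse across batch, interactive, and ETL-style queries is the idea behind the in-memory engine described above.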





© 2019 Data Science Central ®
