Apache Spark is a fast, general-purpose, open-source engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. It can run analytic applications up to 100 times faster than older disk-based engines such as Hadoop MapReduce, a figure the Spark project cites for in-memory workloads. You can interface Spark with Python through PySpark, the Spark Python API, which exposes the Spark programming model to Python.
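As a minimal sketch of what exposing the Spark programming model to Python can look like, the snippet below counts words with the RDD API on a local Spark session. It assumes the `pyspark` package and a Java runtime are installed; the function names `split_words` and `count_words`, the app name, and the `local[*]` master setting are illustrative choices and do not come from the cheat sheet.

```python
from operator import add


def split_words(line):
    """Lowercase a line and split it on whitespace."""
    return line.lower().split()


def count_words(lines, app_name="pyspark-word-count"):
    """Count word occurrences in a list of strings using a local Spark session.

    PySpark is imported inside the function (an illustrative choice) so that
    split_words stays usable on machines without Spark installed.
    """
    from pyspark.sql import SparkSession

    # Start (or reuse) a Spark session that runs on all local cores.
    spark = SparkSession.builder.master("local[*]").appName(app_name).getOrCreate()
    try:
        # Distribute the input lines as an RDD.
        rdd = spark.sparkContext.parallelize(lines)
        # Classic map/reduce word count: emit (word, 1) pairs, then sum per word.
        counts = rdd.flatMap(split_words).map(lambda word: (word, 1)).reduceByKey(add)
        # Bring the (small) result back to the driver as a plain dict.
        return dict(counts.collect())
    finally:
        # Release the local Spark resources.
        spark.stop()
```

Calling `count_words(["Spark is fast", "PySpark exposes Spark"])` should return a dictionary that maps each lowercased word to its count. The same transformations (`flatMap`, `map`, `reduceByKey`) exist in Spark's Scala and Java APIs, which is the sense in which PySpark mirrors the Spark programming model.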

The cheat sheet below was produced by DataCamp. The original version is also available in PDF format.
