Marc Borowczak's Blog (4)

Migrating an Excel Spreadsheet Directly to HDFS and Spark 2.0.1 (Part 2)

In a previous post, we reviewed a path to leverage legacy Excel data and import CSV files through MySQL into Spark 2.0.1. This situation arises frequently in businesses where data retention did not always take the database route…

Added by Marc Borowczak on October 23, 2016 at 5:30am — No Comments

Expand Machine Learning Tools (Part 2): Toree Scala and Python in Jupyter Notebook

Apache Spark, a Data Analytics favorite, is progressing as a reference standard for Big Data and a “fast and general engine for large-scale data processing”. In our previous post, we detailed how to expand ML tools using a PySpark kernel and leverage the …

Added by Marc Borowczak on June 9, 2016 at 10:30am — No Comments

Expand Machine Learning tools: Configure Jupyter/IPython notebook for PySpark 1.6.1

Data Analytics favorites include Apache Spark, which is becoming a reference standard for Big Data, as a “fast and general engine for large-scale data processing”. Its built-in PySpark interface can run as a Jupyter notebook, but recent posts didn’t quite allow me to do…

Added by Marc Borowczak on May 26, 2016 at 6:43am — No Comments
