All Videos Tagged Spark R (Data Science Central) - Data Science Central
2020-06-03T03:08:03Z
https://www.datasciencecentral.com/video/video/listTagged?tag=Spark+R&rss=yes&xn_auth=no

Parallelize R Code Using Apache® Spark™
tag:www.datasciencecentral.com,2017-08-15:6448529:Video:607234
2017-08-15T23:37:42.031Z
Tim Matteson (https://www.datasciencecentral.com/profile/2edcolrgc4o4b)
https://www.datasciencecentral.com/video/parallelize-r-code-using-apache-spark

R is the latest language added to Apache Spark, and the SparkR API is slightly different from PySpark. SparkR’s evolving interface to Apache Spark offers a wide range of APIs and capabilities to data scientists and statisticians. With the release of Spark 2.0, and subsequent releases, the R API officially supports executing user code on distributed data.
This is done primarily through a family of apply() functions.

In this Data Science Central webinar, we will:

● Provide an overview of this new functionality in SparkR.
● Show how to use this API through dapply(), with some changes to regular R code.
● Focus on how to correctly use this API to parallelize existing R packages.
● Consider performance and examine correctness when using the apply family of functions in SparkR.

Speaker: Hossein Falaki, Software Engineer, Databricks Inc.
Hosted by: Bill Vorhies, Editorial Director, Data Science Central