Subscribe to DSC Newsletter

All Blog Posts Tagged 'Scala' (3)

Building machine learning models in Apache Spark using SCALA in 6 steps

Introduction:

When dealing with building machine learning models, Data scientists spend most of the time on 2 main tasks when building machine learning models

Pre-processing and Cleaning

The major portion of time goes in to collecting, understanding, and analysing, cleaning the data and then building features. All the above steps mentioned are very important and critical to build successful machine learning…

Continue

Added by Rohit Walimbe on April 21, 2019 at 9:00pm — 1 Comment

Record linking with Apache Spark’s MLlib & GraphX

The challenge

Recently a colleague asked me to help her with a data problem, that seemed very straightforward at a glance. 

She had purchased a small set of data from the chamber of commerce (Kamer van Koophandel: KvK) that contained roughly 50k small sized companies (5–20FTE), which can be hard to find online.

She noticed that many of those companies share the same address,…

Continue

Added by Tom Lous on April 4, 2017 at 11:00pm — 5 Comments

Expand Machine Learning Tools (Part2): Toree Scala and Python in Jupyter Notebook

Data Analytics favorite Apache Spark,  is progressing as a reference standard for Big Data, and a “fast and general engine for large-scale data processing”. In our previous post, we detailed how to expand ML tools using a PySpark kernel and leverage the …

Continue

Added by Marc Borowczak on June 9, 2016 at 10:30am — No Comments

Monthly Archives

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

1999

Videos

  • Add Videos
  • View All

© 2020   TechTarget, Inc.   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service

console.log("HostName");