Fast forward transformation process in data science with Apache Spark Data Curation: Curation is a critical process in data science that helps to prepare data for featur...
This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. The previous digest in this series...
Summary: Got a good AUC on your holdout data? Think that proves it’s safe to put the model into production? This article shows you some of the pitfalls in t...
In this day and age, you can’t go a day without hearing terms such as “data science,” “big data,” or “analytics.” These terms have been thrown around to app...
Contributed by Jielei (Emma) Zhu. She is taking the NYC Data Science Academy 12-week, full-time Data Science Bootcamp program from July 5th to September 22nd, 2016. This...
Ever wondered why people have an affinity for certain apps over others, even though the other apps provide the same functionality? Or why they are more attracted to certa...
Open source software solutions have become so powerful that large corporations have started preparing their traditional business analysts to move to open source software, parti...
In this note I will quickly talk about CSV files in a basic scenario. I have uploaded two CSV files with customer complaints to my GitHub account. The complaints are unique...
Data frames are tables for storing data. If you recall the vectors from the first R notes, a data frame can be imagined as a collection of vectors of the same length. W...
Key Features: Put machine learning principles into practice to solve real-world problems. Get to grips with Python’s impressive range of machine learning libraries an...