All Blog Posts (5,175)

Thursday News: Blockchain, AI, NLP, Python, R, SQL, Spark, Regression...

Here is our selection of featured articles and resources posted since Monday.



Added by Vincent Granville on July 19, 2018 at 7:30am — No Comments

Practical Apache Spark in 10 minutes. Part 2 - RDD

Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). It is a fault-tolerant collection of elements that can be operated on in parallel. RDDs can be created from Hadoop InputFormats (such as…


Added by Igor Bobriakov on July 17, 2018 at 11:07pm — No Comments

Machine Learning with C++ - Classification with Shogun library

Hello, with this article I'm starting a series of articles about full-featured C++ machine learning frameworks. This article covers how to use the Shogun library to solve a classification problem. Shogun is an open-source machine learning library that offers a wide range of machine learning algorithms. From my point of view it's not very popular among professionals, but it has a lot of fans among enthusiasts and…


Added by Kyrylo Kolodiazhnyi on July 17, 2018 at 11:17am — No Comments

Comparing AI Strategies – Vertical vs. Horizontal

Summary:  Getting an AI startup to scale for an IPO is currently elusive.  Several different strategies are being discussed around the industry and here we talk about the horizontal strategy and the increasingly favored vertical strategy.


Looks like there’s a…


Added by William Vorhies on July 17, 2018 at 7:09am — No Comments

A Basic Introduction to Reading and Writing CSV Files using Pandas

The Pandas DataFrame makes life a lot easier if you work with data. A lot of data comes in CSV format, and reading a CSV file into a Pandas DataFrame is quite easy using read_csv. In the video below we will learn the basics of loading a CSV file into a Pandas DataFrame object. …
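As a rough illustration of the read_csv call mentioned above, here is a minimal sketch. The CSV content, the column names, and the variable names are all made up for the example, and an in-memory buffer stands in for a file on disk so that the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a file path or any file-like object;
# StringIO keeps this sketch self-contained.
df = pd.read_csv(io.StringIO(csv_text))

# Writing back out: with no path argument, to_csv returns the CSV as a string.
round_trip = df.to_csv(index=False)

print(df)
```

With a real file, the buffer would simply be replaced by a path, for example `pd.read_csv("data.csv")`, where the file name is again only a placeholder.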


Added by Erik Marsja on July 17, 2018 at 6:01am — No Comments

Comparison of Top 6 Python NLP Libraries

Natural language processing (NLP) has become very popular, a trend that is especially noticeable against the background of deep learning's development. NLP is a field of artificial intelligence aimed at understanding and…


Added by Igor Bobriakov on July 17, 2018 at 3:03am — No Comments

Digital Meets 5G; Shaping the CxO Agenda

The age of technology is way past its nascent stage and has grown exponentially during the last decade. In my role, travelling to events and meeting with thought leaders, I am aware of the developments made across different technological fronts.…


Added by Ronald van Loon on July 17, 2018 at 2:45am — No Comments

Blockchain + Analytics: Enabling Smart IOT

Autonomous cars are racing down the highway at speeds exceeding 100 MPH when suddenly a car a half-mile ahead blows out a tire sending dangerous debris across 3 lanes of traffic.  Instead of relying upon sending this urgent, time-critical distress information to the world via the cloud, the cars on that particular section of the highway use peer-to-peer, immutable communications to inform all vehicles in the area of the danger so that they can slow down and move to unobstructed…


Added by Bill Schmarzo on July 16, 2018 at 3:47pm — 3 Comments

A Little SQL with a Little R

My nephew's a very impressive young man. Five years ago, he received a PhD in Biochemistry/Molecular Biology from a prestigious university, earning numerous…


Added by steve miller on July 16, 2018 at 12:02pm — 1 Comment

3 Reasons to Learn the Expected Value Framework for Data Analysis

One of the most difficult and most critical parts of implementing data science in business is quantifying the return-on-investment or ROI.  In this article, we highlight three reasons you need to learn the Expected Value Framework, a framework that connects the machine learning classification model to ROI.
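The excerpt does not spell out the framework, so the following is only a sketch of one common formulation, assumed here: weight the value of each confusion-matrix outcome by its probability and sum the results. The function name, the outcome rates, and the dollar values are all hypothetical.

```python
def expected_value(rates, values):
    """Sum of outcome probability times outcome value over the four outcomes."""
    return sum(rates[outcome] * values[outcome] for outcome in rates)


# Hypothetical joint probabilities of each classifier outcome (they sum to 1).
rates = {"tp": 0.10, "fp": 0.05, "tn": 0.80, "fn": 0.05}

# Hypothetical benefit (+) or cost (-) per case, in dollars.
values = {"tp": 100.0, "fp": -20.0, "tn": 0.0, "fn": -50.0}

ev = expected_value(rates, values)
print(f"Expected value per case: {ev:.2f}")
```

Multiplying the per-case figure by the number of cases scored gives a rough ROI estimate that can be compared across models or classification thresholds.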



Added by Matt Dancho on July 16, 2018 at 9:42am — No Comments

Remotely Send R and Python Execution to SQL Server from Jupyter Notebooks



Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around. Instead of transferring large and sensitive data over the network or losing accuracy on ML training with sample csv files, you can have your R/Python code execute within your database. You can work in Jupyter Notebooks,…


Added by Kyle Weller on July 16, 2018 at 8:31am — No Comments

Weekly Digest, July 16

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions


Added by Vincent Granville on July 15, 2018 at 9:30am — 1 Comment

Critically Reading Scientific Papers

Critically reading scientific papers is critical for data scientists working in some areas, especially those working in health. With that in mind, here are some key considerations in reading scientific (peer-reviewed, grey literature) papers:

Theory: Is the theory sound? Are there theoretical issues in the design that cause…


Added by Howard Friedman on July 13, 2018 at 9:11am — No Comments

The art of data science...

In 2018, Fast Company declared ‘Data Scientist’ as the best job in America for the third…


Added by Ziyad Nazem on July 13, 2018 at 8:08am — No Comments

Machine learning in finance: Why, what & how

Machine learning in finance may work magic, even though there is no magic behind it (well, maybe just a little bit). Still, the success of a machine learning project depends more on building efficient infrastructure, collecting suitable datasets, and applying the right algorithms.

Machine learning is making significant inroads in the financial services industry. Let’s see why financial companies should care, what solutions they can implement with AI and machine learning, and how exactly…


Added by Tetiana Boichenko on July 13, 2018 at 4:12am — No Comments

Practical Apache Spark in 10 minutes. Part 1 - Ubuntu installation

Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It was originally developed at UC Berkeley in 2009; Databricks was founded later, in 2013, by the creators of Spark.…


Added by Igor Bobriakov on July 13, 2018 at 2:33am — No Comments

Best Machine Learning Tools: Experts’ Top Picks

The best-trained soldiers can’t fulfill their mission empty-handed. Data scientists have their own weapons: machine learning (ML) software. There is already a cornucopia of articles listing reliable machine learning tools with in-depth descriptions of their functionality. Our goal, however, was to get the feedback of industry experts.

And that’s why we interviewed data science practitioners (gurus, really) regarding the useful tools they…


Added by Kateryna Lytvynova on July 13, 2018 at 2:00am — No Comments

Thursday News: NLP, AI, Deep Learning, Sensor Data, Death of the Data Scientist, DataViz

Here is our selection of articles and technical contributions featured since Monday:

Technical Contributions


Added by Vincent Granville on July 12, 2018 at 9:00am — No Comments

The Death of the Data Scientist

In 2018 Fast Company declared the Data Scientist the best job for the third year in a row, which I wholeheartedly agree with (…


Added by Matt Tucker on July 12, 2018 at 2:30am — 16 Comments

Top 10 Data Science Use Cases in Insurance

The insurance industry is regarded as one of the most competitive and least predictable business spheres. It is directly related to risk. Therefore, it has always been dependent on statistics. Nowadays, data science has changed this dependence forever.…


Added by Igor Bobriakov on July 11, 2018 at 11:18pm — No Comments

