Here is our selection of featured articles and resources posted since Monday.
Added by Vincent Granville on July 19, 2018 at 7:30am — No Comments
Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). It is a fault-tolerant collection of elements that can be operated on in parallel. RDDs can be created from Hadoop InputFormats (such as…Continue
Added by Igor Bobriakov on July 17, 2018 at 11:07pm — No Comments
Hello, with this article I'm starting a series of articles about full-featured C++ machine learning frameworks. This article covers how to use the Shogun library to solve a classification problem. Shogun is an open-source machine learning library that offers a wide range of machine learning algorithms. From my point of view it's not very popular among professionals, but it has a lot of fans among enthusiasts and…Continue
Added by Kyrylo Kolodiazhnyi on July 17, 2018 at 11:17am — No Comments
Summary: Getting an AI startup to scale for an IPO is currently elusive. Several different strategies are being discussed around the industry and here we talk about the horizontal strategy and the increasingly favored vertical strategy.
Added by William Vorhies on July 17, 2018 at 7:09am — No Comments
The Pandas DataFrame makes life a lot easier if you work with data. A lot of data comes in CSV format, and reading a CSV file into a Pandas DataFrame is quite easy with read_csv. In the video below we will learn the basics of loading a CSV file into a Pandas DataFrame object. …Continue
Added by Erik Marsja on July 17, 2018 at 6:01am — No Comments
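As a rough illustration of the read_csv workflow described in the teaser above, the sketch below loads a small CSV into a DataFrame. The column names and values are invented for the example, and an in-memory io.StringIO buffer stands in for a file on disk; passing a file path to read_csv works the same way.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
CSV_TEXT = """name,age,city
Alice,34,Stockholm
Bob,29,Berlin
Carol,41,Lisbon
"""


def load_people(source):
    """Read CSV data from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


df = load_people(io.StringIO(CSV_TEXT))
print(df.head())   # first rows of the table
print(df.dtypes)   # column types inferred by read_csv
```

The first row is taken as the header by default, and column types are inferred from the data.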
Natural language processing (NLP) is becoming very popular, a trend that is especially noticeable against the background of deep learning development. NLP is a field of artificial intelligence aimed at understanding and…Continue
Added by Igor Bobriakov on July 17, 2018 at 3:03am — No Comments
The age of technology is way past its nascent stage and has grown exponentially during the last decade. In my role, travelling to events and meeting with thought leaders, I am aware of the developments made across different technological fronts.…Continue
Added by Ronald van Loon on July 17, 2018 at 2:45am — No Comments
Autonomous cars are racing down the highway at speeds exceeding 100 MPH when suddenly a car a half-mile ahead blows out a tire, sending dangerous debris across 3 lanes of traffic. Instead of relying upon sending this urgent, time-critical distress information to the world via the cloud, the cars on that particular section of the highway use peer-to-peer, immutable communications to inform all vehicles in the area of the danger so that they can slow down and move to unobstructed…Continue
My nephew's a very impressive young man. Five years ago, he received a PhD in Biochemistry/Molecular Biology from a prestigious university, earning numerous…
One of the most difficult and most critical parts of implementing data science in business is quantifying the return on investment (ROI). In this article, we highlight three reasons you need to learn the Expected Value Framework, a framework that connects the machine learning classification model to ROI. …Continue
Added by Matt Dancho on July 16, 2018 at 9:42am — No Comments
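The teaser above does not spell out the calculation, so the following is only a minimal sketch of one common way to frame expected value for a classifier: weight each confusion-matrix outcome by its probability and by a business payoff. The counts, the payoffs, and the function name expected_value are all hypothetical and are not taken from the article.

```python
def expected_value(confusion_counts, payoffs):
    """Expected payoff per case: sum of P(outcome) * value(outcome)."""
    total = sum(confusion_counts.values())
    return sum(
        (count / total) * payoffs[outcome]
        for outcome, count in confusion_counts.items()
    )


# Hypothetical confusion-matrix counts for 1,000 scored customers.
counts = {"tp": 60, "fp": 40, "fn": 20, "tn": 880}

# Hypothetical payoffs: a correctly targeted customer yields 90,
# a wasted offer costs 10, and untargeted customers yield nothing.
payoffs = {"tp": 90.0, "fp": -10.0, "fn": 0.0, "tn": 0.0}

ev = expected_value(counts, payoffs)
print(f"Expected value per customer: {ev:.2f}")
```

With these made-up numbers the arithmetic is 0.06 × 90 − 0.04 × 10, or about 5 per customer. Changing the classification threshold changes the counts, which is how a model choice can be tied back to ROI.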
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around. Instead of transferring large and sensitive data over the network or losing accuracy on ML training with sample CSV files, you can have your R/Python code execute within your database. You can work in Jupyter Notebooks,…Continue
Added by Kyle Weller on July 16, 2018 at 8:31am — No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.
Featured Resources and Technical Contributions …Continue
Reading scientific papers critically is essential for data scientists working in some areas, especially those working in health. With that in mind, here are some key considerations when reading scientific papers (peer-reviewed or grey literature):
Theory: Is the theory sound? Are there theoretical issues in the design that cause…Continue
Added by Howard Friedman on July 13, 2018 at 9:11am — No Comments
Added by Ziyad Nazem on July 13, 2018 at 8:08am — No Comments
Machine learning in finance may work magic, even though there is no magic behind it (well, maybe just a little bit). Still, the success of a machine learning project depends more on building efficient infrastructure, collecting suitable datasets, and applying the right algorithms.
Machine learning is making significant inroads in the financial services industry. Let’s see why financial companies should care, what solutions they can implement with AI and machine learning, and how exactly…Continue
Added by Tetiana Boichenko on July 13, 2018 at 4:12am — No Comments
Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It was originally developed at UC Berkeley in 2009; Databricks was founded later, in 2013, by the creators of Spark.…Continue
Added by Igor Bobriakov on July 13, 2018 at 2:33am — No Comments
The best trained soldiers can’t fulfill their mission empty-handed. Data scientists have their own weapons — machine learning (ML) software. There is already a cornucopia of articles listing reliable machine learning tools with in-depth descriptions of their functionality. Our goal, however, was to get the feedback of industry experts.
And that’s why we interviewed data science practitioners (gurus, really) regarding the useful tools they…Continue
Added by Kateryna Lytvynova on July 13, 2018 at 2:00am — No Comments
Here is our selection of articles and technical contributions featured since Monday:
Added by Vincent Granville on July 12, 2018 at 9:00am — No Comments
In 2018 Fast Company declared the Data Scientist the best job for the third year in a row, which I wholeheartedly agree with (…Continue
The insurance industry is regarded as one of the most competitive and least predictable business spheres. It is directly related to risk. Therefore, it has always been dependent on statistics. Nowadays, data science has changed this dependence forever.…Continue
Added by Igor Bobriakov on July 11, 2018 at 11:18pm — No Comments