All Blog Posts

### The Next Big Thing in AI/ML is…

Summary:  AI/ML itself is the next big thing for many fields if you’re on the outside looking in.  But if you’re a data scientist it’s possible to see those advancements that will propel AI/ML to its next phase of utility.

“The Next Big Thing in AI/ML is…” as the lead to an article is probably the most…

Added by William Vorhies on October 21, 2019 at 9:18am — No Comments

### Weekly Digest, October 21

Monday newsletter published by Data Science Central. Previous editions can be found here.

Announcement

• The true…
Added by Vincent Granville on October 20, 2019 at 4:30pm — No Comments

### A short introduction to Log Models

Why do we take logs of variables in regression analysis?

We should remember that a regression equation has two parts:

i) The Dependent variable (Predictand)

ii) The Independent variables (Predictors), which can be one or more and can be of different types (Categorical or Continuous).

The nature of the regression that we should run depends on the type of Dependent variable that we are dealing with in our model. For example, if the dependent variable is Continuous…

Added by Sibashis Chakraborty on October 20, 2019 at 8:57am — No Comments

### Why Every Data Scientist Needs A Data Engineer

This article was written by Laurel Brunk.

Data scientists spend most of their time (up to 79%!) on the part of their job they hate most.

The Role of…

Added by Andrea Manero-Bastin on October 20, 2019 at 2:00am — 1 Comment

### P-Value Explained in One Picture

P-values ("Probability values") are one way to test if the result from an experiment is statistically significant. This picture is a visual aid to p-values, using a theoretical experiment for a pizza business.…
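As a rough illustration of what a p-value measures, here is a minimal sketch of a one-sided exact binomial test. It is not taken from the post or its picture: the function name `binomial_p_value`, the numbers in the usage line, and the choice of a binomial test are all assumptions made for this example.

```python
from math import comb


def binomial_p_value(successes, trials, null_rate):
    """One-sided p-value: the probability of seeing at least `successes`
    successes in `trials` independent trials if the true success rate
    were `null_rate` (the null hypothesis)."""
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical experiment (numbers invented for illustration):
# 9 successes out of 10 trials, tested against a null success rate of 50%.
p = binomial_p_value(9, 10, 0.5)
print(p)
```

A small p-value means the observed result would be unlikely if the null hypothesis were true; it is then compared against a significance level chosen in advance.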

Added by Stephanie Glen on October 18, 2019 at 8:47am — 2 Comments

### Building an exoplanet detection model using TensorFlow's prebuilt estimator for gradient boosting trees

In this post we will talk about the Kepler dataset from Kaggle competitions and use it to build an exoplanet detection model using TensorFlow's prebuilt estimator for gradient boosting trees known as the BoostedTreesClassifier.

## Detecting exoplanets in outer space

For the project explained in this post, we use the Kepler labeled time series data from Kaggle. This dataset is derived mainly from the Campaign 3 observations of the mission by…

Added by Packt Publishing on October 18, 2019 at 1:30am — No Comments

### What is the role of CDO, How Do They Help in Structuring Data Functions?

The explosion of data in the world – right from the data collected from cameras to the data gathered from visitors’ actions on websites – is staggering. With new types of data pouring in and the applications of data analysis becoming vast, companies need to regulate the unprecedented data.…

Added by Divyesh Aegis on October 18, 2019 at 1:00am — No Comments

### 5 Common Issues that Wreck Database Security and How to Solve them

With data growing at its highest rate ever, cyberattacks and digital warfare aimed at getting hold of crucial data are on the rise. Malicious actors primarily target the data in organizations; if it’s important to you, it’s important to them too.

Cybercriminals often target databases since they mostly store sensitive data — customer data, financial…

Added by Evan Morris on October 17, 2019 at 7:19pm — No Comments

### How Xpath Plays Vital Role In Web Scraping

XPath is a language for finding information in structured documents like XML or HTML. You can say that XPath is (sort of) SQL for XML or HTML files. XPath is used to navigate through elements and attributes in an XML or HTML document.
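To make the "SQL for XML" comparison concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document, class names, and links are hypothetical and invented for this illustration; real-world HTML is often not well-formed XML and usually needs a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed document invented for this illustration.
DOC = """
<html>
  <body>
    <div class="post">
      <a href="/first">First post</a>
    </div>
    <div class="ad">
      <a href="/promo">Promo</a>
    </div>
    <div class="post">
      <a href="/second">Second post</a>
    </div>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# Select every <a> element that is a direct child of a <div class="post">,
# wherever that <div> appears in the document.
links = root.findall(".//div[@class='post']/a")

hrefs = [a.get("href") for a in links]  # attribute values
titles = [a.text for a in links]        # element text
print(hrefs)
print(titles)
```

The path expression plays the role of a query: `.//div` walks down to any `div` element, `[@class='post']` filters on an attribute, and `/a` steps to the child element.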

To understand XPath we must be clear about elements and nodes which are the building blocks of XML and HTML. Let’s talk about them. Here is an example element in an HTML…

Added by Sandra Moraes on October 17, 2019 at 7:00pm — No Comments

### Thursday News, October 17

Here is our selection of featured articles and technical resources posted since Monday:

Announcement

Resources

Added by Vincent Granville on October 17, 2019 at 9:00am — No Comments

### RADR: A Powerful Visual Risk Analytics Tool

by Vic Diloreto, Director, Software Products, Elder Research

The Risk Assessment Data Repository (RADR) is a powerful risk analytics platform used to enhance productivity in the investigation of fraud, waste and abuse. This server-based, data analytics product  fuses data from…

Added by Paul Derstine on October 17, 2019 at 5:01am — No Comments

### Simulated Stock Market: Original Gaming App Played with Real Money

Earlier in 2019, I wrote an article entitled New Stock Trading and Lottery Game Rooted in Deep Math, see here. It features a number guessing game that, depending on the parameters, mimics either a neutral stock market or a lottery. The gain depends on the distance between your guess and the winning numbers. The average gain is zero, and…

Added by Vincent Granville on October 16, 2019 at 7:30am — No Comments

### Federated Machine Learning - Collaborative Machine Learning without Centralised Training Data

Like a failed communist state, traditional machine learning centralises training of a model on a single machine. Centralising data in a single location is not always possible, for a variety of reasons such as slow network connections and legal constraints. These…

Added by Brett Drury on October 16, 2019 at 2:00am — No Comments

### Smart Transportation System: Boon of IoT to the Transportation Industry

The whole world is evolving at lightning speed due to technological advancement. Most business sectors have opted for one technological solution or another to operate their business smoothly and to earn huge profits. The advent of advanced technologies like…

Added by Shady Johnson on October 16, 2019 at 1:30am — No Comments

### Visually Explained: Human Vs. Artificial Intelligence - Who Wins The Race?

Human and artificial intelligence compare just as well as oranges and apples do. Nonetheless, the broader public does precisely that, including a vast portion of businesses and organizations. Hence, let's do a thought experiment: If we were to compare human and artificial intelligence, how would we go about it? And what would be the possible conclusions from that comparison?…

Added by Rafael Knuth on October 14, 2019 at 9:30am — No Comments

### Surprise – Model Improvements Don’t Always Drive Business Impact

Summary:  Data Scientists from Booking.com share many lessons learned in the process of constantly improving their sophisticated ML models.  Not the least of which is that improving your models doesn’t always lead to improving business outcomes.

The adoption of AI/ML in business is at an inflection point. …

Added by William Vorhies on October 14, 2019 at 9:28am — 1 Comment

### Does Your Hypothesis Development Canvas Tell a Story?

As Kevin Lynch, CTO of The Information Lab in Dublin, was describing how his organization uses the Hypothesis Development Canvas, it occurred to me that Kevin was actually using the canvas to tell a story about how his organization uses data science to uncover new sources of value (see Figure 1).…

Added by Bill Schmarzo on October 14, 2019 at 5:00am — No Comments

### What is Data Science?

Data Science continues to be a hot topic among skilled professionals and organizations that are focusing on collecting data and drawing meaningful insights out of it to aid business growth. A lot of data is an asset to any organization, but only if it is processed efficiently. The need for storage grew multifold when we entered the age of big data. Until 2010, the major focus was on building state-of-the-art infrastructure to…

Added by Priyansha Kansal on October 14, 2019 at 12:24am — No Comments

### Weekly Digest, October 14

Monday newsletter published by Data Science Central. Previous editions can be found here.

Announcement…

Added by Vincent Granville on October 13, 2019 at 11:00am — No Comments

### 40+ Modern Tutorials Covering All Aspects of Machine Learning

This list of lists contains books, notebooks, presentations, cheat sheets, and tutorials covering all aspects of data science, machine learning, deep learning, statistics, math, and more, with most documents featuring Python or R code and numerous illustrations or case studies. All this material is available for free, and consists of content mostly created in 2019 and 2018, by various top experts in their respective fields. A few of these documents are available on LinkedIn: see last section…

Added by Capri Granville on October 12, 2019 at 7:30am — 1 Comment

