
Featured Blog Posts – August 2017 Archive (112)

Book: R for Data Science

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.…


Added by L.V. on August 21, 2017 at 10:00am — No Comments

How Big Is A Terabyte of Data

How Big Is a Terabyte of Data?

By JIANG Buxing

It seems that a distance of one mile isn’t long, and that a cubic mile isn’t that big compared with the size of the earth. You may be surprised if I tell you that the entire world’s population could fit in a cubic mile of space. Hendrik Willem van Loon, a Dutch-American writer, once wrote something similar in one of his books.

Teradata is a famous provider of database…


Added by JIANG Buxing on August 20, 2017 at 9:30pm — No Comments

R Linear Regression

Basics of Linear Regression

Regression analysis is a statistical tool for determining relationships between different types of variables. Variables that remain unaffected by changes in other variables are known as independent variables (also called predictor or explanatory variables), while those that are affected are known as dependent variables (also called response variables).
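As a minimal sketch of the relationship described above, the following fits a straight line between one independent variable and one dependent variable using ordinary least squares. The function name `fit_line` and the `hours`/`score` data are hypothetical and chosen only for illustration; they do not come from the original post.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: the independent variable (hours) predicts
# the dependent variable (score); here score = 50 + 2 * hours exactly.
hours = [1, 2, 3, 4, 5]
score = [52, 54, 56, 58, 60]

intercept, slope = fit_line(hours, score)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

Because the illustrative data lie exactly on a line, the fit recovers that line; real data would leave residuals around it.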

Linear regression is a statistical procedure…


Added by Shreya Gupta on August 20, 2017 at 9:00pm — No Comments

Robust Attacks on Machine Learning Models

This is a nightmare! Tadayoshi Kohno, a professor in the Department of Computer Science and Engineering at the University of Washington, manipulated a STOP sign in a typical "graffiti way" so that it was recognized as a 45 mph speed limit sign by typical AI software, such as that built into a Tesla S. It's very likely that it will become a sport to send Tesla drivers to hell. …


Added by Vincent Granville on August 20, 2017 at 2:00pm — No Comments

Why Deep Learning is Taking off? Season 1 : Part 1

First it was Machine Learning, and now all of a sudden Deep Learning is stealing the thunder even from Machine Learning. So what's the difference, and why has Deep Learning all of a sudden become the most buzzed-about new technology of our era? Is Deep Learning a false idol being ubiquitously worshiped, or is it the panacea which is…

Added by Ammar A. Raja on August 20, 2017 at 10:00am — 1 Comment

Are you drowning in Data Lake?


Added by Sandeep Raut on August 19, 2017 at 8:00am — No Comments

Data Science Simplified Part 8: Qualitative Variables in Regression Models

The last few blog posts of this series discussed regression models. Fernando has selected the best model. He has built a multivariate regression model. The model takes the following shape:

price = -55089.98 + 87.34 engineSize + 60.93…


Added by Pradeep Menon on August 19, 2017 at 6:30am — No Comments

Top Graphical Models Applications in Real World

1. Objective

This article explains various applications of graphical models in real life, such as manufacturing, finance, steel production, and handwriting recognition. Finally, we discuss a case study on the use of graphical models at Volkswagen.…


Added by Shreya Gupta on August 18, 2017 at 7:00pm — No Comments

Weekly Digest, August 21

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.


  • Looking to further your experience and knowledge in business analytics, and set yourself apart from colleagues and competition? Michigan State University – ranked in the top 25 U.S.…

Added by Vincent Granville on August 18, 2017 at 5:30pm — No Comments

Machine Learning Vs. Statistics

This article was written by Aatash Shah.

Many people wonder: what’s the difference between statistics and machine learning? Is there such a thing as machine learning vs. statistics?

From a traditional data…


Added by Amelia Matteson on August 18, 2017 at 11:30am — 1 Comment

Data Cleansing with Apache Spark and Optimus

Outdated, inaccurate, or duplicated data won’t drive optimal data-driven solutions. When data is inaccurate, leads are harder to track and nurture, and insights may be flawed. The data on which you base your big data strategy must be accurate, up-to-date, as complete as possible, and should not contain duplicate entries. Clean data results in…


Added by Favio Vázquez on August 18, 2017 at 8:00am — No Comments

Top 25 Hadoop Interview Questions Prepared by Experts

1) Compare Hadoop & Spark


Criteria              Hadoop     Spark
Dedicated storage     HDFS       None
Speed of processing   average    …


Added by Venkatesan M on August 18, 2017 at 12:00am — 2 Comments

Real-Life Applications of Support Vector Machines

Applications of SVM in Real World 

SVMs are supervised learning algorithms. The aim of using an SVM is to correctly classify unseen data. SVMs have a number of applications in several fields.

Some common applications of SVMs are:

  • Face detection – SVMs classify parts of the image as face and non-face and create a square boundary around the face.
  • Text and hypertext…

Added by Sheetal Sharma on August 17, 2017 at 11:30pm — No Comments

Contingency Tables in R

1. Objective

This R tutorial is all about contingency tables in R. We first introduce R contingency tables and the different ways to create them. The tutorial also covers complex (flat) tables in R, cross tabulation in R, recreating the original data from contingency tables in R, and everything related to R contingency tables.…


Added by Shreya Gupta on August 17, 2017 at 7:00pm — No Comments

When AI (Artificial Intelligence) Goes Wrong...

The intelligence in AI is computational intelligence, and a better term could be Automated Intelligence. But when it comes to good judgment, AI is not smarter than the human brain that designed it. Many automated systems perform poorly, to the point that you wonder whether AI is an abbreviation for Artificial Innumeracy.

Critical systems - automated piloting, running a power plant - usually do well with AI and automation, as considerable testing is done before deploying these…


Added by Vincent Granville on August 17, 2017 at 11:30am — 4 Comments

Generative Adversarial Networks (GANs): Engine and Applications

Generative adversarial networks (GANs) are a class of neural networks that are used in unsupervised machine learning. They help to solve such tasks as image generation from descriptions, getting high resolution images from low resolution ones, predicting which drug…


Added by Luba Belokon on August 17, 2017 at 6:30am — No Comments

Data Lineage: The History of your Data

Data Denialism

A common scenario that data analysts in general encounter is what I like to describe as "data denialism". Often, and especially while consulting, an analyst will find that the data tells a different story than what the customer holds to be true. It is also often the case that, when presenting this finding, the customer will outright deny the evidence, asserting that either the data or the analysis must be wrong. For example, it may be that a retailer focused on the…


Added by Jesus Ramos on August 16, 2017 at 8:00am — No Comments

Data Science Simplified Part 7: Log-Log Regression Models

In the last few blog posts of this series, we discussed the simple linear regression model, the multivariate regression model, and methods for selecting the right model.

Fernando has now created a better model.…


Added by Pradeep Menon on August 16, 2017 at 3:00am — No Comments

Windows Powershell Commands for Beginners


  • Introduction to Powershell
  • Need for Powershell
  • Background of Powershell
  • Tools
  • Why it's better than alternatives
  • Top administrative Powershell commands
  • Working with the pipeline
  • Selecting, Sorting, Measuring, Exporting, Importing, Converting, Filtering, Passing Data in…

Added by Venkatesan M on August 15, 2017 at 8:30pm — No Comments

Neural Network Algorithms - Learn How To Train ANN

Top Neural Network Algorithms

A neural network learns from a sample of the population under study. During learning, the value delivered by the output unit is compared with the actual value, and the weights of all units are then adjusted so as to improve the prediction.
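The compare-then-adjust loop described above can be sketched with a single output unit trained by the classic perceptron update rule. This is an illustrative assumption, not necessarily one of the algorithms the original post goes on to list; the names `train_unit`, `predict`, and `and_samples` are hypothetical.

```python
def predict(weights, bias, inputs):
    """Output of a single threshold unit: 1 if the weighted sum is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_unit(samples, learning_rate=1.0, epochs=20):
    """Perceptron rule: compare output with the actual value, then adjust weights."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Move each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training sample: the logical AND of two binary inputs.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_unit(and_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(predictions)
```

A single unit like this can only learn linearly separable targets such as AND; multi-layer networks rely on gradient-based methods such as backpropagation instead.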

Many neural network algorithms are available for training …


Added by Sheetal Sharma on August 15, 2017 at 7:00pm — No Comments
