July 2017 Blog Posts (85)

Statistical Modeling; Selecting Predictors is a Challenge for Data Scientists

For statistical models, selecting predictors is what tests the mettle of data scientists. It is challenging to lay out the steps in advance because, at every step, they must evaluate the situation and decide on the next steps. It is a completely different story when running predictive models: if the relationship among the variables is not the main focus, things get easier. Data analysts can go ahead and run step-wise regression models, empowering the data to give best…

Added by Chirag Shivalker on July 31, 2017 at 10:30pm — No Comments

Limitations of Hadoop – How to overcome Hadoop drawbacks

Hadoop – Introduction & features

Let us start with what Hadoop is and which features make it so popular.

Hadoop is an open-source software framework for distributed storage and distributed processing of extremely large data sets. Important features of Hadoop are:

Hadoop is an open-source project, which means its code can be modified to suit business requirements.

In Hadoop, data is highly available and…

Added by Sheetal Sharma on July 31, 2017 at 7:30pm — No Comments

The Best Metric to Measure Accuracy of Classification Models

This article was written by Jacob Joseph

Unlike evaluating the accuracy of models that predict a continuous or discrete dependent variable, such as Linear Regression models, evaluating the accuracy of a classification model can be more complex and time-consuming. Before measuring the accuracy of classification models, an analyst would first…

Added by Amelia Matteson on July 31, 2017 at 9:00am — No Comments

Crime Analytics: Visualization of Crime Incident Reports for Summer 2014 in San Francisco and Seattle

  1. In this assignment, some exploratory analysis is done on the criminal incident data from Seattle and San Francisco to visualize patterns and to compare and contrast them across the two cities.
  2. Data used: The real crime dataset from Summer (June-Aug) 2014 for two US cities, Seattle and San Francisco, has been used for the analysis. The datasets used for…

Added by Sandipan Dey on July 31, 2017 at 4:00am — No Comments

Blockchain and Artificial Intelligence

Abstract – Blockchain is a mystery story or provides the foundation for cryptocurrencies like Bitcoin. What’s different about blockchains compared to traditional big-data distributed databases like MongoDB? It’s like featuring a product that contains small blocks of brain in the form of dust, but consider that the innovation efforts of several publicly traded asset managers and banks are also on this brain block dust quest. Computers start simulating the brain’s sensation,…

Added by Vinod Sharma on July 31, 2017 at 4:00am — No Comments

Big Data And Cloud Computing

1. Introduction to Big Data and Cloud Computing

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). It’s a virtualization framework.

It provides resources on demand, whether storage, computing, or something else. The cloud follows a pay-per-usage model: you pay for the amount of resources you use.

This computing service by cloud charges…

Added by Shreya Gupta on July 30, 2017 at 8:00pm — No Comments

Self-Service Data Analytics for Smaller Companies (SMEs)

Let’s start with the bottom line: there is no excuse for virtually any company today, regardless of size or manpower (and within reason), not to make data analytics a part of its normal business routines. Traditional objections such as cost, resources and expertise no longer cut the mustard. As many observers have noted, a company’s internally generated data is a key asset that needs to be leveraged in the same way as any other corporate asset if the…

Added by Gregory Thompson on July 30, 2017 at 4:30pm — No Comments

How Customer Analytics has evolved...

Customer analytics has been one of the hottest buzzwords for years. A few years back it was solely the marketing department’s domain, carried out with limited volumes of customer data stored in relational databases like Oracle or appliances like Teradata and Netezza. SAS and SPSS were the leaders in providing customer analytics, but it was restricted to segmenting customers who are likely to buy your products or services. In the 90’s came web…

Added by Sandeep Raut on July 29, 2017 at 7:30pm — No Comments

Recasting Java neural networks in Python

Many neural network applications implemented in Java, such as Neuroph, Encog and Joone, may look rather different when switching from the Java language to Python with the help of the DMelt computing environment. First of all, they look simpler. You can use your favorite Python tricks to load and display data. The Python code is easier to read and quicker to modify, and it does not require recompiling after each change. At the same time, the platform…

Added by jwork.ORG on July 29, 2017 at 1:00pm — No Comments

Some Analysis with Astronomy data (in Python)

Data-Driven Astronomy

The following problems appeared as assignments in the Coursera course Data-Driven Astronomy.

 …

Added by Sandipan Dey on July 29, 2017 at 12:00pm — No Comments

Weekly Digest, July 31

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Upcoming IoT Webinars

Added by Vincent Granville on July 29, 2017 at 11:00am — No Comments

Blockchain and Artificial Intelligence

Guest blog by Vinod Sharma.

Abstract – Blockchain is a mystery story or provides the foundation for cryptocurrencies like Bitcoin. What’s different about blockchains compared to traditional big-data distributed databases like MongoDB? It’s like featuring a product that contains small blocks of brain in the form of dust, but consider that the innovation efforts of several publicly traded asset managers and banks are also on this brain block dust quest. Computers…

Added by Vincent Granville on July 29, 2017 at 10:30am — No Comments

Image Analysis with Deep Learning

This article was contributed by Nikita Johnson.

The cost of large-scale data collection and annotation often makes the application of machine learning algorithms to…

Added by Emmanuelle Rieuf on July 28, 2017 at 6:00pm — No Comments

Recommendation System Algorithms

Today, many companies use big data to make highly relevant recommendations and grow revenue. Among a variety of recommendation algorithms, data scientists need to choose the best one according to a business’s limitations and requirements.

To simplify this task, my team has prepared an overview of the main existing recommendation system…

Added by Luba Belokon on July 28, 2017 at 4:00am — No Comments

Types of Machine Learning Algorithms in One Picture

Here is an interesting visualization of machine learning algorithms:

Originally posted here.  Also check out the following great visual summaries:…

Added by Vincent Granville on July 27, 2017 at 11:51am — 3 Comments

Thursday News: AI, NLP, Deep Learning, IoT, ML, Applications

Here is our selection of featured articles and resources posted since Monday:

Added by Vincent Granville on July 27, 2017 at 7:30am — No Comments

Ubuntu on AWS gets serious performance boost with AWS-tuned kernel

Canonical and Amazon Web Services have been working closely together to create the best experience of the world’s most popular cloud OS, on the world’s most popular public cloud. Official Ubuntu guest images have been available on AWS for years, and underlie the majority of workloads on the service—whether you use the EC2 Quickstart, Marketplace, or Lightsail. This week, and for the first time on the public cloud, Canonical, in collaboration with Amazon, is delighted to…

Added by Venkatesan M on July 26, 2017 at 8:30pm — No Comments

A Self-Study List for Data Engineers and Aspiring Data Architects

This article was written by John Hammink. John Hammink is an American engineer, musician, artist and linguist, with his own entry in Wikipedia. …

Added by Amelia Matteson on July 26, 2017 at 2:00pm — No Comments

12 Great Blogs Posted in the last 12 Months

This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The previous digest in this series was posted here a while back.

12 Great Blogs Posted in the last 12…

Added by Vincent Granville on July 26, 2017 at 12:00pm — No Comments

Automated Machine Learning for Professionals

Summary: A variety of new Automated Machine Learning (AML) platforms are emerging, which led us recently to ask if we’d be automated and unemployed any time soon. In this article we’ll cover the “Professional AML tools”. They require that you be fluent in R or Python, which means that Citizen Data Scientists won’t be using them. They also significantly enhance productivity and reduce the redundant and tedious work that’s part of model…

Added by William Vorhies on July 25, 2017 at 1:36pm — No Comments

© 2017 Data Science Central