Featured Blog Posts – November 2016 Archive (85)

Analysis of 2 Million Hijacked Passwords (in Python)

Posted by Jianhua Li on GitHub. This was proposed as a data science project on Data Science Central, to challenge your data science skills on a real data set. Below is an overview. 

Basically one should try to answer the following three questions:

  • What are the most common patterns found in passwords?
  • Based on these…
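As a rough illustration of the first question, here is a minimal Python sketch that reduces each password to a character-class pattern and counts the most frequent ones. The function names, the class labels (`L`, `U`, `D`, `S`), and the sample passwords are all assumptions made for this example; they are not taken from the original project or its data set.

```python
from collections import Counter
from itertools import groupby


def char_class(ch: str) -> str:
    """Map one character to a coarse class label (hypothetical labels)."""
    if ch.isdigit():
        return "D"
    if ch.islower():
        return "L"
    if ch.isupper():
        return "U"
    return "S"  # anything else: symbols, whitespace, and so on


def pattern(password: str) -> str:
    """Run-length encode the character classes, e.g. 'abc12' -> 'L3D2'."""
    classes = (char_class(ch) for ch in password)
    return "".join(f"{label}{len(list(run))}" for label, run in groupby(classes))


def most_common_patterns(passwords, n=3):
    """Return the n most frequent patterns with their counts."""
    return Counter(pattern(p) for p in passwords).most_common(n)


# Made-up sample for illustration only; not from the real data set.
sample = ["password1", "letmein", "Summer2016", "qwerty", "dragon99", "abc123!"]
for pat, count in most_common_patterns(sample):
    print(pat, count)
```

Grouping by pattern rather than by exact string is one possible design choice: it surfaces structural habits (letters followed by a few digits, for instance) that raw frequency counts would miss.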

Added by Emmanuelle Rieuf on November 30, 2016 at 7:00pm — 1 Comment

Solving the Data Science Mystery

Data Science has become an inevitable charter in our everyday lives where every action of ours is measured, plotted, classified and logged. We leave traces of who we are while driving a car, when visiting a place, after watching a movie or shopping for what we want. These traces of data captured…


Added by Prakash Pasupathy on November 30, 2016 at 11:00am — No Comments

R for SQListas (2): Forecasting the Future

R for SQListas, part 2

Welcome to part 2 of my “R for SQListas” series. Last time, it was all about how to get started with R if you’re a SQL girl (or guy), and that basically meant an introduction to Hadley Wickham’s dplyr and the tidyverse. The logic being: Don’t fear, it’s not that different from what…


Added by Sigrid Keydana on November 29, 2016 at 11:30am — No Comments

Has AI Gone Too Far? - Automated Inference of Criminality Using Face Images

Summary: This new study claims to be able to identify criminals based on their facial characteristics. Even if the data science is good, has AI pushed too far into areas of societal taboos? This isn’t the first time data science has been restricted in favor of social goals, but this study may be a trip wire that starts a long and difficult discussion about the role of AI.


Has AI gone too far? This might seem like a nonsensical question to data…


Added by William Vorhies on November 29, 2016 at 10:28am — No Comments

Why so many Machine Learning Implementations Fail?

A recent article in TechCrunch describes Twitter and Facebook issues: algorithms unable to detect fake news or hate speech. I wrote about how machine learning could be improved, and what can make implementations under-perform -…


Added by Vincent Granville on November 28, 2016 at 7:30pm — No Comments

Difference Between Data Scientists, Data Engineers, and Software Engineers - According To LinkedIn

This article was posted by Ryan Swanstrom on Data Science 101. Ryan is helping the world learn data science at Microsoft.

The differences between Data Scientists, Data Engineers, and Software Engineers can get a little confusing at times. Thus, here is a guest post provided by Jake Stein, CEO at Stitch (formerly RJ Metrics), which aims to clear up some of that confusion based upon LinkedIn data.

As data grows, so does the expertise needed to manage it. The past few years…


Added by Emmanuelle Rieuf on November 28, 2016 at 6:00pm — No Comments

Your CRM data should reveal your future success (or demise)

Guest blog by Chris Rigatuso. Chris is Founder and Board Member at the Skyfollow Consulting Group. He earned his MBA from the Haas School of Business (UC Berkeley), and lives in the Bay Area. 

Your CRM data, is it the stairway to Heaven or…


Added by Emmanuelle Rieuf on November 28, 2016 at 5:30pm — No Comments

Machine learning as a service? Might lose sleep over this!

This post is 'not' intended to teach people how to use popular predictive modelling APIs for free, although, to your surprise, this isn't a far-fetched possibility. A trained machine learning model is basically a function that maps feature vectors to the output variable. Upon querying with a test instance, the model predicts an outcome, assigning…
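The idea of a trained model as a function from feature vectors to an output can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear form, the weights, the bias, and the name `make_linear_model` are hypothetical and do not come from the post or from any particular prediction API.

```python
from typing import Callable, Sequence


def make_linear_model(
    weights: Sequence[float], bias: float
) -> Callable[[Sequence[float]], int]:
    """Return a 'trained' binary classifier as a plain function.

    The weights and bias stand in for whatever training produced;
    here they are hypothetical values chosen for illustration.
    """

    def predict(features: Sequence[float]) -> int:
        if len(features) != len(weights):
            raise ValueError("feature vector has the wrong length")
        # Weighted sum of the features plus a bias term.
        score = sum(w * x for w, x in zip(weights, features)) + bias
        # Threshold the score to get a class label.
        return 1 if score >= 0 else 0

    return predict


# Querying the model with test instances.
model = make_linear_model(weights=[0.5, -1.0], bias=0.25)
print(model([1.0, 0.5]))  # score = 0.25 -> class 1
print(model([0.0, 1.0]))  # score = -0.75 -> class 0
```

From the caller's side, only inputs and predicted outputs are visible, which is the sense in which the model behaves like an ordinary function.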


Added by Ashish kumar on November 28, 2016 at 5:00pm — No Comments

13 Great Blogs Posted in the last 12 Months

This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The previous digest in this series was posted here a while back.

13 Great Blogs Posted in the last 12 Months…


Added by Vincent Granville on November 28, 2016 at 2:00pm — No Comments

Why Oxytocin, Dopamine & Adrenalin are key to creating engaging Data Products?

Human behaviors, rituals & habits are the outcome of a complex interplay of the environment and experiences they have been exposed to. These definitely play a big role in shaping our product interaction experience. All of us have intuitively understood the importance of "cognitive resonance" in the first 8 seconds we interact with a product and how that experience has subsequently shaped our outlook toward that product. As…


Added by derick.jose on November 27, 2016 at 10:00pm — 1 Comment

Product recommendations in Digital Age

By 1994 the web had come to our doors, bringing the power of the online world to our doorsteps. Suddenly there was a way to buy things directly and efficiently online.

Then came eBay and Amazon in 1995. Amazon started as a bookstore and eBay as a marketplace for the sale of goods.

Since then, as the digital tsunami flooded in, there are tons of websites selling…


Added by Sandeep Raut on November 27, 2016 at 2:00pm — No Comments

Weekly Digest, November 28

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions


Added by Vincent Granville on November 26, 2016 at 3:30pm — No Comments

Who Made the News? Text Analysis using R, in 7 steps

This post covers the following tasks using R programming:

  • cleans the texts,
  • sorts and aggregates by publisher names
  • creates word clouds and word…
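The original post works in R; the sketch below shows the same kind of pipeline (clean the text, aggregate by publisher, count words as input for a word cloud) in Python instead, purely as an illustration. The function names, the cleaning rule, and the sample rows are assumptions for this example and are not drawn from the post or its data.

```python
import re
from collections import Counter, defaultdict


def clean_text(text: str) -> list[str]:
    """Lowercase the text, replace non-letters with spaces, split into words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()


def word_counts_by_publisher(rows):
    """Aggregate word frequencies per publisher, sorted by publisher name.

    `rows` is an iterable of (publisher, headline) pairs.
    """
    counts = defaultdict(Counter)
    for publisher, headline in rows:
        counts[publisher].update(clean_text(headline))
    return dict(sorted(counts.items()))


# Made-up rows for illustration only.
rows = [
    ("Daily Example", "Markets rally as rates hold"),
    ("Another Wire", "Rates hold; markets calm"),
    ("Daily Example", "Rates: what comes next?"),
]

for publisher, counter in word_counts_by_publisher(rows).items():
    print(publisher, counter.most_common(3))
```

The per-publisher frequency tables are the kind of input a word cloud library would typically consume; the rendering step itself is left out here.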

Added by Ann Rajaram on November 26, 2016 at 3:30am — 5 Comments

Salary history and career path of a data scientist

While it is easy to find salary surveys for data scientists and related professions both at the junior and senior level, broken down per location and skill set, very few analyses show salary progress over the course of a 25-year career.…


Added by Vincent Granville on November 25, 2016 at 11:00pm — 29 Comments

Three Original Math and Proba Challenges, with Tutorial

Here I offer a few off-the-beaten-path interesting problems that you won't find in textbooks, data science camps, or in college classes. These problems range from applied math to statistics and computer science, and are aimed at getting the novice interested in a few core subjects that most data scientists master. The problems are described in simple English and don't require math / stats / probability knowledge beyond high school level. My goal is to attract people interested in data…


Added by Vincent Granville on November 24, 2016 at 10:30pm — No Comments

Machine Learning Models Predicting Dangerous Seismic Events

This article was written by Michal Tadeusiak.


Underground mining poses a number of threats including fires, methane outbreaks or seismic tremors and bumps. An automatic system for…


Added by Emmanuelle Rieuf on November 23, 2016 at 7:00pm — No Comments

Python, Machine Learning, and Language Wars. A Highly Subjective Point of View

Guest blog by Sebastian Raschka. Sebastian Raschka is the author of the bestselling book “Python Machine Learning.” As a Ph.D. candidate at Michigan State University, he is developing new computational methods in the field of computational biology. Sebastian has many years of experience with coding in…


Added by Vincent Granville on November 23, 2016 at 10:00am — 5 Comments

3 challenges when migrating to cloud-based project management software

If you were to ask people whose work revolves around business technology, they would probably describe this as the Times of the Great Migration. The migration in question is the one from the traditional desktop business software solutions to those that are cloud-based or provided as a service.



Added by Nate Vickery on November 23, 2016 at 12:00am — No Comments

The Experience of Being a High-Performing Data Scientist

Guest blog by James Kobielus. James is IBM's Big Data Evangelist. He is an industry veteran who spearheads IBM's thought leadership activities in big data, data science, enterprise data warehousing, advanced analytics, Hadoop, business intelligence, data management, and next best action technologies. Prior to joining IBM, he was a leading industry analyst, with firms including Forrester Research,…


Added by Vincent Granville on November 22, 2016 at 10:30am — No Comments

How to Set Up Data Science?

[This is a cross-post from d4t4science.com]

1 Introduction

Setting up data science in corporate environments is a challenging task. Many companies struggle or even fail at…


Added by Philipp Diesinger on November 22, 2016 at 7:30am — 1 Comment
