
All Blog Posts (1,162)

Weekly Digest, October 6

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 



Added by Vincent Granville on October 1, 2014 at 3:00pm — No Comments

The end of the Data Scientist Bubble

This was the subject of a provocative article posted on Oracle's blog two days ago. It certainly shows how far from reality some big companies are. They confuse people who call themselves data scientists (or get assigned that job title) with those who are true data scientists and might use a different job title. Many times, the issue is internal politics that create the…


Added by Mirko Krivanek on October 1, 2014 at 8:00am — 4 Comments

10 most popular data science presentations on Slideshare

These presentations have been viewed between 14,000 times (for #10) and 75,000 times (for #1 - see below), though pageview numbers are subject to manipulation (web robots, etc.).

We will publish the "top 10 blogs on DataScienceCentral" (DSC), with a timestamp attached to each entry: older articles obviously have more pageviews than newer ones (assuming the popularity is identical), and some articles get more than 50% of their traffic more than 3 months after being published. Indeed, it's a…


Added by Amy on September 30, 2014 at 12:00pm — 1 Comment

The 22 Skills of a Data Scientist

There have been a number of interesting articles recently discussing the skills a data scientist should or might have. The one entitled The 22 Skills of a Data Scientist is a popular one (see the 22 skills listed below, or click on the link to read the full article). Earlier this morning, I read another one on LinkedIn: …


Added by Vincent Granville on September 29, 2014 at 1:00pm — 5 Comments

Elements of machine learning

The official title of this free book, available in PDF format, is Machine Learning Cheat Sheet. But it's more about elements of machine learning, with a strong emphasis on classic statistical modeling, and rather theoretical: perhaps something like a fairly comprehensive handbook on the theoretical foundations of statistical science. Anyway, it's very interesting, and it's free. See the table of contents screenshot below. …


Added by Marcel Remon on September 29, 2014 at 9:30am — No Comments

More about Shifting Culture, Less about Investing in Potential

Data science is often brought to companies as a potential game changer: an investment that may pay off if the company's data can be leveraged to provide insight and gain a competitive edge. But bringing analytical offerings to organizations as a "maybe solution" to their pain points misses the mark. Data science is today's answer to our most pressing enterprise and socially innovative challenges, given the data-driven nature of our markets and society as a whole. If an investment in data…


Added by Sean McClure on September 29, 2014 at 9:03am — No Comments

New Beginnings in Facial Recognition

As humans, we navigate our lives largely by the recognition of patterns. These patterns include the sound of a mother’s voice, the appearance of a dangerous animal or poisonous food, the familiarity of kin, and the attraction to potential mates. Accurate pattern recognition is key to an animal’s survival and progress, and has allowed humans to become the socially complex and advanced species we are today. 

It should come as no surprise that…


Added by Sean McClure on September 29, 2014 at 9:01am — No Comments

Keeping Corporate Data Safe: 5 Ways Lax BYOD Policies Create Security Risks

The proliferation of smartphones, tablets, and other mobile devices — here come the “wearables” — has opened up new opportunities for businesses to leverage employee-owned technology for competitive advantage. That being said, the use of such devices in the workplace can compromise sensitive data, especially when comprehensive BYOD policies are not implemented and…


Added by Beau Winchester on September 28, 2014 at 11:00am — No Comments

Apache Spark: distributed data processing faster than Hadoop

This post is adapted from DataScience Hacks by the author himself.

Apache Spark is another Apache-licensed top-level project that can perform large-scale data processing much faster than Hadoop (I am referring to MR1.0 here). This is possible thanks to the Resilient Distributed Datasets (RDD) concept behind this fast data processing. An RDD is basically a collection of objects,…


Added by Pavan Kumar on September 28, 2014 at 7:00am — No Comments

Data Instrumentalism

Being the son of a mechanic, I have spent many years handling power tools. I'm especially fond of a couple of hammer-drills in my possession. They can effortlessly drill holes through concrete. At least, this is what my father once claimed. He handed down his most treasured tools to me. I'm big on pliers and screwdrivers. This might be due to my vocational training as a technician. Even today - long after I completed my diploma and continued to further my education - I still carry a licence…


Added by Don Philip Faithful on September 27, 2014 at 7:39am — No Comments

Great list of resources: data science, visualization, machine learning, big data

Fantastic resource created by Andrea Motosi. I've only included the 5 categories that are the most relevant to our audience, though it has 31 categories in total, including a few on distributed systems and Hadoop. Click here to view the 31 categories. You might also want to check out our internal resources (the first section below).…


Added by Amy on September 26, 2014 at 10:00am — 1 Comment

Decipher Neo4J Cypher Query Language (CQL)

This blog post is a follow-up to Embrace Relationships with Neo4J, R & Java.

Neo4j Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store. Cypher is a relatively simple but still very powerful language. Very complicated database queries can easily be expressed through Cypher. This allows…


Added by Raghavan Madabusi on September 26, 2014 at 2:17am — No Comments

Picks of the week: 33 great resources and articles found on the web

Starred articles were potential candidates for the picture of the week. Enjoy our new selection of articles and resources (R, data science, Python, machine learning etc.)


  1. Enhancing R with Advanced Compilation Tools and Methods (PDF)
  2. A new open-source package for…

Added by Amy on September 25, 2014 at 6:00pm — No Comments

100+ leading blogs for statisticians and like-minded professionals

The blogs below have their RSS feeds integrated in the meta-blog. Anyone can submit their blog: we submitted ours, but haven't heard anything back. We are building a similar project of our own (we'll keep you posted); if you are interested in this project, let us know. Anyway, statsblogs clearly has some selection mechanism, and the blogs that they accepted are really worth reading, especially if you are a…


Added by Amy on September 25, 2014 at 5:00pm — No Comments

How Tracking Analytics Can Improve Content Marketing

Inbound and content marketing are not going anywhere anytime soon. The content marketing association reports that over 90% of both enterprise B2B and B2C companies are using the tactic. There are a million different ways to leverage content strategy, and here at TechnologyAdvice, we’ve experimented with plenty of them. It’s been a fun, albeit educational, experience to say the…


Added by Keith Cawley on September 25, 2014 at 4:59am — No Comments

Weekly Digest - September 29

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 



Added by Vincent Granville on September 24, 2014 at 5:30pm — No Comments

43 Data Science Thought Leaders, According to Berkeley University

This is not a comprehensive list. Their data science lab was conducting a survey about defining big data. They asked many leading practitioners to provide a definition earlier this month; below are those who accepted or found the time to respond. The order in the Berkeley list below is alphabetical.



Added by Amy on September 24, 2014 at 10:00am — No Comments

50 Data Science and Statistics Blogs Worth Reading

This is a selection of great websites (blogs) created by statisticians, data scientists, machine learning and other analytic professionals, though the emphasis is more on statistics than data science. I would add the following ones:


Added by Amy on September 24, 2014 at 9:00am — 1 Comment

20 Big Data Repositories You Should Check Out

This is an interesting listing created by Bernard Marr. I would add the following great sources:


Added by Amy on September 24, 2014 at 8:00am — No Comments


© 2014   Data Science Central
