All Blog Posts (8,023)

50+ Free Data Science Books

Very interesting compilation published here, with a strong machine learning flavor (maybe machine learning book authors - usually academics - are more prone to making their books available for free). Many are O'Reilly books freely available. Here we display those most relevant to data science. I haven't checked all the sources, but they seem legit. If you find some issue, let us know in the…


Added by L.V. on September 19, 2015 at 9:00am — 5 Comments

Book: Creating Value with Big Data Analytics - Making Smart Marketing Decisions

Companies around the world are struggling with a vast amount of data, and can’t make sense of it all. Big Data has the promise of providing firms with significant new information about their markets, their products, their brands, and their customers – but currently, there’s often a great divide between big data and truly usable insights that create value for…


Added by Natasha Walk on September 18, 2015 at 12:00am — 1 Comment

The Real Reason for House Price Inflation in New Zealand

Simon Knudsen - Sept 2015

Population growth, shortages in housing supply, internal migration, immigration, cheap money, and foreign investors are just a few of the claimed causes of House Price Inflation (HPI) in New Zealand in recent years. The notorious example of HPI in action is NZ's largest city - Auckland. The city has experienced double-digit HPI of late…


Added by Simon Knudsen on September 17, 2015 at 5:13pm — 2 Comments

How Shell Uses Analytics To Drive Business Success

The oil and gas industries are facing major challenges - the costs of extraction are rising and the turbulent state of international politics adds to the difficulties of exploration and drilling for new reserves.

In the face of these problems, the industry's key players are turning to Big Data in the hope of finding innovative solutions.

Big Data is the name used to describe the theory and practice of applying advanced computer analysis to the ever-growing amount of…


Added by Bernard Marr on September 17, 2015 at 12:00pm — 1 Comment

How to Choose Between Learning Python or R First

If you're starting out in Data Science, this is a good question to ask yourself.  After all, you want to be immediately employable and also efficient with your own time.

Cheng Hang Lee took on this question in an article of the same name earlier this year, with a fairly comprehensive discussion of the pros and cons.  Some highlights:

The Case for R

R has a long and trusted history and a robust supporting community in the data…


Added by William Vorhies on September 17, 2015 at 9:44am — 2 Comments

An introduction to Apache Drill and why it is useful

With the rapid growth of data and the shift towards rapid development, much data is now stored in non-relational systems such as Hadoop and MongoDB. The infrastructure built on relational databases that has been used for decades cannot keep up with the volume and scope of data being captured. At the same time, SQL remains a very effective and very widely used method for extracting and analysing data.  In short it will not be replaced by hierarchical query techniques…


Added by Zygimantas Jacikevicius on September 17, 2015 at 6:06am — No Comments

No-cost training to become a data scientist

Statistical analysis and data mining were the top skills that got people hired in 2014 based on LinkedIn analysis of 330 million LinkedIn member profiles. We live in an increasingly data-driven world, and businesses are aggressively hiring experts in data storage, retrieval, and analysis. Across the globe, statistics and data analysis skills were highly valued. In the US, India, and France, those skills are in particularly high demand.

What is data science?

Data scientist…


Added by Marina Mitrashov on September 17, 2015 at 3:30am — 1 Comment

15 Books every Data Scientist Should Read

With all this talk of terabytes and petabytes of digital information zipping around the world at the speed of light, it’s sometimes easy to forget about the humble book!

After all, pretty much everything you could ever practically need to know is probably available on a blog, Google Hangout or SlideShare presentation somewhere.

But to many of us, books are special – and whether you are so attached to the feel of turning paper pages between your fingers that you would never…


Added by Bernard Marr on September 16, 2015 at 5:00pm — 4 Comments

All Businesses Are Data Businesses

By Brad Kolarov, co-founder and Managing Partner at B23, a boutique Big Data and Cloud Computing software development and implementation company


Have you seen the Audi commercial with the autonomous mail delivery drones stalking the employees leaving their office? If not, it's below.

It’s a bit of a stretch, but the commercial is pretty funny and a good example of the technological advances that are on the horizon.  To manage a fleet of autonomous mail delivery…


Added by Brad Kolarov on September 16, 2015 at 10:00am — 1 Comment

Weekly Digest, September 21

The weekly digest now has 6 sections: (1) Featured Articles and Case Studies, (2) Featured Resources and Technical Contributions, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.…


Added by Vincent Granville on September 16, 2015 at 10:00am — No Comments

10 tools and platforms for data preparation

Traditional approaches to enterprise reporting, analysis and Business Intelligence, such as Data Warehousing, upfront modelling and ETL, have given way to new, more agile tools and ideas. Within this landscape, Data Preparation tools have become very popular, and for good reason.  Data preparation has traditionally been a very manual task that consumed the bulk of most data projects' time.  Profiling data, standardising it and transforming it has traditionally been very manual and error…


Added by Zygimantas Jacikevicius on September 16, 2015 at 3:00am — 6 Comments

Why Bad Data is Wasting Your Marketing Efforts

The amount of money spent on marketing is growing, and the way we spend is changing. The statistics that prove this are plentiful, and becoming more convincing as the years go by.

Here are two very compelling examples:

  • By 2016, we expect the amount of money spent on digital marketing to consume 35% of total marketing budgets, …

Added by Martin Doyle on September 16, 2015 at 12:19am — No Comments

What is the most used feature in any business intelligence solution? It's the Export to Excel button.

I know this is an old joke in the BI community but I couldn't resist.  I was recently forwarded an article on the continued popularity of Excel in the BI community consisting of quotes from 27 experts saying how great and how relevant Excel remains.

We do categorize BI as static and historical as opposed to forward looking predictive analytics but I bet it's still true that Excel is a very widely used tool even by folks that…


Added by William Vorhies on September 15, 2015 at 8:51am — 3 Comments

Baby Steps

I need to start somewhere in my goal to become a Mad Data Scientist.  One of the first places I checked was Coursera.org.

For those of you who are not familiar with it, Coursera is a for-profit educational technology company that offers massive open online courses (MOOCs).  Coursera works with universities to make some of their courses available online, and offers courses in physics, engineering, humanities, medicine, biology, social…


Added by Jerry Smith on September 15, 2015 at 6:30am — 2 Comments

Applying Data Exploration & Discovery techniques to BCBS 239

Time is fast running out for G-SIBs (and indeed D-SIBs) to demonstrate compliance with the principles of BCBS 239.  Many surveys have been conducted by firms such as EY, McKinsey and Deloitte, none of which paints a particularly pretty picture in terms of readiness. Most suggest that the majority of banks will only be able to demonstrate compliance with between 25% and 60% of the listed principles by the January 2016 deadline. Why is this?

At a recent industry event discussing…


Added by James Phare on September 15, 2015 at 5:22am — No Comments

Analyse TB data using network analysis


In a very interesting publication from Jose A. Dianes on tuberculosis (TB) cases per country it was shown that dimension reduction is achieved using Principal Component Analysis (PCA) and Cluster Analysis (…


Added by Tim Groot on September 15, 2015 at 4:00am — 1 Comment

7 Questions Every Data Scientist Should Be Answering for Business



Business professionals of all levels have asked me over the years what it is that they should know that their Data Science departments may not be telling them. To be candid, many Data Scientists operate in fear wondering what they should be doing as it relates to the business. In my judgment, the questions below address both…


Added by Damian Mingle on September 15, 2015 at 4:00am — 3 Comments

How Dark is your Data?

Are you running from one analysis to another? From one data visualization project, data modeling exercise, dashboard development, data quality analysis to another because there is high demand for your skills?

How big is your personal folder? Is it littered with spreadsheets, BI workbooks, and other scripts that were used for a one time…


Added by Isaac Sacolick on September 15, 2015 at 3:02am — 4 Comments

Basics of HDFS Architecture in Hadoop

The HDFS design is based on the design of GFS (Google File System). Its implementation addresses a number of problems that are present in other distributed filesystems such as Network File System (NFS). Specifically, the implementation of HDFS addresses the following:

 ➤ To store a very large amount of data (petabytes), HDFS is designed to spread the data across a large number of machines, and to support much larger file sizes than distributed filesystems like…


Added by Priyanka Jain on September 15, 2015 at 2:08am — 1 Comment

When enough is enough: using data to understand how much we really spend with Medicare

I picked this topic for two reasons. First, I recently found out that CMS released their Part D Prescriber database for 2013. This dataset basically reports information on the drugs that individual physicians prescribed in 2013 under the Medicare Part D Program. Second, while talking to my colleagues at work I realized how much they wanted…


Added by Tatiana Sorokina on September 14, 2015 at 2:30pm — 2 Comments
