Very interesting compilation published here, with a strong machine learning flavor (maybe machine learning book authors - usually academics - are more prone to making their books available for free). Many are O'Reilly books freely available. Here we display those most relevant to data science. I haven't checked all the sources, but they seem legit. If you find some issue, let us know in the…Continue
Companies around the world are struggling with a vast amount of data, and can’t make sense of it all. Big Data has the promise of providing firms with significant new information about their markets, their products, their brands, and their customers – but currently, there’s often a great divide between big data and truly usable insights that create value for…Continue
Simon Knudsen - Sept 2015
Population growth, shortages in housing supply, internal migration, immigration, cheap money, and foreign investors are just a few of the claimed causes of House Price Inflation (HPI) in New Zealand in recent years. The notorious example of HPI in action is NZ's largest city - Auckland. The city has experienced double-digit HPI of late…Continue
The oil and gas industries are facing major challenges - the costs of extraction are rising and the turbulent state of international politics adds to the difficulties of exploration and drilling for new reserves.
In the face of big problems, its key players are turning to Big Data in the hope of finding innovative solutions to these pressing issues.
Big Data is the name used to describe the theory and practice of applying advanced computer analysis to the ever-growing amount of…Continue
If you're starting out in Data Science, this is a good question to ask yourself. After all, you want to be immediately employable and also be efficient with your own time.
Cheng Hang Lee took on this question in an article by this same name earlier this year and has a fairly comprehensive discussion of the pros and cons. Some highlights:
R has a long and trusted history and a robust supporting community in the data…Continue
With the rapid growth of data and the shift towards rapid development solutions, much data is being stored in NoSQL stores such as Hadoop and MongoDB. The infrastructure built upon relational databases that have been used for decades cannot keep up with the volume and scope of data being captured. At the same time, SQL is a very good and very widely used method for extracting and analysing data. In short it will not be replaced by hierarchical query techniques…Continue
Added by Zygimantas Jacikevicius on September 17, 2015 at 6:06am — No Comments
Statistical analysis and data mining were the top skills that got people hired in 2014 based on LinkedIn analysis of 330 million LinkedIn member profiles. We live in an increasingly data-driven world, and businesses are aggressively hiring experts in data storage, retrieval, and analysis. Across the globe, statistics and data analysis skills were highly valued. In the US, India, and France, those skills are in particularly high demand.
With all this talk of terabytes and petabytes of digital information zipping around the world at the speed of light, it’s sometimes easy to forget about the humble book!
After all, pretty much everything you could ever practically need to know is probably conveniently available on a blog, Google Hangout or SlideShare presentation somewhere.
But to many of us, books are special – and whether you are so attached to the feel of turning paper pages between your fingers that you would never…Continue
By Brad Kolarov, co-founder and Managing Partner at B23, a boutique Big Data and Cloud Computing software development and implementation company
Have you seen the Audi commercial with the autonomous mail delivery drones stalking the employees leaving their office? If not, it's below.
It’s a bit of a stretch, but the commercial is pretty funny and a good example of the technological advances that are on the horizon. To manage a fleet of autonomous mail delivery…Continue
The weekly digest now has 6 sections: (1) Featured Articles and Case Studies, (2) Featured Resources and Technical Contributions, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.
The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.…Continue
Added by Vincent Granville on September 16, 2015 at 10:00am — No Comments
Traditional approaches to enterprise reporting, analysis and Business Intelligence such as Data Warehousing, upfront modelling and ETL have given way to new, more agile tools and ideas. Within this landscape, Data Preparation tools have become very popular, and for good reason. Data preparation has traditionally been a very manual task and has consumed the bulk of most data projects' time. Profiling data, standardising it and transforming it has traditionally been very manual and error…Continue
Here are two very compelling examples:
Added by Martin Doyle on September 16, 2015 at 12:19am — No Comments
I know this is an old joke in the BI community but I couldn't resist. I was recently forwarded an article on the continued popularity of Excel in the BI community consisting of quotes from 27 experts saying how great and how relevant Excel remains.
We do categorize BI as static and historical as opposed to forward looking predictive analytics but I bet it's still true that Excel is a very widely used tool even by folks that…Continue
I need to start somewhere in my goal to become a Mad Data Scientist. One of the first places I checked was Coursera.org.
For those of you who are not familiar with it, Coursera is a for-profit educational technology company that offers massive open online courses (MOOCs). Coursera works with universities to make some of their courses available online, and offers courses in physics, engineering, humanities, medicine, biology, social…Continue
Time is fast running out for G-SIBs (and indeed D-SIBs) to demonstrate compliance with the principles of BCBS 239. Many surveys have been conducted by firms such as EY, McKinsey and Deloitte – none of which paint a particularly pretty picture in terms of readiness. Most suggest that the majority of banks will only be able to demonstrate compliance with between 25% and 60% of the listed principles by the January 2016 deadline. Why is this?
At a recent industry event discussing…Continue
Added by James Phare on September 15, 2015 at 5:22am — No Comments
Analyse TB data using network analysis
In a very interesting publication from Jose A. Dianes on tuberculosis (TB) cases per country it was shown that dimension reduction is achieved using Principal Component Analysis (PCA) and Cluster Analysis (…Continue
Business professionals of all levels have asked me over the years what it is that they should know that their Data Science departments may not be telling them. To be candid, many Data Scientists operate in fear wondering what they should be doing as it relates to the business. In my judgment, the questions below address both…Continue
Are you running from one analysis to another? From one data visualization project, data modeling exercise, dashboard development, data quality analysis to another because there is high demand for your skills?
How big is your personal folder? Is it littered with spreadsheets, BI workbooks, and other scripts that were used for a one time…Continue
The HDFS design is based closely on the design of GFS (the Google File System). Its implementation addresses a number of problems that are present in other distributed filesystems such as the Network File System (NFS). Specifically, the implementation of HDFS addresses the following:
➤ To store a very large amount of data (petabytes), HDFS is designed to spread the data across a large number of systems, and to support much larger file sizes than distributed filesystems like…Continue
I picked this topic for two reasons. First, I recently found out that CMS released their Part D Prescriber database for 2013. This dataset basically reports information on the drugs that individual physicians prescribed in 2013 under the Medicare Part D Program. Second, while talking to my colleagues at work I realized how much they wanted…Continue