The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.
Added by Vincent Granville on September 9, 2015 at 9:30am — No Comments
Does this sound familiar?
Your organization is ready to develop increasingly sophisticated analytics, but finds it difficult to get its data all in one place in a usable form. It is probably not a stretch to imagine that the data is of poor quality, incomplete or filled with errors. Also, there may be a reluctance to share data across the organization. The data may not be collected in a consistent way, or it might be locked in rigid data silos that are…Continue
Added by William Vorhies on September 9, 2015 at 7:20am — No Comments
Forbes magazine has been publishing its list of The World's Most Powerful People since 2009. The number of people on the list is proportional to the global population, at a ratio of one slot for every 100 million people on Earth. When the list started in 2009, there were 67 people on it, and the latest list, from 2014, had 72. According for…Continue
As the amount of data grows exponentially and terms such as big data become mainstream, the need to handle all this data has increased. Database skills have become some of the most sought-after in the job market today. SQL, which stands for Structured Query Language and is used for relational databases (RDBMS), is one of those skills. In fact,…Continue
Added by Divya Parmar on September 8, 2015 at 9:26am — No Comments
When you design a dashboard, your users are naturally going to be excited about all the things they can do with the new information. However, as soon as they answer the questions that motivated the creation of the system, they will think of new questions that they want to answer, often by going even deeper into the data. It's like they're following the yarn that you've given them to play with, and are…Continue
Added by Matt Ritter on September 8, 2015 at 5:32am — No Comments
So many data scientists select an analytic technique in hopes of achieving a magical solution, but in the end, the solution simply may not even be possible due to other limiting factors. It is important for organizations working with analytic capabilities to understand the various constraints of implementation most real-world applications…Continue
Added by Damian Mingle on September 7, 2015 at 4:00am — No Comments
Sentiment analysis is hard. Most of the systems on the market will clock in at around 55-65% accuracy on unseen data, even though they might be 85%+ accurate in their cross-validations.
A couple of reasons why creating a generic sentiment analyser is tough:
- There is too much variation in texts across domains, so the same words carry different meanings
- Sarcasm and phrase combinations are hard to identify: 'not bad' is not equal to 'not' AND 'bad'
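The 'not bad' point can be made concrete with a small sketch. The lexicon, its scores, the negator list and both function names below are hypothetical, chosen only for illustration; they are not taken from the article or from any real sentiment library. The first scorer treats the text as a bag of independent words, the second flips the polarity of a word that directly follows a negator.

```python
# Hypothetical toy lexicon: words and scores are illustrative assumptions only.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}
# Hypothetical negator list, also an assumption.
NEGATORS = {"not", "never", "no"}


def bag_of_words_score(text):
    """Sum per-word scores, ignoring word order and negation."""
    return sum(LEXICON.get(token, 0) for token in text.lower().split())


def negation_aware_score(text):
    """Flip the score of a word that directly follows a negator."""
    score = 0
    negate = False
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate else value
        negate = False  # negation only reaches the very next token
    return score


print(bag_of_words_score("not bad"))    # 'not' is ignored, 'bad' counts as negative
print(negation_aware_score("not bad"))  # 'bad' is flipped by the preceding 'not'
```

This only handles a negator immediately before a sentiment word. Longer-range negation, domain-specific meanings and sarcasm are not covered, which is part of why generic analysers struggle on unseen data.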
Despite years of criticism and negative publicity, hedge funds have evolved into higher-return generating machines, thanks to all those amazingly weird hedge fund strategies. If you look at the overall picture, you will find that hedge funds have now become a part of Wall Street’s ecosystem.
Hedge fund strategies, and hedge funds themselves, have made headlines over the years for various reasons. You will be awestruck when you find out what kinds of perks are given by…Continue
Added by rajesh dhnashire on September 6, 2015 at 9:00pm — No Comments
Here's one of the main differences between data engineering and data science: ETL (Extract / Transform / Load) is for data engineers, or sometimes data architects or DBAs.
DAD (Discover / Access / Distill) is for data scientists. Sometimes data engineers do DAD, sometimes data scientists do ETL, but it's rather rare, and when they do it, it's purely internal…Continue
Added by L.V. on September 6, 2015 at 3:30pm — No Comments
Here we compare statistics about two well-known top data science websites, 2015 vs. 2013. The 2013 data can be found here. Below are the same stats for these two web properties, as of today. From a methodology point of view, comparing two (or more) websites over two different time periods is much better than comparing just one website on…Continue
Added by L.V. on September 5, 2015 at 3:30pm — No Comments
Here's a selection from Udacity's website. Initially, I intended to post questions from Google or Microsoft hiring managers and recruiters, but you can find these questions by doing a Google search, or…Continue
Added by L.V. on September 5, 2015 at 12:00pm — No Comments
Phenomenalism is sometimes described as a type of reductionism. Information about a complex object might be reduced to simple sensory details. For example, ignoring the many interesting features of the ice cream…Continue
Added by Don Philip Faithful on September 5, 2015 at 6:19am — No Comments
The internet, as part of the digital age, has forever changed the way we interact with the world. It has modified our perspective of everything that surrounds us. Now we have immediate access to information, and as a result, we…Continue
Added by Jose Bautista on September 4, 2015 at 11:00am — No Comments
What if computer algorithms could tell more compelling stories than journalists, writers or business analysts? Well, this is increasingly becoming a reality. A new generation of Big Data tools is being used to automate storytelling.
Source for picture: …Continue
One of my favorite examples of why so many big data projects fail comes from a book that was written decades before “big data” was even conceived. In Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, a race of creatures builds a supercomputer to calculate the meaning of “life, the universe, and everything.” After millions of years of processing, the computer announces that the answer is “42.” When the beings protest, the computer calmly suggests that now they…Continue
Added by Bernard Marr on September 3, 2015 at 8:30pm — No Comments
And for software engineers or data analysts as well, in random order: