Very short time periods (6 months) with several crashes, as well as long time periods (3 years) with no crashes, are expected. An even distribution of plane crashes is indeed NOT expected - it would look very suspicious, and definitely not random. Here we assume that all events (plane crashes) are independent. We also assume that the average is two major plane crashes per year - which is realistic if you include all passenger airlines flying anywhere in the world - sometimes in dangerous…Continue
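Under the two assumptions stated above (independent events, an average of two major crashes per year), the number of crashes in a fixed window can be modeled with a Poisson distribution. The sketch below is only an illustration of that model; the function names are made up for this example and are not taken from the original post.

```python
import math

RATE_PER_YEAR = 2.0  # average number of major crashes per year, as assumed in the text


def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_no_crash(years, rate=RATE_PER_YEAR):
    """Probability that a given window of `years` contains no crash at all."""
    return poisson_pmf(0, rate * years)


def prob_at_least(k, years, rate=RATE_PER_YEAR):
    """Probability that a given window of `years` contains k or more crashes."""
    mean = rate * years
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


if __name__ == "__main__":
    print(f"P(no crash in one 3-year window)    = {prob_no_crash(3):.4f}")
    print(f"P(3+ crashes in one 6-month window) = {prob_at_least(3, 0.5):.4f}")
```

These figures are for one particular window: roughly 8% for three or more crashes in a given half-year, and exp(-6), about 0.25%, for a given crash-free three-year stretch. How often such windows turn up somewhere in a record depends on how long a record you scan.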
Guest blog post by Mike Kennedy, from Talent Analytics.
The importance of finding the right employee from the start may seem like common sense, but is there a smarter way to reduce the time and money you spend hiring staff?
The answer is data, according to Greta Roberts, chief executive officer and…Continue
Added by Vincent Granville on July 31, 2014 at 9:30am — No Comments
I run into this question a lot, and I have heard statisticians say that we all do machine learning because none of us actually runs a regression or classification by hand on paper. We all use machines.
On the other hand - some computer scientists I talk to say that when you use programmatic techniques to orchestrate an analytical flow, rather than a GUI in SAS / SPSS, you are using machine learning.
One more answer I have heard is…
Summary: If you have your finger on the company’s financial pulse you need to start thinking about these risks and benefits posed by Big Data.
Big data is the latest buzz phrase in business but what does it mean and why is knowledge of Big Data so important…Continue
The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. If you haven't received this message in your mailbox, read this notice.
Added by Vincent Granville on July 29, 2014 at 2:00pm — No Comments
Until recently, many organizations have been unable to harness the power of big data and analytics due to a number of technical obstacles. Many companies still don’t have anyone with the skill set or knowledge base required to manipulate data for discoveries and powerful insight, and many lack the funding necessary to invest in data analytics…Continue
Added by Jesse Jacobsen on July 29, 2014 at 8:55am — No Comments
by Elliott Cordo, chief architect at Caserta Concepts. Exclusive to Data Science Central.
There is much discussion these days about Lambda Architecture and its benefits for developing high-performance analytic architectures. It combines a high-performance, low-latency real-time ETL layer with a slower, more accurate, and more flexible layer that runs in batch.
As I work…Continue
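The split between a fast real-time layer and a slower batch layer can be sketched in a few lines. This is a toy, single-process illustration under simplifying assumptions, not the author's design: the class and method names are hypothetical, and in a real deployment each layer would typically run on its own system.

```python
from collections import Counter


class LambdaCounter:
    """Toy event counter with a batch view and a speed (real-time) view."""

    def __init__(self):
        self.master_log = []         # append-only record of every event seen
        self.batch_view = Counter()  # rebuilt from the full log on each batch run
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, key):
        """Record an event and update the low-latency view immediately."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        """Recompute the batch view from scratch, then reset the speed view."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        """Serve a result by merging the batch view and the speed view."""
        return self.batch_view[key] + self.speed_view[key]


if __name__ == "__main__":
    counter = LambdaCounter()
    for event in ["click", "click", "view"]:
        counter.ingest(event)
    counter.run_batch()      # the slow, accurate pass over all data
    counter.ingest("click")  # arrives after the batch run
    print(counter.query("click"))
```

Queries stay current between batch runs because the speed view covers whatever the last batch pass has not yet seen.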
Added by Vincent Granville on July 29, 2014 at 8:00am — No Comments
Guest blog post by Dr. Livan Alonso.
Twitter has more than 250 million monthly active users who post more than 500 million tweets per day. For someone who follows many people, reading every tweet isn’t feasible.
Do you have any idea what people are tweeting about Data Science, Big Data, Business Analytics, Hadoop, Machine Learning or R programming?
Word clouds are one of the simplest and most intuitive ways of visualizing text data.…Continue
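A word cloud is driven by word frequencies, so the core step can be sketched with the standard library alone. The snippet below is a minimal illustration, not the method used in the original post: the stopword list is a small placeholder, and the sample strings are invented for the example rather than real tweets.

```python
import re
from collections import Counter

# Small placeholder stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "for", "on", "with", "about"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count how often each non-stopword appears across a list of texts."""
    counts = Counter()
    for text in texts:
        # Remove URLs first so their fragments are not counted as words.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        for word in re.findall(r"[a-z][a-z0-9']*", cleaned):
            if word not in stopwords:
                counts[word] += 1
    return counts


# Invented sample texts, for illustration only.
sample = [
    "Learning R for data science",
    "Hadoop and big data: a primer",
    "Data science with R and Hadoop http://example.com/x",
]
freqs = word_frequencies(sample)

if __name__ == "__main__":
    for word, count in freqs.most_common(5):
        print(word, count)
```

The resulting counts are what a word cloud renderer would map to font sizes, with the most frequent terms drawn largest.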
I have been reading and collecting data science resources for years (back in the days when BI / BA was all the rage). While there are lots of resources on the net, not all are great and some are even misleading.
Now, I have updated my collection and placed it into a neat Trello list, open to all.
Of course, it features some of the great and interesting articles from Data Science Central:…
Added by Venky Rao on July 28, 2014 at 4:54am — No Comments
What are the differences between data science, data mining, machine learning, statistics, operations research, and so on?
Here I compare several analytic disciplines that overlap, to explain the differences and common denominators. Sometimes differences exist for no reason other than history. Sometimes the differences are real and subtle. I also provide typical job titles, types of analyses, and industries traditionally attached to each discipline. Underlined domains are…Continue
Granted, the fear of public speaking is often considered the most common of all phobias. In some (non-scientific) studies, evidence suggests that people fear public speaking more than death itself. It is not unlikely that analysts fear public speaking even more – as they are often distinctly more introverted than your average Joe. As with death, the natural human reflex is to avoid such fearful events. But…Continue
Added by Geert Verstraeten on July 24, 2014 at 6:23pm — No Comments
Interesting blog post from a Berkeley graduate (statistician) who was on the team that created Advil...
The guy graduated in the fifties, is very smart (wrote at least one great book that I bought, referenced in this article), but for whatever…Continue
Added by Mirko Krivanek on July 23, 2014 at 8:30pm — No Comments
The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.
Added by Vincent Granville on July 23, 2014 at 2:30pm — No Comments
How will Advanced & Predictive Analytics (APA) affect you? Our inaugural Advanced and Predictive Analytics report can help answer that question! APA is growing, and developing a plan for your organization now is critical!
The 2014 Wisdom of Crowds® Advanced and Predictive Analytics market study contains everything you need to assess this dynamic market…Continue
Added by Michael Walker on July 23, 2014 at 2:30pm — No Comments
On December 13th, 2013, a blog devoted to IT security news broke a startling story — Target, one of the country’s largest big-box retailers, had been the victim of a security breach that exposed the credit card data of thousands of shoppers.
The attackers targeted the data stored in the magnetic stripes of customers’ cards. The website reported…
Added by Beau Winchester on July 23, 2014 at 11:24am — No Comments
I have tried to put together a 10-step process for big data projects. Please correct it or suggest any additions or changes. Your input is highly appreciated.
1. You really need to have a problem (or problems) that you cannot solve directly with your existing metrics and reports.
2. You can have any size of data; however, if it is small, you don't need to build a complex model around it. It's good if the size is large enough. Never use just a sample of…
The industrial revolution of the 1800s established the building blocks of Manufacturing as we know it today. Man, Machine, Material and Method were connected together to form an intricate system on which manufacturing processes and their operational dynamics were based. The resulting complexity of such a system, however, has resulted in ineptitudes which have become difficult to…Continue
Added by Sumit Prasad on July 23, 2014 at 1:00am — No Comments
The new European laws about "the right to be forgotten", however absurd they might be, are a new government threat to data engines.
Added by Mirko Krivanek on July 22, 2014 at 7:30pm — No Comments
“Big data” isn’t just a trendy buzzword and it’s not some revolutionary concept. It’s exactly what it sounds like: Large amounts of data that may be beneficial to a company’s marketing endeavors by helping it understand its demographics better. Big data can come from numerous sources, both internal and external, and can include things like customer surveys, massive surveys such as the Census, and even an email list or analytics report from a social media…Continue