Big Data holds a big promise. But has that promise paid out already? Or are you heading for Big Dollar Disaster? Many take inventory of their data and find out they have terabytes of data lying around. Surely something should be done with that, so here’s how we see a lot of companies going about implementing ‘something’ for their Big Data.
Added by Jos Verwoerd on November 13, 2012 at 2:08am — No Comments
How to use s3n (S3 native) as input/output for a Hadoop MapReduce job. In this tutorial we will first look at what S3 is and how s3 differs from s3n, then set s3n as the input and output for a Hadoop MapReduce job. Configuring s3n as I/O may be useful for local MapReduce jobs (i.e., MR run on a local cluster), but it matters most when we run an Elastic MapReduce job (i.e., when we run the job in the cloud). When we run a job in the cloud we need to specify a storage location for input as…Continue
Added by Rahul Patodi on November 11, 2012 at 8:00am — No Comments
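As a rough illustration of the s3n teaser above, here is a minimal Python sketch that builds `s3n://` URIs and assembles a `hadoop jar` command line. The helper names `s3n_uri` and `hadoop_job_args`, the jar name, the class name, and the bucket are all hypothetical. The property names `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey` are the ones Hadoop 1.x reads for s3n credentials, and the `-D` options assume the job's main class parses generic options (for example via `ToolRunner`).

```python
def s3n_uri(bucket, key=""):
    """Build an s3n:// URI for a bucket and an optional key prefix."""
    key = key.strip("/")
    return f"s3n://{bucket}/{key}" if key else f"s3n://{bucket}/"


def hadoop_job_args(jar, main_class, input_uri, output_uri, access_key, secret_key):
    """Assemble a `hadoop jar` command that reads from and writes to s3n.

    Assumes the main class accepts generic -D options (e.g. via ToolRunner).
    """
    return [
        "hadoop", "jar", jar, main_class,
        "-D", f"fs.s3n.awsAccessKeyId={access_key}",
        "-D", f"fs.s3n.awsSecretAccessKey={secret_key}",
        input_uri, output_uri,
    ]


# Hypothetical job: all names below are placeholders, not real resources.
args = hadoop_job_args(
    "wordcount.jar", "WordCount",
    s3n_uri("my-bucket", "input/"), s3n_uri("my-bucket", "output/"),
    "YOUR_ACCESS_KEY", "YOUR_SECRET_KEY",
)
print(" ".join(args))
```

In practice the credentials would more likely live in `core-site.xml` than on the command line; the sketch only shows which pieces have to be supplied.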
When creating a predictive model, data miners need to “tune” it to make the right kind of mistakes. Setting the cut-off point between ‘promising’ and ‘unpromising’ depends a lot on our client’s biggest concern -- missed opportunities or false alarms.
Data Mining Misconceptions #1: The 50/50 Problem…Continue
Added by Vincent Granville on November 8, 2012 at 1:49pm — No Comments
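To make the cut-off idea in the teaser above concrete, here is a small Python sketch under stated assumptions: each case has a model score and a known outcome (1 = truly promising, 0 = not), and the client can put a relative cost on a false alarm versus a missed opportunity. The function name `best_cutoff` and the toy data are invented for illustration and are not taken from the post.

```python
def best_cutoff(scores, labels, cost_false_alarm, cost_missed):
    """Return (cutoff, total_cost) for the cheapest cut-off.

    A case is flagged 'promising' when its score >= cutoff.
    """
    best = None
    # Try every observed score, plus "flag nothing" (infinite cut-off).
    for cutoff in sorted(set(scores)) + [float("inf")]:
        false_alarms = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        missed = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
        cost = false_alarms * cost_false_alarm + missed * cost_missed
        if best is None or cost < best[1]:
            best = (cutoff, cost)
    return best


# Toy data (hypothetical): model scores and true outcomes.
scores = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

# Missed opportunities hurt more: the cut-off drops.
print(best_cutoff(scores, labels, cost_false_alarm=1, cost_missed=5))  # (0.4, 1)
# False alarms hurt more: the cut-off rises.
print(best_cutoff(scores, labels, cost_false_alarm=5, cost_missed=1))  # (0.8, 1)
```

The same scores yield different cut-offs once the costs change, which is the sense in which the model is tuned to make the "right kind of mistakes".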
Predicting election results in the 50 states is actually much easier than most people think. The West Coast and East Coast are Democratic; the Midwest, Texas, etc. are mostly Republican (the Midwest is becoming more Republican because its population is aging due to a brain drain of young, smart people, mostly Democrats). So the task is not about correctly predicting results for the 50 states, but simply predicting…Continue
Added by Vincent Granville on November 8, 2012 at 11:00am — No Comments
Not long ago, about three or four years, if you wanted to process a large amount of textual data or web logs, you needed to mobilize large servers and implement substantial SQL programs that were slow to develop and slow to return results. Fortunately, requests were few and volumes were generally measured in terabytes at most. Now that e-commerce and social media have grown considerably, many companies see their customer relationships, and therefore their survival, as entirely dependent on the…Continue
Added by Michel Bruley on November 8, 2012 at 3:58am — No Comments
Hadoop (MapReduce, where code is turned into map and reduce jobs that Hadoop then runs) is the best-known technology used for "Big Data" because it allows an organization to store huge quantities of data at very low…Continue
Added by Michael Walker on November 7, 2012 at 3:57pm — No Comments
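For readers new to the map and reduce jobs mentioned in the teaser above, here is a minimal single-process Python sketch of the pattern, using word counting as the example. It only mimics the map, shuffle, and reduce phases in memory; it is not Hadoop's API, and the function names are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["big data", "Big promise"]))
# {'big': 2, 'data': 1, 'promise': 1}
```

On a real cluster the map and reduce calls run in parallel across many machines, and the framework handles the shuffle between them.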
Educating savvy and business-minded Indians on the importance of numbers and analytics in your business is like teaching the properties of sand to someone in the desert, but here is my effort anyway.
The simplest definition of analytics is "the science of analysis." However, a practical definition would be how an entity, e.g., a business, arrives at an optimal or realistic decision based on existing data. Business managers may choose to make decisions based on past experiences or…Continue
Added by AcademyForDecisionScience&Analyt on October 29, 2012 at 7:30pm — No Comments
Added by Vijay Kumar on November 2, 2012 at 8:41pm — No Comments
Big Data is a term used to categorize an excessive amount of aggregated data. But how do data miners manage all of this data? Hadoop is one of the popular tools that data analysts use to store and mine immense volumes of data.
Here are 5 Things a Data Analyst should know about Hadoop:
1. Hadoop utilizes parallel processing to store and process…Continue
Added by Ben Gold on November 2, 2012 at 9:21am — No Comments
Added by Vincent Granville on November 2, 2012 at 7:30am — No Comments
Added by Zach Piester on November 1, 2012 at 3:02pm — No Comments
Includes great charts from FlowingData.com.
Added by Vincent Granville on October 31, 2012 at 9:53am — No Comments
Many decisions, big and small, are made every day across various lines of business, executive offices and departments in your organization. How are these decisions made today in your organization? What drives the decision maker(s) to say ‘yes’ or ‘no’ for any particular decision? Let’s look at some of the key business challenges that many organizations are facing today where decision making is critical.…Continue
Added by Srini Pagidyala on October 31, 2012 at 8:00am — No Comments
Added by Vincent Granville on October 25, 2012 at 7:10pm — No Comments
Analytics is not just pure science; it is part art as well. Organizations that master the fine art of using analytical tools realize increased revenues and enjoy cost savings.
The scientific approach involves the following four key steps:
Added by AcademyForDecisionScience&Analyt on October 29, 2012 at 7:50pm — No Comments
For motivated students who can learn on their own, here's an option that I would like to offer: the possibility to become an expert data scientist in less than six months, for a cost well below $10,000, and with guaranteed job opportunities.
The program would be open to everyone without screening, but the degree and the…Continue
Most analytics and data teams have started thinking of investing in big data initiatives. With so much buzz about big data, organizations have started investing, or are thinking of investing, in Hadoop. While it is great to stay on top of trends, this often ends up being another investment whose full benefit and potential are simply not realized. The learning curve is too steep and the time to implement too long. Current analytics resources lack the strong programming skills required to…Continue
Added by Rahul Deshmukh on October 24, 2012 at 8:58am — No Comments