They maintain 284 data sets as a service to the machine learning community.…Continue
Added by Mirko Krivanek on April 21, 2014 at 6:00pm — No Comments
Here we go. Enjoy the reading!
Illustration of YARN (from first article below)
Articles from external publishers and bloggers:
Added by Vincent Granville on April 21, 2014 at 12:08pm — No Comments
The analytics industry is heavily biased towards statistical techniques and data-handling software. But this does not mean that, if you come from a non-statistical background, you cannot be an analytics champion. What differentiates a champion analyst from others is not statistical knowledge but the ability to apply the right statistical tool to the right business problem. Survival…Continue
Added by Tavish Srivastava on April 21, 2014 at 9:04am — No Comments
Managing the performance of enterprise applications and achieving high levels of performance with minimum resources is a topic of discussion in today’s large enterprises. Resolving performance issues when they occur is essential for database administrators (DBAs); however, it is best to address problems proactively. Proactive management requires a very high level of attention, as well as help in making sense of the overwhelming data provided by the database engine.
In database management being…Continue
Added by Muhammad Saeed on April 21, 2014 at 3:43am — No Comments
Based on the generic data type, esProc provides the sequence and the TSeq to implement complete set-lizing and much more convenient relational queries.
The relation between the department and the employee is one-to-many and that between the employee and the SSN (Social Security…Continue
Added by Jim King on April 20, 2014 at 5:12pm — No Comments
Embodiment is comparable to the idea of an “ecosystemic” or “holistic” approach. In an ecosystem, each thing affects everything else. In light of the interrelationship, a person would not attempt to correct a problem by considering only a single piece of the puzzle. Instead, there is a need to bring together many aspects of the body. To understand embodiment, it is necessary to recognize how “the body” separates an organism from its environment; in a manner of speaking, the body represents…Continue
Added by Don Philip Faithful on April 19, 2014 at 7:30am — No Comments
In the modern era, the business environment is changing rapidly. Organisations are seeking valuable business information as an essential asset that will not only lead them towards success but also help sustain the business in a competitive environment. Business Intelligence (BI) is a model which relates managerial values, and a tool which is used in an organisation to handle and filter information in order to make sound business decisions. It refers to the appropriate…Continue
Added by Avesh Dhakal on April 18, 2014 at 9:30pm — No Comments
Feel free to add your keywords. Here's a start:
Much has been written about customer churn - predicting who, when, and why customers will stop buying, and how (or…Continue
Added by Vincent Granville on April 17, 2014 at 4:00pm — No Comments
Added by Vincent Granville on April 17, 2014 at 3:30pm — No Comments
A big data real-time application is a scenario in which computation and analysis results must be returned in real time even when there is a huge amount of data. This has been an emerging demand on database applications in recent years.
In the past, because there was not so much data, the computation was simple, and there was little parallelism, the pressure on the database was not great. A high-end or mid-range database server or cluster can allocate…Continue
Added by Jim King on April 16, 2014 at 9:30pm — No Comments
Using data science, could you identify this profile? Explain the methodology that you used, and win $200. The first participant providing the correct answer will win the award. The solution will be posted here, once we have a winner. This profile was created as a test to check whether data science algorithms can successfully solve this type of problem.…
Should you hire someone who knows all the most recent flavors of logistic regression? Or a Hadoop developer?
In my opinion, this is the wrong strategy. These employees are very expensive (at least $120k per year), and they might not bring the ROI that you expect. At least, if going in that direction, hire someone favoring simple, scalable, robust, automated solutions over anything else. To automate, you need someone great at developing…Continue
Do you want to better understand Big Data and what it really means to businesses? It’s not just huge volumes and high velocity…there’s another important factor that provides an essential element of decision support, and that’s variety of data. Regardless of the data source, value creation lies in the application of the right analytic approach to a given strategic endeavor.
Read the Big Data, Mining, and Analytics: Components of Strategic Decisions* book in…Continue
Now you can get all of the power, performance and productivity of Revolution R Enterprise on Amazon Web Services. Revolution R Enterprise 7 on AWS Marketplace includes:
Added by Gregory Todd on April 15, 2014 at 10:56am — No Comments
What is IOE? I=IBM, O=Oracle, and E=EMC. They represent the typical high-end database and data warehouse architecture. The high-end servers include HP, IBM, and Fujitsu; the high-end database software includes Teradata, Oracle, and Greenplum; the high-end storage includes EMC, Violin, and Fusion-io.
In the past, such typical high-performance database architectures were the preference of large and mid-sized organizations. They could run stably with superior performance, and became…
Added by Jim King on April 14, 2014 at 5:42pm — No Comments
For those of us working in the analytics industry, SAS has become an inevitable part of our lives. This article is the second part of the series we have published on SAS interview questions. These articles will help you optimize your SAS routines/algorithms, make your code efficient, and follow best practices for coding in SAS. We would also like to hear your solutions to the problem statements in the article. Following is a description of the two parts of this series:
1. Part I :…Continue
Added by Tavish Srivastava on April 13, 2014 at 2:45am — No Comments
Recently I wrote about the "Top 10 Big Data Challenges – A Serious Look at 10 Big Data V’s", which summarizes some of the big issues associated with the deployment of big data projects. The use of the letter V may seem forced and contrived, but it is used primarily as a mnemonic device to label and recall these critical challenges, in much the same way the…Continue
I've heard from Wiley that our data science book already had 4,133 pre-ordered copies, which is (according to Wiley) a great start. It was published last Monday.
I invite you to check the final table of contents or check out the book on…Continue
MongoDB performance tuning and scalability
Start by tuning the operating system: follow the ulimit recommendations and the production notes in the MongoDB manual. On Linux, try iostat, vmstat, mpstat, sar, and free -tm. You can also try open…Continue
Added by Muhammad Saeed on April 11, 2014 at 8:35am — No Comments