Posted on DataScienceCentral, AnalyticBridge and BigDataNews over the last 2 years:Continue
Added by Vincent Granville on September 26, 2013 at 8:30am — No Comments
I found a very good link that explains Big Data, Hadoop fundamentals and MapReduce in a very simple manner. I hope this helps everyone. https://www.youtube.com/watch?v=1jMR4cHBwZE
Added by rahul sharma on September 26, 2013 at 2:10am — No Comments
List of events (webinars) hosted by Data Science Central since June 2012:Continue
Competitors at your heels? Customers churning? Predictions waiting to be made? No matter how big or complex your Machine Learning problem, Skytree can help.
Added by Vincent Granville on September 25, 2013 at 10:00am — No Comments
Text (word) analysis and tokenized text modeling can be intimidating, especially when you are new to machine learning. Thankfully, Python and its extended libraries offer strong support for text analytics and machine learning. Scikit-learn is an excellent aid in text processing once you also understand concepts such as "bag of words", "clustering" and "vectorization". Vectorization is a must-know technique for all machine learning learners, text miner…Continue
Added by Manish Bhoge on September 25, 2013 at 9:47am — No Comments
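To make the "bag of words" and "vectorization" ideas in the post above concrete, here is a minimal sketch in plain Python. It is not scikit-learn's implementation; the helper names (`tokenize`, `fit_vocabulary`, `vectorize`) and the two sample sentences are assumptions made for illustration only.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def fit_vocabulary(documents):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}

def vectorize(documents, vocabulary):
    """Turn each document into a list of token counts, one column per vocabulary term."""
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        # Dict iteration follows insertion order, so columns match the vocabulary indices.
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows

# Hypothetical sample documents, not taken from the original post.
docs = ["the cat sat", "the cat saw the dog"]
vocab = fit_vocabulary(docs)
matrix = vectorize(docs, vocab)

print(vocab)   # {'cat': 0, 'dog': 1, 'sat': 2, 'saw': 3, 'the': 4}
print(matrix)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each row is one document and each column one word, so word order is discarded and only counts remain. That is the "bag" in bag of words, and the resulting numeric rows are what clustering or classification algorithms consume.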
One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random Forests. The…Continue
A new 191-page PDF eBook published by the National Academies Press is available, "Frontiers in Massive Data Analysis," and can be downloaded for free (after free website registration):
The first 9 of the 10 chapters offer a comprehensive survey of state-of-the-art big data architectures, machine learning, and analysis techniques.
Chapter 10 really…Continue
Added by Michael Malak on September 23, 2013 at 9:35am — No Comments
I just created a cloud at home. Suppose I start on a project aiming to create a computer game. I purchase 4 servers and some software. After a couple of weeks I realize that in order to complete the project I'll need 6 more servers, but I have run out of money. I decide to write an…Continue
Added by Fari Payandeh on September 22, 2013 at 3:51pm — No Comments
Added by Vincent Granville on September 19, 2013 at 3:30pm — No Comments
Interesting stats about types of weapons, locations, whether weapon was acquired legally, and gender of killer, etc. Would be nice to see trends on a chart.Continue
Who you like on Facebook might result in your loan application being denied. This article was posted in Mother Jones as Your Deadbeat Facebook Friends Could Cost You a Loan.…
The Balanced Scorecard (BSC) is a new buzzword that stands for the performance management magic pill. Many books on management praise this business concept and report an impressive adoption rate among Fortune 1000 companies. In practice, however, it appears that only top management in the company knows about the Balanced Scorecard, and it is used at a fraction of its potential.
Balanced Scorecard and…Continue
Posted over the last 12 months or so. Includes not just charts and diagrams, but also video production, with source code.Continue
High Performance Computing (HPC) plus data science allows public and private organizations to get…Continue
Over the last several months, as I looked at addressing the business needs across various industries as someone leading a team of Data Scientists, the question of domain expertise invariably cropped up.
At one meeting with a pharmaceutical company, I was asked, "Have you done work in the area of rare signal detection?" In a similar vein, while preparing for a meeting with a major auto finance company, the question was in the area of using Auto…
Added by Somjit Amrit on September 17, 2013 at 3:37am — No Comments
Bootstraps, Permutation Tests, and Sampling Orders of Magnitude Faster Using SAS, Computational Statistics-WIREs, Vol. 5, Issue 5, 391-405. Download @ http://www.datamineit.com/DMI_publications.htm
While permutation tests and bootstraps have very wide-ranging application, both share a common potential drawback: as data-intensive resampling methods, both can be runtime prohibitive when applied to large or even…Continue
Added by J.D. Opdyke on September 16, 2013 at 5:25am — No Comments
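For readers unfamiliar with the two resampling methods named in the abstract above, the following is a minimal sketch in plain Python. It does not reproduce the paper's SAS techniques or its speed-ups; the function names, default resample counts, fixed seed and sample data are all assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled data and split it into two groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical data, not from the paper.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [11.0, 12.0, 13.0, 14.0, 15.0]
ci_low, ci_high = bootstrap_ci(group_a)
p_value = permutation_test(group_a, group_b)
print(ci_low, ci_high, p_value)
```

Both functions loop over thousands of resamples and touch every observation each time, which illustrates why the abstract describes these methods as potentially runtime prohibitive on large data sets.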
Google recently replaced its AdWords MySQL database with F1, a database built in-house. AdWords serves thousands of users, "which all share a database over 100TB serving up hundreds of thousands of requests per second, and runs SQL queries that scan…Continue