All Blog Posts Tagged 'MapReduce' (10)

Hadoop for Beginners - Part 2

Hadoop - MapReduce in an easy way

In the previous post, we discussed HDFS, one of the main components of Hadoop. I highly recommend reading that post before moving on to MapReduce. This post will introduce you to MapReduce, which is…

Added by Aafrin Dabhoiwala on September 2, 2018 at 8:30am — No Comments

Hadoop VS Spark: Which is the best Data Analytics engine?

In the book Hadoop: The Definitive Guide, Tom White quotes Grace Hopper: “In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log, they didn’t try to grow a larger ox. We shouldn’t be trying for bigger computers, but for more systems of computers.” Hadoop has long been the data analytics system preferred by businesses everywhere. The recent arrival of the Spark engine, however, has given businesses an alternative to Hadoop for data analytics…

Added by Tanmay Bhandari on June 7, 2016 at 7:29pm — No Comments

The Hadoop Ecosystem: HDFS, Yarn, Hive, Pig, HBase and growing...

Hadoop is the leading open-source software framework for scalable, reliable, distributed computing. With the world producing data in the zettabyte range, there is a growing need for cheap, scalable, reliable and fast computing to process and make sense of all this data. The underlying technology for the Hadoop framework was created by Google as there…

Added by Zygimantas Jacikevicius on November 25, 2015 at 1:20am — 4 Comments

Clustering Similar Images Using MapReduce Style Feature Extraction with C# and R

Abstract

This article provides a full demo application that uses the C# and R programming languages interchangeably to rapidly identify and cluster similar images. The demo application includes a directory with 687 screenshots of webpages. Many of these images are very similar, with different domain names but nearly identical content. Some images are only slightly similar, with the sites using the same general layouts but different colors and different images on certain…

Added by Jake Drew Ph.D. on June 25, 2014 at 4:00pm — No Comments

Building a house and building a data-analytic model

If I want to build a house, wouldn't it be wise to learn carpentry? Does the analogy hold for data-analytic multivariate models? Or is it simply enough to let a machine do it, with no knowledge by the machine operator of how to interpret the results from those modeling efforts? Or is it true, as one person has recently asserted, that he could replicate ALL statistical procedures and techniques using MapReduce, without knowing anything about statistics and probability, or the vast collection…

Added by Bill Luker Jr on April 28, 2014 at 6:51am — 2 Comments

MapReduce / Map Reduction Strategies Using C#

A Brief History of Map Reduction

Map and Reduce functions can be traced all the way back to functional programming languages such as Haskell and its polymorphic map function, known as fmap. Even before fmap there was the Haskell …

Added by Jake Drew Ph.D. on March 31, 2014 at 6:48am — No Comments

Big Data and 2014

We are witnessing a paradigm shift in the data environment. In recent years, Big Data has risen on the technology horizon and is viewed in terms of the efficient and cost-effective management and analysis of vast amounts of data for both public and private organizations. Several organizations are trying to harness this continuing data stream, and in 2014 several of them will start making this data available in real time.

Any organization that wants to…

Added by Atif Farid Mohammad on December 8, 2013 at 10:05am — No Comments

Hadoop 2 Helps Systems Integration

Apache Hadoop announced a beta release for Hadoop 2. The Hadoop-2.1.0-beta…

Added by Michael Walker on September 3, 2013 at 9:31pm — No Comments

Spark, Shark and Mesos Data Analytics Stack

The Berkeley Data Analytics Stack (BDAS) is an open source, next-generation data analytics stack under development at the UC Berkeley AMPLab whose current components include …

Added by Michael Walker on February 27, 2013 at 10:08am — No Comments

The Fallacy of the Data Scientist Shortage

There is no question that the USA (in fact, most of the world) would be well-served with more quantitatively capable people to work in business and government. However, the current hysteria over the shortage of data scientists is overblown. To illustrate why, I am going to use an example from air travel.

On a recent trip from Santa Fe, NM to Phoenix, AZ, I tracked the various times:

Duration…

Added by Neil Raden on June 27, 2012 at 10:00am — No Comments
