All Blog Posts Tagged 'HDFS' (7)

What are Data Lakes?

Some of you are highly organized and disciplined, and keep all your belongings arranged in an orderly manner. You can easily remember where each item is and fetch it quickly. But other people are highly unorganized, keep their belongings in random places, and have no clue where to look for an item when they really need it. The result is a search of every storage space in the house whenever they want to fetch something. There is…

Added by Janardhanan PS on March 2, 2020 at 9:37pm — No Comments

Breaking the HDFS Performance Barrier; An Object Storage First

By Siddartha Mani

Few would argue with the statement that Hadoop HDFS is in decline. In fact, the HDFS part of the Hadoop ecosystem is in more than just decline - it is in freefall. At the time of its inception, it had a meaningful role to play as a high-throughput, fault-tolerant distributed file system. The secret sauce was data locality. 

By co-locating compute and data on the same nodes, HDFS overcame the limitations of slow network access to data. The…

Added by Jonathan Symonds on August 6, 2019 at 1:00pm — No Comments

Hadoop for Beginners - Part 1

This blog gives a brief introduction to Hadoop for those who know next to nothing about the technology. Big Data is at the foundation of all the megatrends happening today, from social to the cloud to mobile devices to gaming. This blog will help build the foundation for taking the next step in learning this interesting technology. Let's get started:

1. What's Big Data?

Ever since…

Added by Aafrin Dabhoiwala on August 26, 2018 at 12:30pm — 1 Comment

Limitations of Hadoop – How to overcome Hadoop drawbacks

Hadoop – Introduction & features

Let us start with what Hadoop is and which of its features make it so popular.

Hadoop is an open-source software framework for distributed storage and distributed processing of extremely large data sets. Important features of Hadoop are:

Hadoop is an open-source project, which means its code can be modified to suit business requirements.

In Hadoop, data is highly available and…

Added by Sheetal Sharma on July 31, 2017 at 7:30pm — No Comments

The Hadoop Ecosystem: HDFS, Yarn, Hive, Pig, HBase and growing...

Hadoop is the leading open-source software framework developed for scalable, reliable and distributed computing. With the world producing data in the zettabyte range, there is a growing need for cheap, scalable, reliable and fast computing to process and make sense of all this data. The underlying technology for the Hadoop framework was created by Google as there…

Added by Zygimantas Jacikevicius on November 25, 2015 at 1:20am — 4 Comments

How Uber Uses Spark

Added by Bradley Wogsland on October 25, 2015 at 8:00am — No Comments

Hadoop 2 Helps Systems Integration

Apache Hadoop announced a beta release for Hadoop 2. The Hadoop-2.1.0-beta…

Added by Michael Walker on September 3, 2013 at 9:31pm — No Comments

© 2020 Data Science Central ®