In this blog we share the titles in our series on learning Apache Spark, the basics of Hadoop, which is one of the big data tools, and the motivations for Apache Spark, which is not a replacement for Apache Hadoop but its friend in the big data world.
Blog 1 – Introduction to Big Data
Blog 2 – Hadoop, Spark’s Motivations
Blog 3 – Apache Spark’s History and Unified Platform for Big Data
Blog 4 – Apache Spark’s First Step – AWS, Apache Spark
Blog 5 – Apache Spark Languages with basic Hands-on
Blog 6 – The RDD, RDDs Input, Hands-on
Blog 7 – Transformation, map, mapPartitions
Blog 8 – RDD Combiner
Blog 9 – Actions, Persistence Actions, Hands-on
Blog 10 – Implicit Conversions, Hands-on
Blog 11 – Key Value Methods
Blog 12 – Caching Data, Hands-on
Blog 13 – Accumulator
Apache Hadoop is an open source big data management platform and the technology most closely associated with big data analytics applications. The distributed processing framework was created in 2006, primarily at Yahoo, and was based partly on ideas outlined by Google in a pair of technical papers; soon, other Internet companies such as Facebook, LinkedIn and Twitter adopted the technology and began contributing to its development. Over the past few years, Hadoop has evolved into a complex ecosystem of infrastructure components and related tools, which various vendors package together in commercial Hadoop distributions.
One of the best tutorials on Hadoop comes from the Yahoo team.
Below are the pointers on why Apache Spark is needed and the motivations behind it…
In Blog 3, we will share a detailed study of Apache Spark’s history and its unified platform for big data.
Originally posted here.
© 2019 Data Science Central ®