
Kumar Chinnakali's Blog Posts Tagged 'Apache' (7)

Self-Learn Yourself Apache Spark in 21 Blogs – #5

In Blog 5, we look at the languages supported by Apache Spark, with basic hands-on examples. Click for a quick read of the other blogs in this learning series.

With our Apache Spark cloud setup in place, we are ready to develop big data applications on Spark. Before we start building them, let’s review the languages that can be used to develop Apache Spark applications. Spark offers native APIs in Scala, Java, Python, and R, and it also integrates with Hive and Pig.

Scala – It’s the language…


Added by Kumar Chinnakali on January 23, 2016 at 3:32am — No Comments

Celebrate the Big Data Problems – #2


How do we determine the number of buckets for a Hive table when executing HiveQL DDL statements?

The dataottam team has started a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) framework.

Context:

Bucketing is another…


Added by Kumar Chinnakali on January 21, 2016 at 7:41pm — No Comments

Celebrate the Big Data Problems – #1


Every day we face many big data problems in production, in PoCs, and elsewhere. Do we have a common repository to collect and share them? No, as far as we know. As always, dataottam looks forward to sharing its learnings with the community so that others can celebrate their own similar problems. And…


Added by Kumar Chinnakali on January 15, 2016 at 11:30pm — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #4

In Blog 4, we look at Apache Spark Core and its ecosystem, and at Apache Spark on the AWS cloud. Click for a quick read of blogs 1-3 in this learning series.

Apache Spark has many components, including Spark Core, which is responsible for task scheduling, memory management, fault recovery, and interacting with storage…


Added by Kumar Chinnakali on January 12, 2016 at 8:00am — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #3

In Blog 3, we look at Apache Spark’s history and its unified platform for big data. For background, have a quick read of blog 1 and blog 2.

Spark was started by Matei Zaharia at UC Berkeley’s AMPLab in 2009 and open sourced in 2010…


Added by Kumar Chinnakali on January 9, 2016 at 9:00pm — 1 Comment

Self-Learn Yourself Apache Spark in 21 Blogs – #2

In this blog we share the titles in this Apache Spark learning series, the basics of Hadoop, which is one of the big data tools, and the motivations for Apache Spark, which is not a replacement for Apache Hadoop but its friend in big data.

Blog 1 – Introduction to Big Data

Blog 2 – Hadoop, Spark’s Motivations

Blog 3 – Apache Spark’s History and Unified Platform for Big Data

Blog 4 – Apache Spark’s First Step – AWS, Apache Spark

Blog 5 – Apache Spark Languages with basic…


Added by Kumar Chinnakali on January 8, 2016 at 9:00pm — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #1

We have received many requests from friends who regularly read our blogs for a complete guide to help them sparkle in Apache Spark. So we have come up with a learning initiative called “Self-Learn Yourself Apache Spark in 21 Blogs”.

We have drilled down into various sources and archives to provide a learning path for you to understand and excel in Apache Spark. These 21 blogs, which will be written over time, will be a complete guide for you to understand and…


Added by Kumar Chinnakali on December 30, 2015 at 3:00am — No Comments

