Kumar Chinnakali's Blog (14)

Functions for basic statistics in R

How do we compute our basic statistics (Mean, Median, SD, Var, Cor, Cov) using the R language?

The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) framework.

Context:

In statistics Mean, Median, Standard Deviations, Variance, Correlation, or…
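The post itself works in R, where the base functions `mean()`, `median()`, `sd()`, `var()`, `cor()`, and `cov()` cover these six quantities. As a minimal sketch of what they compute, not the post's own code, here are the same statistics using only Python's standard library. The helper names and the sample data are made up for illustration, and the `n - 1` (sample) denominator is assumed, which is R's default for `sd()`, `var()`, and `cov()`.

```python
import statistics


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


def basic_stats(xs, ys):
    """Collect the six statistics named in the post into one dictionary."""
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "sd": statistics.stdev(xs),        # sample standard deviation
        "var": statistics.variance(xs),    # sample variance
        "cov": covariance(xs, ys),
        "cor": correlation(xs, ys),
    }


# Hypothetical sample data, chosen so that ys is exactly 2 * xs.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
summary = basic_stats(xs, ys)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because `ys` is a perfect linear function of `xs`, the correlation comes out at 1 (up to floating-point rounding), which is a quick sanity check on the helpers.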

Added by Kumar Chinnakali on January 30, 2016 at 2:30am — No Comments

9 Key Benefits of Data Lake

The Data Lake has a flexible definition. To back up that statement, the dataottam team took the initiative and released an eBook called “The Collective Definition of Data Lake by Big Data Community”, which contains many definitions from business-savvy people and technologists.

In a nutshell, a Data Lake is a data store and data processing system, where an…

Added by Kumar Chinnakali on January 28, 2016 at 6:30pm — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #5

In Blog 5, we will look at the Apache Spark languages with some basic hands-on. Click to have a quick read of the other blogs on Apache Spark in this learning series.

With our cloud setup of Apache Spark in place, we are now ready to develop big data Spark applications. Before we start building them, let’s review the languages that can be used to develop Apache Spark applications. Spark offers APIs in Scala, Java, Python, and R, and it also integrates with Hive and Pig.

Scala – It’s the language…

Added by Kumar Chinnakali on January 23, 2016 at 3:32am — No Comments

Celebrate the Big Data Problems – #2

How do we decide the number of buckets for a Hive table when executing HiveQL DDL?

The dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs we will share our big data problems using the CPS (Context, Problem, Solutions) framework.

Context:

Bucketing is another…

Added by Kumar Chinnakali on January 21, 2016 at 7:41pm — No Comments

Self-Learn Yourself IoT in 21 Blogs – #1

In this blog we will see: What is IoT? Why do we need it? What are its significance and impact on modern life?

Time to greet the new clone that is set to rule the world!

Hello all! Well, this is my first blog for dataottam, so I welcome your valuable feedback, which in return helps me deliver better!

Alright, what is the Internet of Things (IoT)? How does it differ from the Internet of Everything? What is M2M?

All…

Added by Kumar Chinnakali on January 21, 2016 at 8:30am — No Comments

Celebrate the Big Data Problems – #1

Every day we face many big data problems in production, in PoCs, and elsewhere. Do we have a common repo to collect and share them? No, as far as we know we don’t. As always, dataottam is looking forward to sharing its learnings with the community, so that others can celebrate the same or similar kinds of problems. And…

Added by Kumar Chinnakali on January 15, 2016 at 11:30pm — No Comments

Just 3 clicks to get your Apache Hadoop installed!

Big Data is a problem statement, and it can be solved with tools such as Apache Hadoop. But setting up Apache Hadoop as infrastructure for our proofs of concept and proofs of value is a little challenging. Hence we bring a 3-click approach to get your Apache Hadoop installed.

What are the prerequisites?

  • Ubuntu 14.04
  • Internet Connection

Can I have the Script? Yes

How…

Added by Kumar Chinnakali on January 12, 2016 at 9:53am — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #4

In Blog 4, we will see what Apache Spark Core and its ecosystem are, and look at Apache Spark on the AWS Cloud. Click to have a quick read of blogs 1-3 in this learning series.

Apache Spark has many components including Spark Core which is responsible for Task Scheduling, Memory Management, Fault Recovery, and Interacting with storage…

Added by Kumar Chinnakali on January 12, 2016 at 8:00am — No Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #3

In Blog 3, we will look at Apache Spark’s history and its unified platform for big data. You may also like to have a quick read of blog 1 and blog 2.

Spark was initially started by Matei Zaharia at UC Berkeley's AMPLab in 2009, and open sourced in 2010…

Added by Kumar Chinnakali on January 9, 2016 at 9:00pm — 1 Comment

Self-Learn Yourself Apache Spark in 21 Blogs – #2

In this blog we will share the titles for learning Apache Spark, the basics of Hadoop, which is one of the big data tools, and the motivations for Apache Spark, which is not a replacement for Apache Hadoop but its friend in big data.

Blog 1 – Introduction to Big Data

Blog 2 – Hadoop, Spark’s Motivations

Blog 3 – Apache Spark’s History and Unified Platform for Big Data

Blog 4 – Apache Spark’s First Step – AWS, Apache Spark

Blog 5 – Apache Spark Languages with basic…

Added by Kumar Chinnakali on January 8, 2016 at 9:00pm — No Comments

The Collective Definition of Data Lake by Big Data Community

The term Data Lake has been gaining popularity recently, as most enterprises have incorporated it into their analytics software. Every word and phrase used to describe the Data Lake has given us useful information about how we interpret it.

So we at dataottam decided to understand the various ways a Data Lake could be defined. We conducted a survey and found very interesting thoughts, words, and phrases used for defining Data Lake, from developers to founders, to…

Added by Kumar Chinnakali on December 30, 2015 at 4:00am — 4 Comments

Self-Learn Yourself Apache Spark in 21 Blogs – #1

We have received many requests from friends who constantly read our blogs to provide them with a complete guide to sparkle in Apache Spark. So here we have come up with a learning initiative called “Self-Learn Yourself Apache Spark in 21 Blogs”.

We have drilled down into various sources and archives to provide a perfect learning path for you to understand and excel in Apache Spark. These 21 blogs, which will be written over time, will be a complete guide for you to understand and…

Added by Kumar Chinnakali on December 30, 2015 at 3:00am — No Comments

The Collective Definition of Data Lake by Big Data Community

Yes, we are marching towards New Year 2016! What happened to the resolutions of 2014 and 2015? Quit habits? Practice habits? Road ahead? I am into all of them, but I could not keep them up. Hence this New Year 2016 has no more resolutions, just implementing the plan.

Extending that, as we know, big data is bringing more business value to the enterprise by leveraging the data lake. Data Lake... What is that? Data Lake is a loosely defined term, and the definition changes during implementation…

Added by Kumar Chinnakali on December 2, 2015 at 5:00am — No Comments

Acronyms of Big Data Analytics from A to Z!

AQL - Annotation Query Language

AOSD - Aspect-Oriented Software Development

ACID - Atomicity, Consistency, Isolation and Durability

BDA - Big Data Analytics

CQL - Cypher Query Language

CQL - Cassandra Query Language

CQL - Contextual/Common Query Language

COTS - Commercial off-the-shelf

CART - Classification and Regression Trees

CCA - Canonical Correlation Analysis

CEP - Complex Event Processing

DAD - Discover, Access, Distill

3DM -…

Added by Kumar Chinnakali on May 21, 2014 at 1:46pm — 1 Comment

© 2019 Data Science Central ®