Open Source tools for Big Data


Selecting a few out of the many Open Source projects was not easy. My objective was to choose the ones that best fit Big Data’s needs. What has changed in the world of Open Source is that the big players have become stakeholders: IBM’s alliance with Cloud Foundry, Microsoft providing a development platform for Hadoop, Dell’s OpenStack-Powered Cloud Solution, VMware and EMC partnering on Cloud, and Oracle releasing its NoSQL database as Open Source.

“If you can’t beat them, join them.” History has vindicated the Open Source visionaries and advocates.

Hadoop Distributions


Cloud Operating System

Cloud Foundry -- By VMware

OpenStack -- Backed by worldwide participation, including well-known companies


Fusion-io -- Not open source, but very supportive of Open Source projects; enables Flash-aware applications.

Development Platforms and Tools

REEF -- Microsoft's Hadoop development platform

Lingual -- By Concurrent

Pattern -- By Concurrent

Python -- Awesome programming language

Mahout -- Machine learning library

Impala -- SQL query engine for Hadoop, by Cloudera

R -- MVP among statistical tools

Storm -- Stream processing by Twitter

LucidWorks -- Search, based on Apache Solr

Giraph -- Graph processing by Facebook

NoSQL Databases

MongoDB, Cassandra, HBase

SQL Databases

MySQL -- Owned by Oracle

MariaDB -- Partnered with SkySQL

PostgreSQL -- Object-relational database

TokuDB -- Improves RDBMS performance

Server Operating Systems

Red Hat -- The de facto OS for Hadoop servers

BI, Data Integration, and Analytics




See Big Data Studio
