Hadoop vs. NoSQL vs. SQL vs. NewSQL by Example




Although mainframe hierarchical databases are still very much alive today, relational databases (RDBMS, queried with SQL) have dominated the database market, and they have done a lot of good. They are the reason the money we deposit doesn’t go to someone else’s account, our airline reservation guarantees us a seat on the plane, and we are not blamed for something we didn’t do. An RDBMS owes its data integrity to its adherence to the ACID (atomicity, consistency, isolation, and durability) principles. RDBMS technology dates back to the 1970s.
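To make the bank deposit example concrete, here is a minimal sketch of atomicity using Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are made up for illustration; a production banking system would look very different. The point is only that the credit and the debit either both happen or neither does.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit has something to roll back.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical schema: a balance may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

print(transfer(conn, "alice", "bob", 30))   # a valid transfer
print(transfer(conn, "alice", "bob", 500))  # would overdraw alice, so it is rolled back
print(balances(conn))
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterwards, because the whole transaction is rolled back as a unit.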

So what changed? Web technology started the revolution. Today, many people shop on Amazon, and the traditional RDBMS was not designed to handle the number of transactions that take place there every second. The primary constraining factor was the RDBMS schema.

NoSQL databases offered an alternative by eliminating schemas, at the expense of relaxing the ACID principles. Some NoSQL vendors have made great strides towards addressing the resulting consistency problem; the approach is called eventual consistency, in which replicas may disagree briefly but converge once updates stop arriving. As for NewSQL, why not create a new RDBMS, minus the shortcomings of the RDBMS, utilizing modern programming languages and technology? That is how some of the NewSQL vendors came to life. Other NewSQL companies created augmented solutions for MySQL.
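Eventual consistency is easier to see in a toy model. The sketch below is not how Couchbase or any other vendor implements it; the Replica class, the version numbers, and the sync function are all assumptions made for illustration. Two replicas accept writes independently, a read can return stale data for a while, and a later synchronization pass makes both replicas agree (here with a simple last-write-wins rule).

```python
class Replica:
    """A toy key-value replica that stores a (version, value) pair per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Last-write-wins: keep whichever value carries the highest version.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions so the two replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


east, west = Replica(), Replica()
east.write("player:42:score", 100, version=1)
west.write("player:42:score", 250, version=2)  # a later update lands on another node

stale = east.read("player:42:score")  # east has not seen version 2 yet
sync(east, west)
fresh = east.read("player:42:score")  # after syncing, both replicas agree
```

Between the second write and the sync, a client talking to east sees the old score. That window of disagreement is the price paid for letting each node accept writes without coordinating with the others.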

Hadoop is a different animal altogether. It’s a file system and not a database. Hadoop’s roots are in internet search engines. Although its associated projects (HBase, MapReduce, Hive, Pig, ZooKeeper) have turned it into a mighty database, Hadoop itself is a scalable, inexpensive, fault-tolerant distributed file system. At this point in time, Hadoop’s specialty is batch processing, which makes it suitable for data analytics.

Now let’s start with our example. My imaginary video game company recently put its most popular game online, after ten years in business shipping games to retailers around the globe. Our customer information is currently stored in a SQL Server database, and we have been happy with it. However, since players started playing the game online, the database has not been able to keep up and users are experiencing delays. As our user base grows rapidly, we spend money on more and more hardware and software, but to no avail. Losing customers is our primary concern. Where do we go from here?

We decide to run our online game application on NoSQL and NewSQL simultaneously by segmenting our online user base. Our objective is to find the optimal solution. The IT department selects Couchbase for NoSQL (document-oriented, like MongoDB) and VoltDB for NewSQL.

Couchbase is open source, has an integrated caching mechanism, and can automatically spread data across multiple nodes. VoltDB is an ACID-compliant RDBMS that is fault tolerant, scales horizontally, and has a shared-nothing, in-memory architecture. In the end, both systems are able to deliver. I won’t go into the intricacies of each solution because this is only an example; comparing these technologies in the real world would require testing, benchmarking, and in-depth analysis.
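Spreading data across nodes so that each node owns its own slice (the idea behind horizontal scaling and shared-nothing designs) can be sketched with simple hash partitioning. This is a generic illustration only; the node names, the keys, and the node_for and partition helpers are hypothetical, and real products use more elaborate placement schemes than a plain modulo.

```python
import hashlib

def node_for(key, nodes):
    """Pick the node that owns `key` by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def partition(keys, nodes):
    """Group keys by owning node; every key lands on exactly one node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement

# Hypothetical cluster and player keys.
nodes = ["node-a", "node-b", "node-c"]
keys = [f"player:{i}" for i in range(1, 13)]

placement = partition(keys, nodes)
for node, owned in placement.items():
    print(node, owned)
```

Because the owner is computed from the key alone, any client can find the right node without asking a central coordinator. The weakness of a plain modulo is that adding a node moves most keys, which is one reason real systems use other schemes.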

Now that the online operations are running smoothly, we want to analyze our data to find out where we should expand our territory. Which countries are the most suitable for marketing our products? To find out, we need to merge the SQL Server customer data warehouse with the data from the online gaming database and run analytical reports. That’s where Hadoop comes in. We configure a Hadoop system and merge the data from the two data sources. Next, we use Hadoop’s MapReduce in conjunction with the open source R programming language to generate the analytics reports.
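The MapReduce idea itself fits in a few lines. The sketch below runs in plain Python on one machine and does not use Hadoop or R at all; the record layout and the sample data are invented for illustration. It merges the two sources and counts customers per country with a map step, a shuffle step, and a reduce step, which is the same shape a Hadoop job would have across many machines.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (country, 1) pair for every customer record."""
    for record in records:
        yield record["country"], 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the values of each group into one count per country."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical extracts from the two data sources.
warehouse_customers = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
]
online_players = [
    {"id": 101, "country": "US"},
    {"id": 102, "country": "JP"},
    {"id": 103, "country": "US"},
]

merged = warehouse_customers + online_players
counts = reduce_phase(shuffle(map_phase(merged)))
print(counts)
```

On a real cluster the map and reduce functions run in parallel on different nodes and the framework handles the shuffle, but the logic each function expresses stays this small.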

See Big Data Studio


Comment by Priyanka Jain on August 20, 2015 at 10:00pm

Hi Fari Payandeh,

Big data has great capacity to benefit organizations in every kind of industry, anywhere in the world. It is useful for decision-making and helps improve an organization's financial position, and to gain those benefits organizations are adopting technology with ever higher performance.


Priyanka Jain - OLAP on Hadoop

Comment by Riya Saxena on April 16, 2015 at 11:44pm

Great post! Thanks for sharing! Each of these technologies is closely associated with big data, so there’s overlap in terms of what they are designed to do. For example, they’re great for managing large and rapidly growing data sets, and they’re great for handling a variety of data formats, even if those formats change over time. More at www.youtube.com/watch?v=1jMR4cHBwZE

Comment by Fari Payandeh on September 24, 2013 at 1:18am

Hi Glinca,

Not a problem. I know what you are asking now. If your concern is indexes, VoltDB uses tree indexes, and I'd think they are the same as or similar to RDBMS B+ tree indexes. The problem with VoltDB is that it doesn't enforce data integrity at the database layer because, unlike an RDBMS, it doesn't support referential integrity as part of its engine. The responsibility for ensuring data integrity has been shifted to developers, and this is in my view a problem with VoltDB. So, if that is part of your requirements, then I would suggest that you stay with an RDBMS like MySQL, SQL Server, Oracle, ... because data integrity is guaranteed as long as you configure it correctly (establishing foreign key to primary key relationships).

Comment by Fari Payandeh on September 23, 2013 at 12:36pm


I studied VoltDB's architecture and I can engage now, but I need to know what you mean by "your OldSQL and in VoldDB and in specilal the same number and type of indexes? " In other words, what did you read about indexes that made you pose the question?

Comment by Fari Payandeh on September 22, 2013 at 6:05am


I don't know the architecture of these systems well enough to answer the question. My objective was to give a high-level overview of these technologies because there seems to be a lot of confusion out there. I need to learn a lot more... my to-do list keeps growing...

Comment by Fari Payandeh on September 20, 2013 at 12:49pm


You are correct. I just didn't want to appear to be advertising for VoltDB, but that's my personal preference.

Comment by CARLOS ARAQUE on September 13, 2013 at 7:04pm

Thanks, article

Comment by Fari Payandeh on September 11, 2013 at 12:49pm

Thank you Adi!

I'm glad it was of value to you. I appreciate it.


Comment by Adi P on September 11, 2013 at 9:51am

Thank you for a great summary and personal experience example!

