In chemistry, we have strong and weak acids, and strong and weak bases as well. In Relational Database Management Systems (RDBMS), ACID stands for Atomicity, Consistency, Isolation and Durability. With the rise of mobile devices, the volume of ACID transactions handled by RDBMS has grown to big data scale. SQL stands for Structured Query Language, but the term is commonly used to refer to traditional RDBMS. A traditional RDBMS was not designed to be distributed, and it is optimized for space rather than for speed of access. In an RDBMS we have only strong ACID, and there is hardly any mechanism to dilute the ACIDity. This strong ACID has become a bottleneck in scaling RDBMS to handle large volumes of transactional data. Before the era of web transactions, we had transaction processing under Unix, extended for distributed operation to handle transactions spanning heterogeneous databases. These systems handled transactions originating from a limited number of dedicated POS terminals attached to servers. Today's systems are distributed, and transactions originate from all sorts of devices located anywhere in the world. These transactions are then committed in a database at a centralized remote location.
In the present era of web transactions, defining transaction boundaries has become a challenge. When the begin and commit operations of a transaction are executed on systems at remote geographic locations, it is not viable to define the transaction boundaries on those devices. If we did, it would be difficult to keep track of the outstanding transactions waiting to commit, and their number could exceed the capacity of even the latest servers. To overcome this problem, we can reduce the geographical distance between the transaction boundaries by bringing only a portion of the transaction under the scope of ACID. This portion is usually defined on middleware running on servers in a single data center, which minimizes the number of outstanding transactions waiting to commit. Only the success or failure of the transaction is communicated to users on the remote devices.
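To make the idea of a short, server-side transaction boundary concrete, here is a minimal sketch in Python using the standard `sqlite3` module as a stand-in for the RDBMS. The `accounts` table, the `transfer` function and the helper names are assumptions made up for illustration; they are not taken from any particular middleware product. The begin and commit both happen inside `transfer`, and the remote caller only ever learns whether it succeeded.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one ACID transaction.

    Returns True if the transaction committed and False if it was rolled
    back. This flag is the only thing the remote device gets to see.
    """
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict, for inspection."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A failed transfer leaves both balances untouched, because the debit that was already applied gets rolled back together with the rest of the transaction. That is atomicity at work, and it all happens in one place instead of being stretched across the network.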
Instead of adhering to strong ACID, can we dilute the ACID in RDBMS? This is not practical, because millions of applications are built on top of ACID-based RDBMS. This led IT professionals to think about a new generation of databases, called NoSQL databases, built on the BASE property, which allows dilution to the desired level. BASE stands for Basically Available, Soft state, Eventually consistent. NoSQL database technologies are often a better match for the needs of modern interactive web-based software systems. Software professionals can select the desired level of BASicity to match the requirements of the application. Basic availability is achieved by distributing copies of the data across a cluster. Soft state means that inconsistencies or stale answers are to be expected. Eventual consistency means that if no new updates arrive for long enough, all earlier updates propagate through the distributed system and the data on every node eventually becomes the same, i.e. consistent.
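The soft state and eventual consistency described above can be seen in a toy simulation. This is only a sketch under simple assumptions: the `Replica` and `Cluster` classes are invented for illustration, a write is acknowledged after reaching a single replica, and an explicit `propagate` call stands in for the background replication a real NoSQL database would perform.

```python
class Replica:
    """One copy of the data: key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        _version, value = self.data.get(key, (0, None))
        return value


class Cluster:
    """A toy cluster that replicates writes lazily."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # queued updates: (replica_index, key, version, value)
        self.version = 0

    def write(self, key, value):
        """Apply the write to replica 0 only and queue it for the others."""
        self.version += 1
        self.replicas[0].data[key] = (self.version, value)
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, self.version, value))

    def propagate(self):
        """Deliver every queued update, keeping the newest version per key."""
        for i, key, version, value in self.pending:
            current_version, _ = self.replicas[i].data.get(key, (0, None))
            if version > current_version:
                self.replicas[i].data[key] = (version, value)
        self.pending.clear()

    def read_all(self, key):
        """Read the key from every replica, in order."""
        return [replica.read(key) for replica in self.replicas]
```

Right after a `write`, `read_all` returns the new value from the first replica and stale answers from the others, which is the soft state. Once `propagate` has run and no further writes arrive, every replica returns the same value.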
The roots of BASE lie in the CAP theorem put forward by Eric Brewer, a professor at the University of California, Berkeley. The CAP theorem states that a distributed data store can guarantee at most two of three attributes at the same time: Consistency, Availability and Partition tolerance. In NoSQL databases, consistency is not guaranteed immediately. The only assurance is that the update or transaction will be eventually consistent. Tunable consistency levels provided by NoSQL databases like Cassandra let us select the required BASicity for each operation, and several such levels are available. This lets NoSQL databases scale easily with the volume of transactions by adjusting the BASicity, i.e. relaxing the consistency level. So database management systems have transformed from ACID to BASE, and the root cause of this transformation is the need for volume scalability arising from big data. A strong ACID reacts fast, and a read after a write always gives consistent results. The speed of a BASE reaction depends on its BASicity, and a read immediately after a write may give inconsistent results. But over a period of time, the BASE completes its reaction and we start to get consistent results. So BASE guarantees eventual consistency, and the time taken for the transition from an inconsistent to a consistent state depends on the consistency level selected at the design stage of the application.
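One way to reason about tunable consistency is the replica overlap rule: with N replicas, a read is sure to see the latest acknowledged write when the number of replicas that acknowledged the write (W) plus the number consulted by the read (R) exceeds N. The sketch below applies that rule to three level names modelled on Cassandra's ONE, QUORUM and ALL. The function names are hypothetical, and this is a simplified calculation rather than a call into any database driver.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # a strict majority
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def read_sees_latest_write(
    write_level: str, read_level: str, replication_factor: int
) -> bool:
    """True when the read and write replica sets must overlap (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, writing and reading at QUORUM gives 2 + 2 > 3, so a read after a write is consistent. Writing and reading at ONE gives 1 + 1, which is not greater than 3, so the application gets a faster, more BASic reaction and accepts that an immediate read may be stale.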
We live in an era of transition from SQL to NoSQL. In the past, we had hierarchical databases. Like hierarchical databases, SQL databases may also face extinction and become a footnote in history. Your survival may depend on your NoSQL knowledge, so it is worth dedicating more time to understanding it. I hope you have enjoyed the chemistry behind the transition from ACID to BASE, i.e. SQL to NoSQL.
See you next time.
Machine Learning Evangelist