Skip to content

New SQL RDBMS Architectures Vs Old ones Vs NoSQL

Those recent years we have seen the NoSQL initiative emerging against the so-called “old, slow and legacy” relational DBs. But today the debate is extending with a newcomer “The New RDMS architecture”. The concept is simple, the RDMS architecture was developed a long time ago, at that time computer science,  computers and processors were extremely different from what we can find today. Why are we not able to gather the recent researches in storage, distributed computing and threading system to re-design a modern RDMS?

That is the case of several open source projects such as INGRESS [1] or VoltDB [2].  I really recommend to spend time to read the excellent post VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process.

The author positions VoltDB against the NoSQL:  if you can get a relational database with all the scalable ACIDy goodness, why would you ever stoop to using a NoSQL database that might only ever be eventually reliable?  But also by tagging a traditional RDMS as legacy architecture and therefore not scalable not efficient.

Other similar initiatives try to approach the problem in a same way. Hadoop DB [3] uses normal RDMS on each cluster node but exposes a Map/reduce API based on HDFS for generating the query plan on the nodes. The user expresses the SQL statement as usual, but it will be interpreted into HDFS query by the Hive framework [6].

AsterData [4] or Greenplum [5] uses directly a variant of SQL, the SQL map reduce. In this case the SQL statement references coded-procedure which are executed according to MR at run time. As result, the RDMS is both a relational DB and a data application server.

The debate is not NoSQL Vs RDMS anymore, new approaches enforce the relational side but create new debate between RDMS generation. At the end of the day the main question remains: do all address the same issue ? Of course not.  The application architect will have to study all those different variants and define which one suit for the particular needs of the application. However, the story becomes more complex when facing computer science challenges. Do you really think that the storage system for Facebook must be the same as for a BI system?  But what happens if the BI system must deal with the same amount of data that Facebook and in near real time, is is still the same problem?

References

[1] INGRESS Vector wise technology, http://www.ingres.com/vectorwise/

[2] VoltDB, http://voltdb.com/

[3] The HadoopDB project, http://db.cs.yale.edu/hadoopdb/hadoopdb.html

[4] AsterData, http://www.asterdata.com/

[5] Greenplum, http://www.greenplum.com/

[6] The Hive project, http://hadoop.apache.org/hive/


 

Releated Posts

Privacy Enhancing Technologies 2024: A Summary

For Large Language Models (LLMs), Azure confidential computing offers TEEs to protect data integrity throughout various stages of the LLM lifecycle, including prompts, fine-tuning, and inference. This ensures that all
Read More

IEEE Big Data 2023 – A Summary

Our CTO, Sabri Skhiri, recently travelled to Sorrento for IEEE Big Data 2023. In this article, Sabri explores for you the various keynotes and talks that took place during the
Read More