Skip to content

New SQL RDBMS Architectures Vs Old ones Vs NoSQL

Those recent years we have seen the NoSQL initiative emerging against the so-called “old, slow and legacy” relational DBs. But today the debate is extending with a newcomer “The New RDMS architecture”. The concept is simple, the RDMS architecture was developed a long time ago, at that time computer science,  computers and processors were extremely different from what we can find today. Why are we not able to gather the recent researches in storage, distributed computing and threading system to re-design a modern RDMS?

That is the case of several open source projects such as INGRESS [1] or VoltDB [2].  I really recommend to spend time to read the excellent post VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process.

The author positions VoltDB against the NoSQL:  if you can get a relational database with all the scalable ACIDy goodness, why would you ever stoop to using a NoSQL database that might only ever be eventually reliable?  But also by tagging a traditional RDMS as legacy architecture and therefore not scalable not efficient.

Other similar initiatives try to approach the problem in a same way. Hadoop DB [3] uses normal RDMS on each cluster node but exposes a Map/reduce API based on HDFS for generating the query plan on the nodes. The user expresses the SQL statement as usual, but it will be interpreted into HDFS query by the Hive framework [6].

AsterData [4] or Greenplum [5] uses directly a variant of SQL, the SQL map reduce. In this case the SQL statement references coded-procedure which are executed according to MR at run time. As result, the RDMS is both a relational DB and a data application server.

The debate is not NoSQL Vs RDMS anymore, new approaches enforce the relational side but create new debate between RDMS generation. At the end of the day the main question remains: do all address the same issue ? Of course not.  The application architect will have to study all those different variants and define which one suit for the particular needs of the application. However, the story becomes more complex when facing computer science challenges. Do you really think that the storage system for Facebook must be the same as for a BI system?  But what happens if the BI system must deal with the same amount of data that Facebook and in near real time, is is still the same problem?

References

[1] INGRESS Vector wise technology, http://www.ingres.com/vectorwise/

[2] VoltDB, http://voltdb.com/

[3] The HadoopDB project, http://db.cs.yale.edu/hadoopdb/hadoopdb.html

[4] AsterData, http://www.asterdata.com/

[5] Greenplum, http://www.greenplum.com/

[6] The Hive project, http://hadoop.apache.org/hive/


 

Releated Posts

Evaluation of GraphRAG Strategies for Efficient Information Retrieval

Traditional RAG systems struggle to capture relationships and cross-references between different sources unless explicitly mentioned. This challenge is common in real-world scenarios, where information is often distributed and interlinked, making graphs a more effective representation. Our work provides a technical contribution through a comparative evaluation of retrieval strategies within GraphRAG, focusing on context relevance rather than abstract metrics. We aim to offer practitioners actionable insights into the retrieval component of the GraphRAG pipeline.
Read More

Flight Load Factor Predictions based on Analysis of Ticket Prices and other Factors

The ability to forecast traffic and to size the operation accordingly is a determining factor, for airports. However, to realise its full potential, it needs to be considered as part of a holistic approach, closely linked to airport planning and operations. To ensure airport resources are used efficiently, accurate information about passenger numbers and their effects on the operation is essential. Therefore, this study explores machine learning capabilities enabling predictions of aircraft load factors.
Read More