FOSDEM 2010: The Rise of the NoSQL Initiative

Data architecture, Engineering

February 8, 2010

What’s NoSQL?

Even if the name itself says little, NoSQL designates a new generation of data stores, many of them built around key/value pairs. The movement is gaining both popularity and maturity: FOSDEM dedicated a full day and a developer room to the subject. Wikipedia describes the movement as follows: “NoSQL is an umbrella term for a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees. Data stores that fall under this term may not require fixed table schemas, and usually avoid join operations. The term was first popularised in early 2009. Trends in computer architectures are pressing databases in a direction that requires horizontal scalability. NoSQL-style data stores attempt to address this requirement. Prominent closed-source examples are Google’s BigTable and Amazon’s Dynamo. Several open-source variants exist including Facebook’s Cassandra, Apache HBase, LinkedIn’s Project Voldemort and many others.”

Check up before swapping out your relational DB

These projects look promising and efficient. However, before adopting such a technology, you must make sure it fits your needs. A first analysis should therefore cover the following points:

Consistency model analysis: most of these projects implement an eventually consistent model [1], which requires a repair mechanism. Does it fit your concurrency policy? For instance, it is not directly compatible with multiversion concurrency control (MVCC) combined with optimistic locking, as used by JBoss Mobicents JSLEE. Who must provide the repair mechanism: the storage engine or the clients? Does the storage system you want to use give you the flexibility to choose? In addition, there are a number of practical refinements of the eventual consistency model, such as session-level consistency and monotonic reads, which give the client application stronger guarantees.
Scalability model analysis: you must analyze how these storage systems scale in terms of indexing, shard distribution, configuration and access, request routing, optimization, etc. For instance, MongoDB [2] does not use consistent hashing to route requests and locate shards. Instead, it uses an internal router and location tables. The advantage is flexibility: you can use any data you want as an index. The catch is how it uses memory. If you have little data, it keeps the data itself in memory; if you have more, it keeps only the indexes; and if you have a huge volume, it keeps only the portion of the indexes you need. That makes MongoDB very useful as a web application back end, but inefficient for applications that store millions of records and access them randomly.
Transactional model: how are transactions managed, and is there a safe mode that does not hurt performance?
Client API and query model: for instance, HBase [3] provides a very simple interface without a real query model, while MongoDB offers a richer query model that can embed JavaScript [4].
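
To make the consistency questions above more concrete, here is a minimal Python sketch of a repair mechanism provided by the client (read repair) and of a session that enforces monotonic reads. Everything in it is an assumption made for illustration: the Replica, read_with_repair and Session names are hypothetical, and a single integer version stands in for the vector clocks or timestamps that real stores typically use. It is not the API of any of the projects mentioned here.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica storing key -> (version, value). Hypothetical, for illustration."""
    name: str
    data: dict = field(default_factory=dict)

    def get(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self.data.get(key, (0, None))

    def put(self, key, version, value):
        # Keep only the newest version; stale writes are ignored.
        if version > self.get(key)[0]:
            self.data[key] = (version, value)


def read_with_repair(replicas, key):
    """Client-side read repair: return the newest copy and update stale replicas."""
    version, value = max((r.get(key) for r in replicas), key=lambda vv: vv[0])
    for replica in replicas:
        if replica.get(key)[0] < version:
            replica.put(key, version, value)
    return version, value


class Session:
    """Monotonic reads: never return a version older than one already seen."""

    def __init__(self):
        self._seen = {}

    def read(self, replica, key):
        version, value = replica.get(key)
        if version < self._seen.get(key, 0):
            raise RuntimeError(f"{replica.name} is stale for {key!r}; retry elsewhere")
        self._seen[key] = version
        return value


# Usage: replica b misses an update, then is repaired by a client read.
a, b = Replica("a"), Replica("b")
a.put("cart", 1, ["book"])
b.put("cart", 1, ["book"])
a.put("cart", 2, ["book", "pen"])      # b has not received this update yet

session = Session()
session.read(a, "cart")                # the session has now seen version 2
# session.read(b, "cart") would raise here, because b still holds version 1
read_with_repair([a, b], "cart")       # brings b up to version 2
session.read(b, "cart")                # now succeeds
```

In this sketch the client carries the repair logic. A store that repairs on the server side would move read_with_repair behind its own API, which is exactly the design choice the first question in the list asks about.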

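The two routing styles contrasted in the scalability point can be sketched as follows. This is a simplified illustration under stated assumptions, not MongoDB's actual implementation: ConsistentHashRing and RangeRouter are hypothetical names, MD5 is used only as a convenient stable hash, and the location table is a plain sorted list of key ranges.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Routes a key to the first node found clockwise from the key's hash."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several virtual points to even out the distribution.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, key: str) -> str:
        # Wrap around to the first point when the hash is past the last one.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


class RangeRouter:
    """Routes a key by looking it up in an explicit table of key ranges."""

    def __init__(self, table):
        # table: list of (inclusive lower bound, shard), sorted by lower bound.
        self._bounds = [lower for lower, _ in table]
        self._shards = [shard for _, shard in table]

    def route(self, key) -> str:
        idx = bisect.bisect_right(self._bounds, key) - 1
        if idx < 0:
            raise KeyError(f"no shard covers key {key!r}")
        return self._shards[idx]


# Usage: the ring picks a node from the hash alone, while the router
# follows ranges defined over whatever field was chosen as the shard key.
ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
router = RangeRouter([("", "shard-1"), ("h", "shard-2"), ("q", "shard-3")])
print(ring.route("user:42"))
print(router.route("alice"), router.route("mike"), router.route("zoe"))
```

The ring needs no lookup table, and removing a node only moves the keys that node owned. The range router needs a table that must be stored and kept up to date, but in exchange the ranges can be defined over any field, which is the flexibility described above.
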
NoSQL storage is very promising, but you should carefully analyze the impact of each architecture on your applications before swapping out your RDBMS.

References:

[1] Werner Vogels, Eventually consistent, December 2008, http://www.allthingsdistributed.com/2008/12/eventually_consistent.html

[2] The MongoDB project, http://www.mongodb.org

[3] The HBase project, http://hadoop.apache.org/hbase/

[4] The jQuery project, http://jquery.com/
