It has not been long since the holy war between SQL and NoSQL database technologies faded, and now we see a new contender, NewSQL, rising. What is it? Will it cause another round of the war?
Recently at Bazaarvoice we hosted an informational session on VoltDB, one of the better-known NewSQL solutions, attended by several engineers and technical managers from our Austin and San Francisco offices. The question we needed answered: what is VoltDB, and why might it be an interesting datastore technology for us?
In short, it may be very good for real-time and near-real-time analytics where SQL and ACID compliance are desirable. Personalized ad serving, real-time measurement of marketing campaign effectiveness, and electronic trading systems are some of the reference applications that VoltDB provides.
VoltDB is an in-memory database, which makes it extremely fast. However, this is only a small part of the story. Besides residing in memory, VoltDB has several performance-improving architectural solutions based on research by well-known database technologists, including the famous Michael Stonebraker, who was involved in the creation of Ingres, Postgres, and Vertica.
The creators of VoltDB wanted to preserve all the good features of a traditional RDBMS, such as SQL, ACID compliance, and data integrity, but they also wanted to drastically improve performance and scalability. All modern commercial and open-source RDBMSs are built on the same principles, which were devised more than 40 years ago for an era of small memory and slow disks. The researchers analyzed the bottlenecks of a traditional RDBMS and found that under high load about 88% of the server capacity is wasted on traditional RDBMS overheads, and only about 12% is used for actual useful work.
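To put the 88% / 12% split in perspective, here is a back-of-the-envelope calculation. It is only a sketch: it assumes throughput scales inversely with the capacity spent per transaction, and the function name `speedup` and the "fraction of overhead removed" parameter are ours, not something from the research or from VoltDB.

```python
# Capacity split quoted above for a traditional RDBMS under high load.
OVERHEAD_SHARE = 0.88  # buffer management, locking, latching, logging, ...
USEFUL_SHARE = 0.12    # actual useful work


def speedup(overhead_removed: float) -> float:
    """Idealized throughput gain when a fraction of the overhead is removed.

    Assumption (ours): throughput is inversely proportional to the capacity
    spent per transaction, and the useful work itself does not change.
    """
    if not 0.0 <= overhead_removed <= 1.0:
        raise ValueError("overhead_removed must be between 0 and 1")
    remaining = USEFUL_SHARE + (1.0 - overhead_removed) * OVERHEAD_SHARE
    return 1.0 / remaining


if __name__ == "__main__":
    print(f"half of the overhead removed: {speedup(0.5):.1f}x")  # 1.8x
    print(f"all of the overhead removed:  {speedup(1.0):.1f}x")  # 8.3x
```

Under these assumptions, removing the overhead entirely would cap the gain on the same hardware at roughly 8x. This is a ceiling implied by the quoted percentages, not a measured result.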
VoltDB’s architectural solutions eliminate the traditional RDBMS overheads:
- Computing power is brought close to the data. Data partitions have affinity to specific memory regions and CPU cores (virtual nodes) in a shared-nothing cluster;
- Data is located in main memory, which eliminates buffer management overhead, let alone access to painfully slow disks;
- Single-threaded virtual nodes operate on their partitions autonomously, which eliminates locking and latching overhead;
- Durability comes from a combination of continuous snapshots and command logging, rather than from writing database blocks and a transaction log to disk, which drastically reduces logging overhead.
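The partitioning and single-threaded execution ideas in the list above can be sketched in a few lines of Python. This is a toy model, not VoltDB's implementation or API: the `Partition` and `Cluster` classes, the CRC32 routing, and the `add` helper are all hypothetical names chosen for illustration.

```python
import queue
import threading
import zlib
from concurrent.futures import Future


class Partition:
    """One 'virtual node': a private slice of data served by exactly one thread."""

    def __init__(self, pid: int) -> None:
        self.pid = pid
        self.rows: dict[str, int] = {}  # only this partition's thread touches it
        self.inbox: queue.Queue = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self) -> None:
        # Transactions run one at a time, to completion, so the data itself
        # needs no locks or latches.
        while True:
            item = self.inbox.get()
            if item is None:  # shutdown signal
                break
            txn, future = item
            try:
                future.set_result(txn(self.rows))
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, txn) -> Future:
        future: Future = Future()
        self.inbox.put((txn, future))
        return future

    def stop(self) -> None:
        self.inbox.put(None)
        self.thread.join()


class Cluster:
    """Routes each transaction to the partition that owns its key."""

    def __init__(self, n_partitions: int) -> None:
        self.partitions = [Partition(i) for i in range(n_partitions)]

    def partition_for(self, key: str) -> Partition:
        # Deterministic hash: a given key always lands on the same partition.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def execute(self, key: str, txn):
        return self.partition_for(key).submit(txn).result()

    def shutdown(self) -> None:
        for partition in self.partitions:
            partition.stop()


def add(key: str, amount: int):
    """Build a single-partition transaction that increments one counter."""
    def txn(rows: dict) -> int:
        rows[key] = rows.get(key, 0) + amount
        return rows[key]
    return txn


if __name__ == "__main__":
    cluster = Cluster(n_partitions=4)
    for _ in range(3):
        cluster.execute("account-1", add("account-1", 10))
    print(cluster.execute("account-1", add("account-1", 0)))
    cluster.shutdown()
```

The only shared structure is each partition's inbox queue. The data is never accessed by more than one thread, which is the property that makes locks and latches unnecessary in this model. Transactions that span several partitions are deliberately left out of the sketch.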
These architectural solutions allow combining all the advantages of a traditional RDBMS with scalability features usually associated only with NoSQL databases: automatic sharding across a shared-nothing cluster, eliminating many overheads, automatic replication and log-less durability for high availability. With these features, VoltDB claims to be one of the fastest databases on the market today.
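The durability approach mentioned earlier, snapshots combined with command logging, can also be illustrated with a small sketch. It assumes that commands are deterministic, so replaying them reproduces the same state. The class `CommandLoggedStore`, the `OPS` table, and the in-memory lists standing in for disk are hypothetical and do not reflect VoltDB's actual file formats or API.

```python
import copy

# Deterministic operations: replaying the same commands gives the same state.
OPS = {
    "set": lambda old, value: value,
    "add": lambda old, value: (old or 0) + value,
}


def apply_command(state: dict, command: tuple) -> None:
    op, key, value = command
    state[key] = OPS[op](state.get(key), value)


class CommandLoggedStore:
    """In-memory store made recoverable by a snapshot plus a command log."""

    def __init__(self) -> None:
        self.state: dict = {}
        self.command_log: list = []  # stand-in for an append-only file on disk
        self.snapshot = ({}, 0)      # (copy of state, number of commands covered)

    def execute(self, command: tuple) -> None:
        if command[0] not in OPS:
            raise ValueError(f"unknown operation: {command[0]}")
        # Log the invocation itself, not the data blocks it modifies.
        self.command_log.append(command)
        apply_command(self.state, command)

    def take_snapshot(self) -> None:
        self.snapshot = (copy.deepcopy(self.state), len(self.command_log))

    def recover(self) -> dict:
        """Rebuild the state: load the snapshot, then replay later commands."""
        snapshot_state, covered = self.snapshot
        state = copy.deepcopy(snapshot_state)
        for command in self.command_log[covered:]:
            apply_command(state, command)
        return state


if __name__ == "__main__":
    store = CommandLoggedStore()
    store.execute(("set", "x", 10))
    store.execute(("add", "x", 5))
    store.take_snapshot()
    store.execute(("add", "x", 1))
    store.execute(("set", "y", 7))
    print(store.recover() == store.state)
```

Each log record here is a short command rather than a set of modified blocks, which is why this style of logging writes far less than a traditional transaction log. A snapshot bounds how many commands have to be replayed after a failure.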
VoltDB’s impressive performance is illustrated by the results of the TPC-C-like benchmark below, in which VoltDB and a well-known OLTP DBMS were compared running the same test on identical hardware (Dell R610, 2x 2.66 GHz Quad-Core Xeon 5550 with 12x 4 GB (48 GB) DDR3-1333 Registered ECC DIMMs, 3x 72 GB 15K RPM 2.5-inch Enterprise SAS 6 Gbps drives):
So, will the rising NewSQL technology cause another religious database war? Probably not. VoltDB positions its product as a niche database that does a few things really well. It doesn’t try to be a one-size-fits-all database, and VoltDB’s philosophy is “an organization should use a few datastore technologies, using each for the case where it plays the best.” For example, you cannot use VoltDB if your data does not fit into the combined memory of your cluster. Long-running transactions are also not supported by VoltDB.
Hopefully, if a team needs a very fast yet consistent, ACID- and SQL-compliant database for a new project, they will consider VoltDB as an option.