What’s holding back in-memory databases?

While in-memory database technology offers huge performance improvements over conventional databases, a number of key shortcomings have held back adoption. But that may soon change. Graeme Burton reports

In 2013, SAP co-founder and chairman Hasso Plattner played host at the launch of SAP Hana for ERP. "Thanks to the power of SQL, only one version of the suite will go forward, using not only Hana, but IBM DB2, Oracle, Microsoft SQL Server, and SAP-ASE as well."

That last database, of course, was the re-branded Sybase database, which SAP picked up in 2010 for $5.8bn. Far from pushing that database technology to customers in preference to Oracle – with whom SAP has had a fraught relationship in the past – SAP seems keener to push Hana instead.

But it isn’t just SAP that has been promoting in-memory database technology for running conventional applications: Microsoft SQL Server 2014 offers in-memory database capabilities, which Microsoft claims deliver up to 30 times the performance of conventional database technology for transactional applications – and more than 100 times for queries.

Oracle, too, offers not just TimesTen, based on the in-memory database technology of the same name it acquired in 2005, but also in-memory options for its 12c database, intended for both analytics and mixed-workload online transaction processing. It now has an in-memory version of the Oracle E-Business Suite as well, optimised for use with in-memory database technology, just like SAP.

But despite the immense performance improvements that all three software giants claim for applications running on their respective in-memory databases, customers remain relatively unmoved.

Indeed, an August survey by the American SAP User Group (ASUG) – encompassing more than 500 SAP customers and partners – indicated that users are holding back, concerned at the cost and complication of running applications on SAP Hana. "Three-quarters of the customers who said they have not yet purchased any SAP Hana products say they can’t identify a business case that justifies the cost," claimed ASUG.

Where SAP Hana had found a bedrock of adoption, the survey suggested, was in analytics – particularly in the finance department. "For those customers that had purchased a Hana product, 65 per cent of them said BW [Business Warehouse] on Hana was the one for them. Following BW was Business Suite on Hana, Hana custom analytics, Hana enterprise applications, and either Hana Enterprise Cloud or Hana Cloud Platform," said ASUG.

Respondents who hadn’t yet purchased Hana also named Business Warehouse as the most likely application they would buy if their organisation were to invest in Hana. In other words, while some early adopters were willing to bet their businesses by running operational systems on an in-memory database, most preferred to dip their toes in the water with something less mission-critical.

Key challenge

While in-memory databases offer a significant leap in performance, they also have one big disadvantage that hasn’t yet been satisfactorily overcome, argues Gartner vice president and distinguished analyst Donald Feinberg, and it is hampering more widespread adoption.

"Here’s the problem: if an in-memory database goes down you have all kinds of issues with recovery because memory is volatile – you lose everything. So applications have to be aware of that when they are doing ‘commit’; when they are doing ‘transaction control’. They have to understand how the in-memory database is working so that they can assure the consistency of a transaction," says Feinberg. "And that’s just one example."

While a high-availability environment can be set up, that’s not necessarily an easy task, he adds. "It doesn’t have to be a catastrophic failure if you have a proper high-availability set-up for the in-memory database. However, it’s not as easy to set that up for an in-memory database as it is for a database running on disk.

"When you say ‘commit’ with a disk-based system, that data is written to disk. If you have got Raid turned on you have got multiple copies of it. You are not going to lose anything. That’s where some of the procedures and the way that you do coding [in your applications] with an in-memory database has to be different," says Feinberg.

It is also why SAP, Oracle and others have had to offer specialised versions of their applications for running on in-memory databases, even though an in-memory database is based on conventional SQL, just like SAP-ASE and Oracle’s flagship database.

Indeed, while disk-based databases can use Raid to ensure that data is backed up on-the-fly, in-memory databases need alternative workarounds to provide similar assurance. "What in-memory databases have to do is either write log information out to some form of persistent storage. In the case of SAP Hana and Microsoft SQL Server, users are going to write to flash-based storage because writing to flash is much faster than writing to disk," says Feinberg.
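The persistence model Feinberg describes is, in essence, a write-ahead log: a transaction's log record must be flushed to persistent storage (flash, in the examples above) before the commit is acknowledged, so that the volatile in-memory state can be reconstructed after a crash. A minimal sketch in Python illustrates the idea – the class name and record format here are assumptions for illustration, not any vendor's actual implementation:

```python
import json
import os

class WALStore:
    """Toy in-memory key-value store with a write-ahead log for durability.

    Illustrative only: real in-memory databases batch log writes and use
    far more compact record formats.
    """

    def __init__(self, log_path):
        self.data = {}            # volatile, in-memory state
        self.log = open(log_path, "a")

    def commit(self, key, value):
        # Append the change to the log and force it to persistent
        # storage *before* acknowledging the commit -- this is what
        # makes the volatile in-memory state recoverable after a crash.
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())   # durable on flash/disk
        self.data[key] = value        # only now update memory
        return "committed"
```

The point of writing the log to flash rather than spinning disk, as Feinberg notes, is that the `fsync` on the commit path becomes cheap enough not to erase the in-memory performance advantage.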

He continues: "Data on the flash device may eventually be written out to hard disk in case something happens. And there is a concept called remote direct memory access (RDMA), which enables the in-memory database to write to a second server."

That is to say, using RDMA, anything written to one server is automatically backed up to a second server. "That could be done synchronously with little loss of speed and performance," he says, although some latency is nevertheless introduced as a result of the process. "You have to create a persistence model in case power or connectivity is lost to the server," he adds.

What is more, resilience has been built into conventional databases so that if they do go down they can simply and quickly be restarted. With an in-memory database, that procedure is not quite so straightforward.

"With a conventional database the data is on-disk. But with an in-memory database, you have to rebuild that database because the database is gone. You have to rebuild it starting with a snapshot and do all the transactions from the log file to bring it up-to-date. That can take time – half-an-hour – which might be unacceptable for some transaction processing applications," says Feinberg.
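The recovery procedure Feinberg outlines – start from the last snapshot, then replay every logged transaction since – can be sketched as follows (a simplification assuming the JSON log-record format used for illustration above; real systems replay binary logs and parallelise the work, but the replay cost is exactly why restart takes so much longer than for a disk-based database):

```python
import json

def rebuild(snapshot, log_records):
    """Rebuild in-memory state from the last snapshot plus the log.

    snapshot     -- dict of state at the time the snapshot was taken
    log_records  -- iterable of JSON strings, one per committed change,
                    in commit order
    """
    data = dict(snapshot)          # start from the snapshot...
    for line in log_records:       # ...then replay every change since
        record = json.loads(line)
        data[record["key"]] = record["value"]
    return data
```

The more transactions have accumulated since the last snapshot, the longer the replay loop runs – hence the half-hour restarts Feinberg warns about.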

Back to the nineties

That is why SAP Hana and other in-memory databases are predominantly used in non-mission-critical systems. SAP Hana, in particular, can nevertheless boast some big-name adopters, including such giants as Unilever and Vodafone.

For Unilever, SAP Hana is used for analytics, attached to each of its four instances of SAP enterprise resource planning software that services its 200 subsidiaries. "The [SAP Hana] initiative doesn’t replace, but rather complements, our important global enterprise data warehouse strategy for reporting and analytics, where ERP data is extracted, transformed, and loaded, as well as combined with data external to our ERP systems," says Marc Bechet, global ERP vice president at Unilever.

Vodafone, meanwhile, is using SAP Hana to cut the length of time it takes to tabulate end-of-quarter financial figures. It used SAP Hana to conduct calculations that had previously been performed by a custom report within the SAP NetWeaver Business Warehouse application. In the process, it claims that it was able to cut data extract, processing, and load times from two-and-a-half hours down to just six minutes – a reduction of 96 per cent.

However, the real proof that SAP Hana – and in-memory database processing – has gone mainstream will be when customers start using it for operational systems.

This shift will occur, believes Feinberg, because the economics are slowly shifting as the price per gigabyte of memory continues to fall, while the performance benefits become more and more compelling – provided the resilience issues are overcome. "Instead of buying a server with 64 gigabytes of memory, organisations will be buying servers with one terabyte of memory and the difference in cost could be three or four times," he says.

But because applications will run faster, users can put more applications on one server, or consolidate the servers required to run, for example, their enterprise applications. This won’t just cut the amount of space required in the data centre, but also the cost of cooling, hardware maintenance and multiple other overheads.

Already, these savings in ancillary costs are starting to tip the economic balance. "I honestly believe that the total cost of ownership with an in-memory system is going to be cheaper over a three-to-five year period than with a conventional database," he adds.

Not only that, this is already driving the hardware market back to the 1990s. "If you go back 15 years, we were creating very large, single servers made by companies like Pyramid, Sequent, IBM and Hewlett-Packard with 128 cores. That technology was based on symmetric multi-processing (SMP)," he continues.

This hardware ultimately proved costly to buy and maintain at the time, and the market shifted towards clustering. "The issue with a cluster is that the software that has to coordinate [operations] across all of the servers is really complicated for the database and operating system vendors. It would be much easier if you could run it all on a single server and we are actually going back that way," says Feinberg.

HP, for example, recently released "Hana Hawk", a single server with 12TB of memory, while Oracle’s M6 is a 32TB single server. The IBM Power-8 is smaller, but will soon come in versions with up to 128 cores and 128 or more terabytes of memory.

"So we are starting to go back to single servers. Why? They are easier to maintain; they are cheaper to maintain – because instead of 10 servers I’ve got one – and the software that runs on it does not have to carry the overhead of coordinating multiple servers in a cluster. My prediction is that we are cycling back from horizontal, clustered servers to single servers," believes Feinberg.