Periodic Reporting for period 1 - CloudDBAppliance (European Cloud In-Memory Database Appliance with Predictable Performance for Critical Applications)

Summary

The vision of the CloudDBAppliance project is to produce a cloud database appliance able to reproduce the performance and resilience of mainframes in cloud data centers. Today, mainframes only support transactional operational workloads. For analytical workloads and business analytics, a different kind of system is used, such as data warehouses and Hadoop data lakes. This approach is expensive and complex, since it requires copying the data from the operational database into the data warehouse. CloudDBAppliance works on an appliance with both operational and analytical capabilities, able to perform enhanced analytics directly over the operational data.

The CloudDBAppliance project aims to design a European Cloud Database Appliance that provides a Database as a Service able to match the predictable performance, robustness and reliability of on-premises architectures, such as those based on mainframes. CloudDBAppliance will enhance cloud architectures and promote the use of cloud technology by providing the robustness, reliability and performance required by applications currently considered too critical to be deployed on existing clouds.

A cloud database appliance shall be delivered, featuring:
1. A vertically scalable in-memory operational database for time-critical applications that can process high update workloads, such as the ones processed by banks or telcos, as well as a fast analytical engine delivering real-time data, to answer analytical queries over operational data in an online manner.
2. An operational Hadoop data lake that enables the use of Hadoop analytics frameworks over operational data.
3. A hardware appliance that leverages the next generation of hardware to be designed by Bull, the main European HPC hardware provider. This hardware will be a many-core architecture with 120 cores and 32 TB of main memory. This hardware will go beyond that of mainframes by enabling in-memory processing on big data.
Both the operational database and the in-memory analytics engine will be optimized to fully exploit this hardware and deliver predictable performance. CloudDBAppliance will deliver high resilience by withstanding catastrophic cloud data center failures (e.g. fire or natural disaster), keeping redundant copies of the data across appliances located in different cloud data centers.

Work performed

The work performed from the start of the project to the end of the first reporting period (mid-term, month 18) is in line with the original plan:
On the technical side, the global architecture and the evaluation plan were specified. The consortium partners developed and individually tested their components. A partial cross-component integration was performed on a platform made available to the partners. Algorithms required by the project were designed, and an initial version of all of them was implemented.
The five planned use cases have been defined and designed to take advantage of the technological components. Moreover, the functionality of the first prototype version of every use case has been developed.
The dissemination and communication activities are highlighted on the project's public website, http://clouddb.eu/, and on social media.

Final results

The progress beyond the state of the art (SOTA) in the technology will be:
• A solution that makes it easier to run critical applications on the cloud: a single platform for real-time big data analytics that supports both the operational database and analytics over the operational data. It benefits from a very-large-memory server, BullSequana S, scaling up to 32 TB by the end of the project.
• A scale-up in-memory operational database that scales vertically and efficiently on many-core, large-memory architectures with hundreds of cores. It will leverage an ultra-fast logging solution. Its architecture mimics that of a distributed system, but within a single many-core machine, to avoid the contention that current databases suffer when executed on many-core hardware.
• An analytic engine leveraging the large amount of memory available within a single server to enable in-memory analytical processing of large data sets on a single computer. Running this engine, a Java program, on such a large machine is a challenge in itself: the JVM works comfortably at gigabyte scale. Running a JVM successfully and predictably on multiple terabytes of memory will require techniques such as NUMA affinity and off-heap memory management.
• Both the operational database and the analytical engine will be optimized to make the best use of BullSequana's optimized I/O and of the new NVDRAM, 3D XPoint and NVMe storage devices, in order to provide durability with minimal impact on latency and to improve NUMA efficiency.
• The removal of the boundaries between OLTP, OLAP and CEP technologies: analytical queries, calculations and streaming analytics will all be delivered in real time. The appliance will be unique in its ability to deliver all types of data processing over large amounts of operational data updated at very high rates.


These technological advances will make it possible to enhance our selected use cases beyond the SOTA:
• Stock trading risk assessment: powerful capabilities for handling huge result sets in memory will allow fast and flexible near-real-time aggregation and the computation of non-linear figures. This opens a whole new set of opportunities for real-time capital estimation: banks can evaluate what-if scenarios in real time, along with the impact of VaR or CVA changes, without doing a full revaluation.
• Mobile phone number migration: CloudDBAppliance will provide a technological solution for a national central database for cell number portability, capable of serving a large country with up to 100 million cell phone users.
• Proximity marketing: CloudBiz and IKEA will provide a proximity marketing solution for IKEA stores that exploits, in real time, all information on user behaviour and itinerary, so as to extract insights and make real-time offers as customers walk through the stores.
• Real-time pricing: The CloudDBAppliance platform will enable a radically new approach to setting prices in real time by analysing data on the shopping behaviour of customers. A product pricing strategy will be set in real time to optimally match spatio-temporal customer shopping trends.
• ATM operations: The CloudDBAppliance platform will gather information from a variety of data sources and apply multi-level analysis and optimization techniques in order to i) accurately model ATM user behaviour, taking into account external factors that influence it, such as weather forecasts or social events, and ii) predict the expected ATM cash flows.
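
The stock trading use case above refers to non-linear figures and to what-if analysis without full revaluation. One common way to obtain both, assumed here for illustration only, is to keep a vector of per-scenario profit and loss (P&L) for each trade in memory: vectors add linearly, while VaR is read from the aggregated vector. The class ScenarioVaR, its method names and the simple quantile convention below are hypothetical; the source does not state which VaR method the project uses.

```java
import java.util.Arrays;

/** Hypothetical sketch: VaR over in-memory per-scenario P&L vectors. */
final class ScenarioVaR {

    /** Adds (sign = +1) or removes (sign = -1) a trade's P&L vector from a portfolio vector. */
    static double[] combine(double[] portfolio, double[] trade, int sign) {
        if (portfolio.length != trade.length) {
            throw new IllegalArgumentException("scenario counts differ");
        }
        double[] out = new double[portfolio.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = portfolio[i] + sign * trade[i];
        }
        return out;
    }

    /**
     * Loss (reported as a positive number) at the given tail probability,
     * e.g. 0.05 for a 95% VaR. Uses one simple quantile convention:
     * the value at rank floor(tailProbability * n) of the sorted P&L.
     */
    static double valueAtRisk(double[] pnl, double tailProbability) {
        double[] sorted = pnl.clone();
        Arrays.sort(sorted); // ascending: worst outcomes first
        int index = (int) Math.floor(tailProbability * sorted.length);
        index = Math.min(Math.max(index, 0), sorted.length - 1);
        return -sorted[index];
    }

    public static void main(String[] args) {
        double[] tradeA = {-10, 5, -3, 8}; // P&L of trade A in four scenarios
        double[] tradeB = {4, -6, 2, 1};   // P&L of trade B in the same scenarios
        double[] portfolio = combine(tradeA, tradeB, +1);

        System.out.println(valueAtRisk(portfolio, 0.25)); // prints 1.0

        // What-if: drop trade B by subtracting its vector; nothing is repriced.
        double[] withoutB = combine(portfolio, tradeB, -1);
        System.out.println(valueAtRisk(withoutB, 0.25)); // prints 3.0
    }
}
```

In this toy data the VaR of the portfolio (1.0) differs from the sum of the individual VaRs (3.0 and -1.0), which is why such figures cannot be pre-aggregated like ordinary sums and benefit from keeping the full vectors in memory.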

Website & more info

More info: http://clouddb.eu/.