Report

Teaser, summary, work performed and final results

Periodic Reporting for period 1 - SODA (Scalable Oblivious Data Analytics)

Summary

More and more data is being generated, and analyzing this data drives knowledge and value creation across society. Unlocking this potential requires sharing (often personal) data between organizations, but this meets with unwillingness from data subjects and data controllers alike. Hence, techniques are needed that protect personal information during data access, processing, and analysis. To address this, the SODA project will enable practical privacy-preserving analytics of information from multiple data assets using multi-party computation (MPC) techniques. With MPC, data does not need to be shared; it only needs to be made available for encrypted processing. The main technological challenge is to make MPC scale to big data, for which we will achieve substantial performance improvements. We embed MPC into a comprehensive privacy approach, demonstrated in an ICT-14.b use case and a healthcare use case.
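To make the idea of computing on data without sharing it more concrete, the following is a minimal sketch of additive secret sharing, a textbook MPC building block. It is not the protocol developed in SODA or implemented in FRESCO; the modulus `PRIME` and the function names `share`, `reconstruct` and `secure_sum` are assumptions chosen for illustration, and the sketch only considers parties that follow the protocol honestly.

```python
import secrets

# Assumption: a public prime modulus, larger than any sum we expect to compute.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n_parties random-looking shares that add up to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; any proper subset of them reveals nothing about the value."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Illustrative (hypothetical) sum over the inputs of several data owners."""
    n = len(private_inputs)
    # Each data owner splits its input; share j is sent to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Each party adds up the shares it received, without seeing any raw input.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are combined, so only the total is revealed.
    return reconstruct(partial_sums)
```

Calling `secure_sum([12, 30, 7])` returns the total of the three inputs, while each simulated party only ever handles shares that look uniformly random on their own.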

Our first objective is to enable MPC for big data applications by scaling the performance. We follow a use case-driven approach, combining expertise from the domains of MPC and data analytics. Our second objective is to combine these improvements with a multidisciplinary approach towards privacy. By enabling differential privacy in the MPC setting, aggregated results will not leak individual personal data. Legal analysis performed in a feedback loop with technical development will ensure improved compliance with EU data privacy regulation. User studies performed in a feedback loop with our consent control component will make data subjects more confident about having their data processed with our techniques. Our final objective is to validate our approach by applying our results in a medical demonstrator originating from Philips practice and in a use case arising from the ICT-14.b data experimentation incubators. The techniques will be subjected to public hacking challenges. The technical innovations will be released as open-source improvements to the FRESCO MPC framework.
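As an illustration of how differential privacy limits what an aggregated result reveals about any individual, the sketch below applies the standard Laplace mechanism to a counting query. It runs in the clear; in the setting described above, the count and the noise would be computed inside MPC, which this sketch does not attempt. The names `laplace_noise` and `dp_count` are hypothetical, and the floating-point sampling is not hardened for production use.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller `epsilon` gives stronger privacy at the cost of a noisier answer; for example, `dp_count(ages, lambda a: a >= 65, epsilon=0.5)` would release an approximate number of people aged 65 or over in a hypothetical list `ages`.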

Work performed

Overall, SODA made good progress towards the project objectives. In particular, for the objectives of enabling MPC on large volumes and varieties of data, strong results were achieved, as demonstrated by a large number of scientific publications at top-tier conferences. These WP1 and WP2 contributions improve the efficiency and performance of MPC computation and communication, which makes MPC more usable for data analytics.

WP1 “Cryptographic Protocols” produced work of outstanding quality that is recognized by the international research community: many of its results have been accepted for publication at top-tier conferences in the field of cryptography.

WP2 “MPC-based analytics on Big Data” completed D2.1 “State of the art” and made a number of contributions on methods and algorithms. Of these, the work on secure linear algebra, secure comparison, and Conclave represents the more fundamental contributions, whereas the work on DNA matching and neural networks is more applied. Research into streaming algorithms has started and is ongoing.

In WP3 “Privacy: technical, legal, and user experience”, the main progress is on legal and user aspects. Progress on technical aspects relates to the state-of-the-art research performed and to initial research on combining MPC with differential privacy. D3.1 “General Legal Aspects” focuses on the techniques being developed within the framework of this project and provides a comprehensive overview of the most relevant legal challenges, with an emphasis on how de-identification techniques affect the applicability of the GDPR. D3.2 provides a plan for how the user studies will be conducted and explains the idea behind the methodology.

WP4 “Demonstration” is ongoing, with preparations for the demonstrators and proof-of-concept implementations. A significant result in this reporting period is D4.1 “Requirements for demonstrators”, which covers a number of real-world MPC use cases. The release of FRESCO 1.0, years after v0.1 in December 2015, is a second, very significant result: it incorporates many new MPC techniques and functionality, has improved maturity and performance, and is much easier for developers to use. The first release of MPyC is a third significant result, bringing MPC closer to non-experts. Furthermore, a number of independent proof-of-concept and prototype implementations were created.

WP5 “Project and innovation management” is on track, with the project running as planned, without significant deviations and in a good atmosphere. D5.1 “Web based internal communication platform”, D5.2 “Public website” and D5.3 “Initial dissemination and exploitation plan” have been delivered on schedule. The execution of dissemination and exploitation is on track, both from a project perspective and for the individual partners. Significant results include a large number of scientific publications, many of them at top venues; software, with the releases of FRESCO v1.0 and MPyC; workshops, with the organization of TPMPC 2018 and DMSC; and intellectual property, with 3 patent applications. Steps towards business uptake are also ongoing, e.g. at Philips through discussions with various businesses and departments.

Regarding impact, the expected impact is still in line with the original plan, which is consistent with the progress on dissemination and exploitation. The same applies to the exploitation and dissemination plan itself, which is stable and made concrete in D5.3. The data management plan remains unchanged, with the note that D6.1 “Ethics approval” details it further for the user studies in WP3.

Final results

Progress beyond the state of the art has been made across the work packages, with some notable outcomes in the first half of the project. WP1 has been very successful in improving the state of the art in general-purpose MPC protocols, as demonstrated by many scientific publications at top-tier conferences and workshops. WP2 delivered a number of special-purpose algorithms enabling MPC-based big data analytics, including, for example, oblivious evaluation of neural networks and a big-data MPC query processing framework. WP4 provided new releases of the FRESCO MPC framework, which include support for novel methods and protocols such as SPDZ2k, as well as several other proof-of-concept and prototype implementations.

Expected results by the end of the project are as follows. WP1 will continue to deliver general and special-purpose MPC protocols to provide the cryptographic backend for the project objective, i.e. scalable oblivious data analytics. Similarly, WP2 will continue with results targeting streaming and machine learning algorithms. WP3 aims to deliver a leakage control component that combines differential privacy with MPC, a legal analysis (GDPR, etc.) of MPC in a medical setting, and a user study with guidance for technical development. WP4 will deliver demonstrators and organize a challenge to improve confidence in MPC through real-life scenarios. WP5 will make progress on exploitation according to the exploitation plans.

Website & more info

More info: http://www.soda-project.eu.