Report

Teaser, summary, work performed and final results

Periodic Reporting for period 1 - XDC (eXtreme DataCloud)

Teaser

The eXtreme DataCloud (XDC) project develops scalable technologies for federating storage resources and managing data in highly distributed computing environments. The provided services are capable of operating at the unprecedented scale required by the most demanding, data...

Summary

The eXtreme DataCloud (XDC) project develops scalable technologies for federating storage resources and managing data in highly distributed computing environments. The provided services are capable of operating at the unprecedented scale required by the most demanding data-intensive research experiments in Europe and worldwide. The target platforms for the released products are the existing and next-generation e-Infrastructures deployed in Europe, such as the European Open Science Cloud (EOSC), the European Grid Infrastructure (EGI), and the Worldwide LHC Computing Grid (WLCG). XDC is run by a Consortium that brings together technology providers with long-standing, proven experience in software development and large research communities from diverse disciplines: Life Science, Biodiversity, Clinical Research, Astrophysics, High Energy Physics and Photon Science. The project builds on technologies already developed by the partners or by previous H2020 initiatives (such as the INDIGO-DataCloud project). XDC software is released as Open Source and is based on existing components that the project enriches with new functionalities and plugins. The use of standards and protocols widely available in state-of-the-art distributed computing ecosystems ensures that the released components can be easily plugged into the European e-Infrastructures and, more generally, into cloud-based computing environments. The project is also investing in the user experience of the data management services, providing friendly, web-based user interfaces that are ready for mobile use. The final goal is to significantly lower the barriers to accessing distributed computing and to release data management services that are more usable, more reliable and richer in functionality, while remaining scalable enough to cope with the most demanding scientific use cases.
Moreover, the nature of distributed e-Infrastructures is changing. With the advent of virtualization techniques and cloud computing paradigms, the resources once identified as “sites” have become “liquid” and highly dynamic. XDC services are designed to cope with this new nature of the distributed e-Infrastructure, providing solutions that support the dynamic extension of computing centers to remote locations and the use of sites with limited storage capacity, while maintaining transparent bi-directional access to the data stored in all locations. XDC opens new possibilities to scientific research communities in Europe by supporting the evolution of e-Infrastructure services for Exascale data resources. The project developments are driven by real-life requirements provided by the Research Communities represented in the Consortium. However, the topics and challenges addressed are of general interest to user communities both large and small. The main general impact expected from the project is an increased uptake of the European e-Infrastructures.

Work performed

The project first laid out the governance and technical structures needed to organize the work. Since the beginning of the project, XDC has emphasized a User-Driven approach. The first months of activity were devoted to correctly understanding the different requirements coming from the Use Cases, taking into account their specific characteristics in terms of data lifecycle management. Based on this complex requirements collection work, the project then defined its technical architecture, focusing the development activities on the integration of highly performant components in the areas of data storage, data transfer, data federation and data orchestration. The architecture is built upon the interaction between components, using standards whenever possible, rather than on a monolithic and proprietary model. XDC put considerable effort into Software Quality Assurance (SQA) for its products, strengthening software quality during the development phase with the aim of releasing high-quality software. Instead of providing a single monolithic solution, the XDC software is a coherent set of components that enables integration and interoperability between existing frameworks.
The development activities led to the project's first public release, codenamed XDC-1/Pulsar, on January 25th 2019. The release addresses important topics such as federation of storage resources, smart caching solutions, policy-driven data management based on Quality of Service, data lifecycle management, metadata handling and optimized data management based on access patterns.
The XDC project partners have been very active in disseminating the project with actions carried out hand-in-hand with the communication staff to maximize the project impacts. The key dissemination activities include:
- Creation of a service catalogue: the XDC service catalogue contains the description of the services, and the new related functionalities, developed and enhanced during the project activities.
- Organising training courses and workshops: a Summer Course, organized jointly with the DEEP-Hybrid DataCloud project, took place in Santander in June 2018 under the umbrella of the Menendez Pelayo International University and was titled “New challenges in Data Science: Big Data and Deep Learning on Data Clouds”. A second workshop aimed at the EOSC communities is planned for 2019.
- Participating in conferences and events: a list of general events eligible for XDC dissemination has been collected.
- Participation in the European Commission’s Common Dissemination Booster (CDB). Through this framework the project receives advice and guidance from a professional consultancy team on the analysis of the project challenges, the identification of project results and target groups, and marketing strategies.
The next months of the project will mainly be focused on refining the released version, adding features that are not present in the current release and have been requested by scientific communities, disseminating the results of the project, exploiting the provided services by making them available in production infrastructures, and continuing to interact and interoperate with other relevant projects, industries and public or private organizations.

Final results

The XDC proponents identified technological gaps in the data management services of current e-Infrastructures, which are being filled by the new functionalities that XDC is releasing. By providing advanced features at the infrastructure level, the XDC architecture reduces the effort needed to port complex workflows and computing models to distributed systems. XDC opens new possibilities to scientific research communities in Europe by supporting the evolution of e-Infrastructure services for Exascale data resources. The project developments are driven by real-life requirements provided by the Research Communities (belonging to different scientific domains) represented in the Consortium. However, the topics and challenges addressed are of general interest to user communities both large and small (including private sector actors) that rely, or would like to rely, on distributed e-Infrastructures for their computing models. The main general impact expected from the project is an increased uptake of the European e-Infrastructures by Research Communities and SMEs, thanks to the availability of services that address their needs in a scalable, reliable, functionality-rich and user-friendly way.

Website & more info

More info: http://www.extreme-datacloud.eu.