

A programming language bridging theory and practice for scientific data curation

Project "Skye" data sheet

The following table provides information about the project.


Organization address
postcode: EH8 9YL


Coordinator country: United Kingdom [UK]
Project website: n.a.
Total cost: 1,995,181 €
EC max contribution: 1,995,181 € (100%)
Programme: H2020-EU.1.1. (EXCELLENT SCIENCE - European Research Council (ERC))
Call code: ERC-2015-CoG
Funding scheme: ERC-COG
Starting year: 2016
Duration (year-month-day): from 2016-09-01 to 2021-08-31


Take a look at the project's partnership.

#    participant                    country           role           EC contrib. [€]
1    THE UNIVERSITY OF EDINBURGH    UK (EDINBURGH)    coordinator    1,995,181.00


 Project objective

Science is increasingly data-driven. Scientific research funders now routinely mandate open publication of publicly-funded research data. Safely reusing such data currently requires labour-intensive curation. Provenance recording the history and derivation of the data is critical to reaping the benefits and avoiding the pitfalls of data sharing. There are hundreds of curated scientific databases in biomedicine that need fine-grained provenance; one important example is GtoPdb, a pharmacological database developed by colleagues in Edinburgh.

Currently there are no reusable methodologies or practical tools that support provenance for curated databases, forcing each project to start from scratch. Research on provenance for scientific databases is still at an early stage, and prototypes have so far proven challenging to deploy or evaluate in the field. Also, most techniques to date focus on provenance within a single database, but this is only part of the problem: real solutions will have to integrate database provenance with the multiple tiers of web applications, and no-one has begun to address this challenge.

I propose research on how to build support for curation into the programming language itself, building on my recent research on the Links Web programming language and on data curation. Links is a strongly-typed language that provides state-of-the-art support for language-integrated query and Web programming. I propose to build on Links and other recent language designs for heterogeneous meta-programming to develop a new language, called Skye, that can express modular, reusable curation and provenance techniques. To keep focus on the real needs of scientific databases, Skye will be evaluated in the context of GtoPdb and other scientific database projects. Bridging the gap between curation research and the practices of scientific database curators will catalyse a virtuous cycle that will increase the pace of breakthrough results from data-driven science.


List of publications.
Each entry gives: year and authors; title; published pages, ISSN and DOI; journal and last update.
2020 Ghita Berrada, James Cheney, Sidahmed Benabderrahmane, William Maxwell, Himan Mookherjee, Alec Theriault, Ryan Wright
A baseline for unsupervised advanced persistent threat detection in system-level provenance
published pages: 401-413, ISSN: 0167-739X, DOI: 10.1016/j.future.2020.02.015
Future Generation Computer Systems 108 2020-03-13
2019 Sándor Bartha, James Cheney
Towards meta-interpretive learning of programming language semantics
published pages: , ISSN: , DOI:
1 2020-02-18
2017 Wilmer Ricciotti, James Cheney
Strongly Normalizing Audited Computation
published pages: , ISSN: , DOI: 10.4230/LIPIcs.CSL.2017.36
26th EACSL Annual Conference on Computer Science Logic, CSL 2017 2020-01-29
αCheck: A mechanized metatheory model checker
published pages: 311-352, ISSN: 1471-0684, DOI: 10.1017/s1471068417000035
Theory and Practice of Logic Programming 17/03 2020-01-29
2017 James Cheney, Jeremy Gibbons, James McKinna, Perdita Stevens
On principles of Least Change and Least Surprise for bidirectional transformations.
published pages: 3:1, ISSN: 1660-1769, DOI: 10.5381/jot.2017.16.1.a3
The Journal of Object Technology 16/1 2020-01-29
2018 Jan Stolarek, James Cheney
Language-integrated provenance in Haskell
published pages: , ISSN: 2473-7321, DOI: 10.22152/
The Art, Science, and Engineering of Programming 2 10 2020-01-29
2018 Stefan Fehrenbach, James Cheney
Language-integrated provenance
published pages: 103-145, ISSN: 0167-6423, DOI: 10.1016/j.scico.2017.08.009
Science of Computer Programming 155 2020-01-29
2017 Wilmer Ricciotti, Jan Stolarek, Roly Perera, James Cheney
Imperative functional programs that explain their work
published pages: 1-28, ISSN: 2475-1421, DOI: 10.1145/3110258
Proceedings of the ACM on Programming Languages 1/ICFP 2020-01-29
2017 Weili Fu, Roly Perera, Paul Anderson, James Cheney
muPuppet: A Declarative Subset of the Puppet Configuration Language
published pages: 12:1-12:27, ISSN: , DOI: 10.4230/LIPIcs.ECOOP.2017.12
31st European Conference on Object-Oriented Programming, ECOOP 2017 2020-01-29
2018 Rudi Horn, Roly Perera, James Cheney
Incremental relational lenses
published pages: 1-30, ISSN: 2475-1421, DOI: 10.1145/3236769
Proceedings of the ACM on Programming Languages 2/ICFP 2020-01-29


The information about "SKYE" is provided by the European Opendata Portal: CORDIS opendata.
