Report

Teaser, summary, work performed and final results

Periodic Reporting for period 1 - PanCanRisk (Personalized bioinformatics for global cancer susceptibility identification and clinical management)

Teaser

Cancer is the most common cause of death in economically developed countries and the second leading cause of death in developing countries. Nearly three million people are diagnosed with cancer in the European Union every year. The global burden of cancer increases mostly...

Summary

Cancer is the most common cause of death in economically developed countries and the second leading cause of death in developing countries. Nearly three million people are diagnosed with cancer in the European Union every year. The global burden of cancer is increasing, mostly because of population ageing and growth. A significant proportion of the worldwide burden of cancer could be prevented through early detection and treatment, and through health campaigns on exercise and diet.
Several global initiatives to analyse cancer genomes have been launched in recent years, including The Cancer Genome Atlas and the International Cancer Genome Consortium. These two initiatives have led to a successful genomic characterization of thousands of cancer cases across dozens of cancer types. The spectrum of mutations present in the most common cancer types has been systematically reported in different studies, leading to new biological and clinical insights in cancer. Through these studies we now have a fairly comprehensive view of the mutational landscape of human cancer, and the resulting information can be used in immediate translational actions at the clinical level, affecting interventions in molecular diagnosis, therapeutic clinical trials and cancer monitoring. This knowledge has many implications for early cancer diagnosis, molecular characterisation, drug sensitivity evaluation and pharmacogenomics, with a direct impact on the clinical management of patients.
In this project we propose to develop and apply the bioinformatics tools necessary to fully exploit the available cancer datasets and to translate the diagnostic power of sequencing into day-to-day clinical practice. The PanCanRisk consortium combines complementary expertise in bioinformatics and statistics, as well as the experimental and clinical capability, to achieve a successful translation of computational discoveries of novel variants that affect cancer susceptibility. Our work program will have immediate translational impact reaching clinical practice.
The objectives are:
To develop a computational platform for cancer risk tests with application in population-based cancer risk screening programs.
To scale studies to pan-cancer data of all tumour types, to identify cancer-associated mutations.
To identify regulatory risk variants and characterize their functionality.
To translate sequencing panels of cancer predisposition genes to cohorts of patients for clinical diagnostic purposes.
To develop and evaluate biomarkers of cancer susceptibility and clinical course, in order to improve cancer risk prediction.

Work performed

In WP1 we have set up a distributed NGS analysis pipeline that is able to analyse large-scale cancer cohorts. The eDiVA pipeline includes modules for NGS read alignment, variant prediction, quality control, filtering, functional annotation and prioritization of familial risk variants. A novel disease knowledge database and several new prediction and quality control tools have been implemented and integrated in eDiVA. We have benchmarked multiple variant prediction tools and established a best practice protocol for calling variants.
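As an illustration of the filtering and prioritization steps mentioned above, the following is a minimal sketch only. It is not eDiVA code: the `Variant` fields, the `prioritize` function, the impact labels and all thresholds are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; real pipelines carry many more annotations.
    gene: str
    quality: float        # variant call quality score
    population_af: float  # allele frequency in a reference population
    impact: str           # predicted functional impact label


def prioritize(variants, min_quality=30.0, max_af=0.01,
               impacts=("HIGH", "MODERATE")):
    """Keep well-supported, rare, potentially functional variants.

    Thresholds are illustrative assumptions, not project settings.
    Survivors are ranked by impact class first, then by rarity.
    """
    kept = [
        v for v in variants
        if v.quality >= min_quality
        and v.population_af <= max_af
        and v.impact in impacts
    ]
    return sorted(kept, key=lambda v: (impacts.index(v.impact), v.population_af))


example = [
    Variant("GENE_A", 55.0, 0.0005, "MODERATE"),
    Variant("GENE_B", 12.0, 0.0001, "HIGH"),      # fails the quality filter
    Variant("GENE_C", 80.0, 0.2000, "HIGH"),      # too common
    Variant("GENE_D", 60.0, 0.0020, "HIGH"),
]
ranked = prioritize(example)
print([v.gene for v in ranked])
```

In this toy input only two variants pass all three filters, and the high-impact one is ranked first.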

In WP2 we have performed rare variant association analysis on germline data from specific cancer types. In particular, we have obtained some significantly associated genes, which will be validated in the follow-up steps. In addition, we have identified some relevant genes and non-coding regions associated specifically with breast and pancreatic cancer risk, and have generated a list of candidate genes and regions to be validated in our follow-up analysis.
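The report does not state which association tests were used. As one common way to test rare variants at the gene level, a simple burden test compares the number of carriers of qualifying rare variants between cases and controls. The sketch below assumes this approach and uses a one-sided Fisher's exact test; the function names, sample identifiers and gene names are hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Returns P(X >= case_carriers) under the hypergeometric null, where the
    total number of carriers and the group sizes are held fixed.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    upper = min(carriers, case_total)
    favourable = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, case_total)


def gene_burden(carriers_by_gene, cases, controls):
    """Map each gene to its one-sided burden p-value.

    carriers_by_gene: gene -> set of sample IDs carrying at least one
    qualifying rare variant in that gene (assumed precomputed).
    """
    return {
        gene: fisher_one_sided(
            len(carriers & cases), len(cases),
            len(carriers & controls), len(controls),
        )
        for gene, carriers in carriers_by_gene.items()
    }


# Toy data: three cases and three controls (hypothetical IDs).
cases = {"s1", "s2", "s3"}
controls = {"s4", "s5", "s6"}
carriers_by_gene = {
    "GENE_A": {"s1", "s2", "s3"},  # carriers found only among cases
    "GENE_B": {"s1", "s4"},        # carriers balanced across groups
}
pvalues = gene_burden(carriers_by_gene, cases, controls)
print(pvalues)
```

In a real cohort, such p-values would need correction for multiple testing across genes and adjustment for confounders such as ancestry, which this sketch leaves out.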

In WP3, we have developed and applied new computational approaches for the identification of regulatory variants. We have derived methods that allow for integrating genetic and molecular variation data types to test for genetic effects and for identifying context-specific effects. We have also developed computational and statistical methods for identifying regulatory variants using epigenetic readouts.

For WP4, we have gathered around 6,000 cancer cases and controls for the validation series. We have established protocols for the efficient recruitment of high- and low-risk breast and bowel cancer patients and have started the recruitment process for the validation stage.

In WP5, we have generated knock-in clones in MCF-10A cells, which gives us confidence in our methodology to generate the clones for this project at a rate of 2/3 weeks without the need for pre-screening.
We have also used sCD biomarker data from a previous study to develop an algorithm that distinguishes breast cancer samples from controls.

WP7 is dedicated to management and coordination. During this first period we have established the project website, developed a project brand (logo, communication plan, and presence in social networks…), organized three meetings and a scientific symposium, supported teleconferences on a regular basis and advanced in compliance with the ethical and data protection requirements.

The development of the project will at all times comply with the tenets of the national and EU regulations regarding ethical issues and privacy of data.

Final results

Cancer susceptibility variants have not been studied extensively at a genome-wide scale. We aim to advance the knowledge of the landscape of susceptibility variants and genes across several cancers. Our goal is to mine the available pan-cancer panels and to replicate the resulting candidate genes in a large independent cohort of cancer patients.
This effort requires new methods that are equally suited to the analysis of large cohorts of patients and to application in clinical settings. We propose to advance the bioinformatics algorithms included in this platform in multiple ways: a) new statistical methods for the identification of rare susceptibility variants, b) algorithms for the interrogation of regulatory variants, and c) integrative analysis of susceptibility variants with a broad range of molecular and clinical variables. In addition, we will further advance the use of innovative sequencing-based methods for the experimental validation of computational predictions. The proposal also has a substantial collaborative component to maximally exploit these data.
PanCanRisk has a clear bench-to-bedside component, in which we will engage in the bioinformatics discovery of the susceptibility variants and will also validate them in large cohorts to assess their suitability as risk biomarkers. This will be complemented by the implementation of intuitive computational tools for the day-to-day clinical setting. We will also evaluate the commercialization of innovations introduced in the software of the platform. This could be achieved by establishing service models or by direct sale of software to the clinical markets.
Complementary to software commercialization, we expect to gain commercially relevant insights from the biomarker analysis.

In combination, these innovations have the potential to change the way cancer risk is assessed in the clinic, to significantly expand the set of known risk genes and pathways, and to provide a better understanding of their involvement in cancer susceptibility.

Website & more info

More info: http://www.pancanrisk.eu/.