Human Copy Number Variation (hCNV) services
Name | Description | ELIXIR Node |
---|---|---|
The initial 2019-2021 hCNV community implementation study employed a set of perceived needs to a) deliver first community standards and procedures; b) identify intersections with other ELIXIR communities and stakeholders in ELIXIR connected organizations, such as GA4GH; and c) streamline priorities for relevant, achievable deliveries of hCNV community projects. This project for an hCNV implementation study focuses on those potential high-value targets for data access and delivery, using reference resources and community stakeholder engagement to directly implement and test hCNV resources aligned with ELIXIR ecosystems. |
ELIXIR Switzerland, ELIXIR France, EMBL-EBI, ELIXIR Spain, ELIXIR UK, ELIXIR Germany | |
Cellular and molecular biology are fundamental to ELIXIR's mission. As part of our 2024–28 Programme, we are committed to advancing data services and software for research on nucleic acids, proteins and other biomolecules. This initiative will address new demands for multi-omics and multi-modal analyses, including imaging, by developing methods and partnerships. We will also expand expertise in reusable data and software to incorporate FAIR models, ensuring robust solutions for modelling at all scales. The following projects are key to connecting the latest developments with established data resources, unlocking the potential of cellular and molecular biology:
This project addresses the limitations of current ontologies in capturing the dynamic nature of disordered protein regions by pursuing several primary objectives. Firstly, novel structural and functional ontologies will be developed to accurately represent the structural heterogeneity and dynamic functional annotations of proteins. These ontologies will incorporate timescales, annotating the kinetics of structural transformations to elucidate molecular mechanisms and regulatory pathways governing protein dynamics. Collaborating with existing databases and consortia will ensure seamless integration of ontological resources and experimental data, fostering interoperability and accelerating discoveries. A standardised file format specification will also be developed in collaboration with the Human Proteome Organisation Proteomics Standards Initiative, facilitating the encoding of structural state transitions within disordered protein regions. This specification will enhance data interoperability and exchange among research groups and databases, providing a common language for describing structural transitions and advancing our understanding of the functional implications of protein dynamics in biological systems.
Nodes involved: ELIXIR Belgium, ELIXIR Hungary, ELIXIR Italy, EMBL-EBI
This project aims to strengthen the basis for a one-stop shop connecting databases, datasets and tools for the deployment of the engineering Design-Build-Test-Learn (DBTL) framework in biotechnology. It will do so by surveying the tools and data landscape, pinpointing gaps and opportunities, and establishing design patterns for task-specific workflows for analysis, integration and sharing of multimodal data. It will provide a resource that will allow users to navigate the complex landscape of biotechnology tooling and data, as well as to establish solutions that fit their specific DBTL requirements.
Use cases from ongoing programmes in various communities will be used to ascertain and establish the pragmatic value of the solutions. The work will be carried out through hands-on activities, dedicated workshops and hackathons, providing training and resources, as well as fostering industrial engagement. The experience of the communities and platforms involved in systems biology, industrial biotechnology, metabolic modelling, metabolomics, enzymes, bioprospecting and data management will be particularly valuable in this respect, as well as their respective industrial relations. Accordingly, the project engages participants from seven ELIXIR nodes and connects researchers and their activities from six communities. The project outcomes will contribute to advancing the ambition of connecting the latest developments and established data resources across ELIXIR to realise the potential of cellular and molecular biology, particularly in the fields of industrial biotechnology and biomanufacturing.
Nodes involved: ELIXIR Spain, ELIXIR Greece, ELIXIR France, ELIXIR Netherlands, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK
Spatial transcriptomics (ST) was named ‘Method of the Year 2020’ by Nature Methods and was more recently featured in Nature’s ‘Seven technologies to watch in 2024’. ST is now a prerequisite for researching transcriptional pathology at the cellular and molecular levels. It is applied across multiple pathologies, including neurodegenerative disease, cancer, cardiomyopathy and nephrology. There is also an emerging application of ST in plant and microbiome research. While there is a plethora of spatial analysis applications, these are not unified or easily manageable by research scientists, and they are unlikely to deliver FAIR and reproducible results.
To address this challenge, we will implement Spatial2Galaxy (S2G) – a self-contained, reproducible, scalable FAIR spatial transcriptomics analysis platform for researchers and bioinformaticians alike. We will develop S2G based on our success with developing Galaxy workflows, training materials and ST and single-cell analysis pipelines. S2G will provide state-of-the-art ST tools and workflows with proven high performance in benchmarking studies, ensuring the uptake of best practices. These tools will be demonstrated on datasets that connect various ST databases. This will consolidate community guidelines for integrative multi-modal single-cell omics and imaging analysis. For non-spatial single-cell sequencing, named the Nature Methods ‘Method of the Year 2013’, it took six years until practical training and workflows for its analysis were FAIRified and available in Galaxy, in 2019. In contrast, S2G aims to reduce this gap between a technology becoming relevant and the provision of FAIR resources to the life science community for ST.
Nodes involved: ELIXIR Germany, ELIXIR France, ELIXIR Netherlands, ELIXIR UK
The ELIXIR metabolomics community relies on the development and adoption of standards, formats and data treatment solutions, but it remains challenging to ensure high-quality reported metadata, sufficiently contextualised results, and interoperable and reusable datasets, and to integrate these metabolomics data with other omics or studies. This project is designed to address these issues and aims to connect key international standards with ELIXIR resources, as well as creating associated community guidelines and training materials.
Based on the FAIRification framework, activities in the project will: i) increase interoperability and reuse of public metabolomics datasets and workflows through enhanced and extended open data standards, resources and new semantic annotations, ii) define, ensure and establish quality control for study baselines in Metabolomics and Exposomics, and iii) facilitate metabolomic data interpretation and meta-analysis integration with multi-omics and systems biology studies. As a first necessary step, the project will create a Semantic Metabolomics Data Model to standardise metadata, ensuring unambiguous reuse of metabolomics projects. This model will focus on integrating key ontologies, providing open training initiatives and enhancing the interoperability of metabolomics data through the production of open guidelines for annotation steps. By linking with ELIXIR’s Deposition databases, ISA Framework and other services, the project seeks to boost interconnection with ELIXIR platforms, other ELIXIR communities (Systems Biology, Food and Nutrition, Galaxy, Proteomics, Toxicology, Research Data Alliance Focus Group ...), and the FAIR Cookbook and BioSchemas.org communities. Project outcomes are expected to promote the emergence of ambitious and innovative semantic-based solutions for inter-comparison of studies in healthcare, clinical and plant domains.
Nodes involved: ELIXIR Czech Republic, ELIXIR Germany, ELIXIR Italy, ELIXIR Spain, ELIXIR France, ELIXIR Netherlands, ELIXIR Sweden, ELIXIR UK, EMBL-EBI |
ELIXIR Belgium, ELIXIR Czech Republic, ELIXIR France, ELIXIR Greece, ELIXIR Hungary, ELIXIR Italy, ELIXIR Netherlands, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR UK, EMBL-EBI | |
ELIXIR Belgium, ELIXIR Cyprus, ELIXIR Czech Republic, ELIXIR Denmark, ELIXIR Estonia, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Hungary, ELIXIR Ireland, ELIXIR Israel, ELIXIR Italy, ELIXIR Luxembourg, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI | ||
ELIXIR Belgium, ELIXIR Cyprus, ELIXIR Czech Republic, ELIXIR Denmark, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Hungary, ELIXIR Israel, ELIXIR Italy, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI | ||
This Study's work will address the following themes:
|
ELIXIR France, ELIXIR Switzerland, ELIXIR Germany, EMBL-EBI, ELIXIR Spain, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Hungary, ELIXIR Slovenia, ELIXIR UK | |
Human data and translational research is a high priority for ELIXIR and builds on the progress made in the previous programmes by the Human Data Communities. Within the Science Tier of the ELIXIR 2024–2028 Programme, advances will be focussed on enabling researchers (including research clinicians) to use ELIXIR’s infrastructure, for human genomic, phenotypic, imaging and demographic data to support discovery, analysis, innovation and integration of research findings into the clinic and healthcare. More specifically, through these projects we will ensure that millions of human genomes are discoverable and exploited in a biomedical setting through ELIXIR-supported infrastructure and community-endorsed standards, software, workflows and analysis environments across ELIXIR Nodes.
On Data Deposition:
On Federated Data Analysis:
On Linking Data:
Theme: Data Deposition
The Federated European Genome-Phenome Archive (FEGA) network is an ELIXIR-supported infrastructure for making human genomic data discoverable and accessible across ELIXIR Nodes. This project seeks to accelerate data depositions into FEGA, which will significantly increase the data flow in and from FEGA nodes. In alignment with the goals of the Human data and translational research Tier of the ELIXIR 2024–2028 programme, this project will promote seamless data integration and increase global researchers’ confidence in the data stored within FEGA, thus strengthening the network's position as a trusted resource for genomic data. It will build capacity within the FEGA Nodes and increase awareness among a wide range of stakeholders, thus altogether achieving the ultimate goal of enhancing data reuse. The project will be carried out by a strategic consortium comprising seven ELIXIR Nodes and two ELIXIR Communities. Partners represent four FEGA nodes at different levels of maturity, a member of the Cancer Data Community and both institutions managing Central EGA. The proposal is formulated around five timely coordinated tasks where all partners contribute their expertise to the final outcomes, converging in the deposition of several datasets to different nodes, testing the new tools and metadata model and blueprinting deposition of high-quality FAIR data in the future.
Nodes involved: ELIXIR Switzerland, ELIXIR Spain, ELIXIR France, ELIXIR Norway, ELIXIR Portugal, EMBL-EBI
Theme: Data Deposition
Human data, especially genomic data, is increasingly being federated across borders and institutions, with many stakeholders participating in multinational and global biomedical and health data networks, fostering collaborations and partnerships. While such international efforts are essential for the compilation and reuse of data, regulatory constraints often hinder the movement of certain data beyond organisational or national boundaries.
Centralised approaches such as the Central European Genome-Phenome Archive (CEGA) are valuable, but not all data can be centralised. The Federated European Genome-phenome Archive network (FEGA) addresses this, with early work concentrated on local collection of data with central archiving of metadata. FHDportal aims to support both federated and central submission of metadata. It will do this by providing a reusable portal for gathering and storing metadata at a national level, and submitting required metadata centrally to enable discovery of datasets via the CEGA. FHDportal complements the existing system by providing a way to explore richer metadata (for example, including detailed information on specific datasets or local funding information), while enabling a core set of metadata to be queried centrally. FHDportal will be deployed and tested on FEGA nodes, and should be of interest to the many other countries seeking to join FEGA. The need for FHDportal is based on experience during onboarding and in moving to production nodes. It will offer a common solution for local mobilisation of data and metadata, which can be adapted to local situations. During development, it will be tested on both new and well-established nodes using different technical platforms and infrastructures. The resulting software will be provided to the whole community, and will hopefully become part of the emerging toolkit for new FEGA nodes wishing to establish themselves, and to ensure their nodes meet local needs while bringing European scale benefits.
Nodes involved: ELIXIR Switzerland, ELIXIR Finland, ELIXIR Luxembourg, ELIXIR UK
Theme: Federated Data Analysis
Federated analysis (FA) revolutionises genomics research by enabling collaborative analysis across distributed datasets, while safeguarding data privacy and facilitating comprehensive insights into genetic diseases. Federated access and analysis of human datasets is part of the ELIXIR scientific program.
ELIXIR is also involved with the EUCAIM (European Cancer Imaging Initiative) project, and coordinates the European Genomic Data Infrastructure (GDI) project, which aims to provide federated access to 1+M whole genome sequences (WGS). While the GDI project explores federated solutions to analyse its data, it does not foresee deploying FA solutions for evaluation. This project seeks to implement FA across four ELIXIR Nodes, using synthetic and real, publicly accessible Genome-Wide Association Studies (GWAS) data. To maximise the impact of this proposal, we plan to leverage the developments already made in the context of the EUCAIM project, specifically the orchestration solution around the Flower Framework, and the ongoing FA developments in the context of the Staff Exchange BRIDGE between ELIXIR and DCEG/NIH, where the Yjs framework is the chosen solution. We also aim to represent the analysis using RO-Crates to track the provenance of the analysis, following the Five Safes Framework. The proposal is built around ongoing collaborations on deploying and testing FA solutions for analysing sensitive data across different projects like GDI, EUCAIM, BY-COVID and TRE-FX. This project aims not only to boost this interaction using the Flower Framework for FA, but also to strengthen the connection to NIH/DCEG through dataset sharing and comparing different FA frameworks. All Nodes involved in this project are active members of the ELIXIR Human Data Communities, especially the Federated Human Data and Cancer Data ones. The outcomes derived from this project will be disseminated not only to these Communities but also to all ELIXIR projects where this topic is relevant.
Nodes involved: ELIXIR Belgium, ELIXIR Spain, ELIXIR France, ELIXIR Portugal, ELIXIR UK
Theme: Federated Data Analysis
Through the 1+Million Genomes (1+MG) initiative, Europe is scaling up efforts to build a shared framework and infrastructure to safely access and integrate clinical human data across borders, following regulatory efforts like the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS). These are pivotal in safeguarding sensitive information, while enabling authorised access for researchers, healthcare professionals and other actors. Integral to biomedical data security considerations is the European Genome-Phenome Archive (EGA), in both Central and Federated forms, recognised as the predominant European repository for the secure storage of pheno-clinical and genomics data. Mobilising data for secure analysis in Virtual Research Environments (VREs) remains challenging. Indeed, it is an active focus in ongoing projects like the European Genomic Data Infrastructure (GDI), EOSC-ENTRUST and EOSC4Cancer. Galaxy is a popular open-source, community-driven VRE for bioinformatics analysis that represents a unique platform for developing and testing novel strategies for data analysis. A prototyping strategy for the access and processing of sensitive data was demonstrated in a previous ELIXIR implementation study (2021–2023). By adopting GA4GH Crypt4GH encryption standard features, we enabled Galaxy users within Trusted Research Environments (TREs) to decrypt sensitive data for workflow execution without sharing private encryption keys. We propose expanding this prototype into a comprehensive solution for secure data analysis in Galaxy, facilitating encrypted data access and transfer from FEGA/EGA repositories to designated TREs, all interactively orchestrated by the users on a public Galaxy server.
The proposed solution offers flexibility with different levels of enforced restrictions ranging from scenarios with no limitations on encrypted data transfer and storage, to fully federated analysis scenarios, where analysis occurs near the data. Most of the required infrastructure can also be deployed independent of Galaxy, simplifying the potential implementation of these concepts in other VREs.
Nodes involved: ELIXIR Belgium, ELIXIR Germany, ELIXIR Spain, ELIXIR Norway
Theme: Linking Data
Today, research generates more data than ever, and a multitude of experimental data types. Such data types are often connected at source: perhaps generated from the same samples or as part of the same study. It is important that different data types are made available for re-use in a linked and coordinated manner, enabling full reuse of all the data in integrated analysis. Experimental data types are often siloed in varied specialised repositories, using different metadata models, so linking them is not straightforward. Also, data obtained from living humans is sensitive and shared under a controlled access model, adding an extra layer of complexity. In this project, partners will establish a strong foundation for developing solutions to integrate multi-omic sensitive data effectively among FEGA nodes, biobanks and ELIXIR Core Data Resources such as PRIDE and GWAS Catalog. Five ELIXIR Nodes will be involved, as well as the Polish FEGA node (in-kind contribution) from two ELIXIR Communities (Federated Human Data and Proteomics), spanning three diverse data use cases to address the challenges of this open call. The project will start by developing a comprehensive landscape analysis of current human data linkage challenges and solutions (Task 1). Based on this, concrete models and prototypes will be proposed to link sensitive proteomics data (Task 2), cohorts and biobank data (Task 3), and population cohort-derived data (Task 4) to genomics data.
Results from tasks 2 to 4 will be used to improve the FEGA metadata model. The project will result in more coherent data deposition, discoverability and retrieval of multi-omics datasets, providing FAIRer data, and accelerating research. To facilitate broad engagement, the project will engage the ELIXIR Communities through dedicated online and in-person events, where both interim and final results of the project will be disseminated.
Nodes involved: ELIXIR Finland, ELIXIR Germany, ELIXIR Spain, ELIXIR Sweden, EMBL-EBI |
ELIXIR Belgium, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Luxembourg, ELIXIR Norway, ELIXIR Portugal, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI | |
ELIXIR France, ELIXIR Germany, ELIXIR Hungary, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI | ||
The ELIXIR human Copy Number Variation Community (hCNV) was created in December 2018. In two years, its contributions to the field have been numerous (ELIXIR IS, Rare Diseases, Federated Human Data, Beacons, GA4GH, EJP-RD and Beyond 1 Million Genomes - B1MG). The Community now aims to address the major challenge of NGS data interpretation in the era of whole genome sequencing: Copy Number Variation. During the first commissioned service, offered as a starting grant, the Community identified various gaps that must be filled before CNV tool benchmarking can proceed, in particular for exome and targeted sequencing, which are by far the most widely used technologies in diagnostic laboratories and in research. Within this implementation study we want to provide solutions and bioinformatic infrastructure to fill the identified gaps, and to make these biomedical reference materials available (i.e. via Open Science) to the various communities and platforms. |
ELIXIR France, ELIXIR UK, ELIXIR Switzerland, ELIXIR Spain, ELIXIR Germany | |
ELIXIR Switzerland, ELIXIR UK | ||