Plant Sciences services

Name Description ELIXIR Node

Apple is one of the most famous fruits globally and occupies a central position in folklore, culture, and art. Apple cultivars have retained high genetic and phenotypic diversity, evidenced by the high number of apple varieties cultivated today. The economic and cultural importance of apple has driven efforts to catalogue and exploit this genetic diversity, but few of these data are currently integrated into ELIXIR resources. We propose a data implementation study to integrate the high-quality apple reference genome and its associated catalogue of genetic diversity, representing the most widely cultivated apple varieties around the world. We will use apple as a case study for managing the growing number of ‘multi-genome’ fruit projects, testing and, where necessary, improving tools to streamline data import and exchange between ELIXIR-supported resources, specifically BioSamples, ENA, EVA, ORCAE and Ensembl Plants.

ELIXIR Italy, ELIXIR Belgium, EMBL-EBI

Over the coming decade, Europe will face critical challenges in maintaining biodiversity, ensuring food security and combating pathogens. Our 2024–28 Programme will address these issues by mobilising and integrating molecular data, using successful coordination models from human genomics. Through strategic investments and collaboration in externally-funded projects, ELIXIR will enhance scientific services and support transnational research in these essential areas.

The following projects have been selected as part of the ELIXIR 2024–28 Programme’s Biodiversity, food security and pathogens Science Tier:

  • E-PAN: Enhancing pan-genome analysis in plants
  • FAIRyMAGs: Optimising Metagenomics Assembled Genomes building: workflow finalisation, training material development, real data evaluation and resource allocation tool creation
  • HARVEST: Handling and alignment of plant research FAIRification – value through the use of ELIXIR data Standards and Tools
  • Odyssey: Connecting molecular and geographical biodiversity data

With the declining cost of genome sequencing, the focus of plant researchers is shifting towards characterising the wide genomic diversity present within a species. Building a crop pan-genome involves sequencing, comparing and integrating multiple genomes from the same agriculturally important species, such as wheat, rice or potato. Exploiting the information encoded within these pan-genomes can lead to the development of new cultivars that are more resilient to upcoming challenges such as increased drought and heat stress.

Multiple consortia are independently generating and integrating these pan-genomes, but there is currently little progress in streamlining and homogenising these efforts. While sequence quality is no longer a major issue, the completeness of both the assembly and the subsequent gene annotation is much harder to quantify correctly, even though these are the major drivers in explaining the adaptive differences between genotypes. Where there are efforts to visualise and browse pan-genomes, for example by using graph representations, easy retrieval of gene presence/absence variation information or structural rearrangements is currently lacking, which hampers knowledge discovery.

E-PAN aims to streamline the efforts of different research groups within the ELIXIR Plant Science Community. This encompasses the development of effective standards, computational pipelines and tutorials to assess the quality of pan-genomes and provide solutions to identified problems. We will also evaluate and integrate different approaches for data visualisation and browsing, which will be used by different partners sharing pan-genomics results. A one-day meeting and an online workshop will be organised to disseminate results and initiate new collaborative projects. These concerted efforts will lead to a standardised approach to be used in future pan-genome projects, a reduction in duplication efforts across consortia, and a set of tools to visualise and mine pan-genomics results.

Nodes involved: ELIXIR Belgium, ELIXIR Germany, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK
Communities: Plant sciences

Metagenomics Assembled Genomes (MAGs) are crucial for understanding biodiversity, enhancing food security and combating pathogens by providing insight on uncultured and unexplored genomes. This proposal outlines a comprehensive project aimed at advancing metagenomics research through the advancement, optimisation, evaluation and dissemination of robust FAIR workflows for building MAGs. 

Leveraging the Galaxy platform, our primary objectives include finalising a user-friendly state-of-the-art Galaxy workflow tailored for MAG construction, and ensuring its accessibility and reusability through integration with WorkflowHub. To support user adoption and proficiency, we will create FAIR educational materials hosted on the Galaxy Training Network (GTN), empowering researchers with the skills necessary to use the workflow effectively. 

The efficacy of the developed workflow will be rigorously evaluated by analysing MAGs generated from simulated and real-world data spanning diverse environments: atmosphere, marine and cow gut microbiomes. This evaluation will provide valuable insights into the workflow's performance and its applicability across different sample types, complexities and ecosystems.

We will also investigate the computational resources required for executing the assembly step of the workflow, using data provided by several Galaxy servers and the MGnify team on various input datasets. The aim is to optimise resource allocation to ensure efficient and cost-effective MAG construction. A novel tool will be developed to facilitate this process, allowing researchers to accurately estimate and allocate resources for each step of the assembly pipeline.

By addressing these objectives, our project aims to accelerate metagenomics research by providing researchers with a comprehensive and accessible framework for MAG construction. This framework will not only streamline the workflow for building MAGs but also facilitate reproducibility, collaboration and innovation within the ELIXIR Microbiome Community.

Nodes involved: ELIXIR France, ELIXIR Germany, ELIXIR Italy, EMBL-EBI
Communities: Galaxy, Microbiome

The standardisation and accessibility of plant data is a major challenge for agricultural research. MIAPPE, which was developed as part of the transPLANT and ELIXIR-EXCELERATE projects, has made a decisive contribution to unifying data capturing. Also, the FONDUE Implementation Study facilitated the integration of phenotypic and genotypic data. 

Nevertheless, challenges persist in achieving full FAIRness of plant data. The development of guidelines and best practice documents within the Commissioned Service INCREASING has improved this. However, further enhancements are required, such as providing additional documentation and reference datasets. 

To address these needs, it is important to assess the practical effort required to FAIRify datasets using the MIAPPE, ISA, ARC and RO-Crate standards. The idea is to provide biologist-friendly data documentation and, at the same time, introduce machine-actionable formats for bioinformaticians to use. A further challenge arises from the scattered nature of the information, as there is no single resource in which all the information is collated.

In HARVEST, we aim to address these challenges by FAIRifying datasets (DROPS, AGENT) using the latest version of MIAPPE as a basis, which now covers more diverse and complex use cases. This process will include enriching the MIAPPE documentation in particular with example datasets, updating training material and refining mappings to other interoperable formats such as BrAPI, Bioschemas and ISA-Tab/JSON. We will also establish links using FAIDARE to repositories such as EMBL-EBI EVA, e!DAL-PGP, recherche.data.gouv and Zenodo, to enhance data sharing and reuse opportunities. An extension of the RDMkit Plant Sciences pages will be implemented to serve as a primary hub for information on FAIRification of plant data. Furthermore, we will be consolidating resources and improving accessibility through direct linking to the original web resources and recipes, also adding Jupyter notebooks to the FAIR Cookbook where possible.

Nodes involved: ELIXIR Germany, ELIXIR France, ELIXIR Netherlands, ELIXIR UK, EMBL-EBI
Communities: Plant Sciences

Understanding molecular biodiversity is essential for ecological conservation and sustainable development. While a vast array of molecular data awaits exploration, its lack of connectivity with other sources of data and metadata, such as geographical reference, habitat, population size and phenotypic data, often poses a significant barrier to biodiversity research.

This project will develop Odyssey, a web portal with a user-friendly interface that will allow researchers, educators and citizens to navigate the world of molecular biodiversity, using Greece and Norway as case studies: two countries with a characteristic and unique wealth of biodiversity, representative of Mediterranean and Nordic types of ecosystems respectively.

Based on existing sources of information and prototype applications available for specific regions and taxa, this project aims to link actual efforts and develop a new interface to offer diverse functionalities for data exploration and analysis, such as descriptive statistics, graphs, maps, customisable data filters and dynamic visualisations. Through modular design, the application will ensure flexibility and scalability, enabling easy integration of new data sets and analytical tools in the future. This approach will be used for training and communication, inviting traditional biodiversity research groups to utilise new information concerning the spatial patterns of biodiversity and their connection with features that are important for designing conservation measures, such as habitat connectivity, representativity, population demographics, dynamics of adaptation and migration.

Odyssey’s outcome will be a valuable tool for studying and, ultimately, offering a basis for managing and conserving the rich molecular biodiversity of Greece and Norway, as well as supporting the activities of the ELIXIR Biodiversity Community in the two Nodes and in Europe. This will promote collaboration, innovation and knowledge exchange in biodiversity research and beyond. 

This new tool will be developed and offered under an open-source licence, encouraging community participation and contribution to further enhance its capabilities and broaden its applications, fostering a robust network for biodiversity research in Greece and Norway.

Nodes involved: ELIXIR Greece, ELIXIR Norway
Communities: Biodiversity

ELIXIR Belgium, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK, EMBL-EBI, ELIXIR Italy

The ELIXIR Plant Sciences Community has been implementing a distributed infrastructure for FAIR plant genotype-phenotype data publication and access, which aims to support agronomic research and industrial development.

This infrastructure is based on a central search service, FAIDARE, which sits atop a federation of distributed data repositories across several ELIXIR Nodes, all of which implement a common web service specification, the Breeding API (BrAPI, https://brapi.org/). BrAPI ensures accessibility, and also interoperability and reusability because it implements the MIAPPE metadata standard (https://www.miappe.org/), whereas FAIDARE ensures findability.
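
As a rough illustration of what a shared web service specification makes possible, the sketch below builds the same BrAPI v2 `GET /studies` request for several repositories and merges the responses into one list, tagging each record with its origin. It is a minimal sketch and not a description of how FAIDARE itself harvests or indexes data. The endpoint URLs are hypothetical, the helper names are invented for illustration, and each base URL is assumed to already include the `/brapi/v2` prefix.

```python
from urllib.parse import urlencode


def brapi_studies_url(base_url, page=0, page_size=100, **filters):
    """Build a BrAPI v2 GET /studies URL for one endpoint.

    base_url is assumed to end with the /brapi/v2 prefix.
    """
    query = {"page": page, "pageSize": page_size, **filters}
    return f"{base_url.rstrip('/')}/studies?{urlencode(query)}"


def extract_studies(response, source):
    """Read records from the BrAPI response envelope (result.data)
    and tag each one with the endpoint it came from."""
    records = response.get("result", {}).get("data", [])
    return [{"source": source, **record} for record in records]


def merge_results(responses_by_endpoint):
    """Combine already-fetched responses from several endpoints."""
    merged = []
    for source, response in responses_by_endpoint.items():
        merged.extend(extract_studies(response, source))
    return merged


# Hypothetical endpoints; real deployments will differ.
endpoints = [
    "https://node-a.example.org/brapi/v2",
    "https://node-b.example.org/brapi/v2",
]
urls = [brapi_studies_url(e, commonCropName="wheat") for e in endpoints]
```

Fetching the URLs is left out on purpose, so that the sketch stays independent of any particular HTTP client or authentication scheme.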

While the need for data FAIRness and the solutions of the ELIXIR Plant Sciences Community to enable it have been gaining traction in academia, their penetration in industry has been almost nil. Our goal is that industry stakeholders not only make use of the publicly available data in ELIXIR’s infrastructure, but can also deposit their (meta)data in that infrastructure or even implement their own BrAPI endpoints. Realizing this goal requires outreach activities to publicise ELIXIR’s plant data infrastructure and FAIR-enabling standards, and training activities on how to use the infrastructure and standards.

The Navigator Company is a leading force in the international pulp and paper market and one of Portugal's strongest brands on the world stage. Its production structure is based on three major industrial sites in Cacia, Figueira da Foz and Setúbal, where the facilities set international standards for the pulp and paper industry. In addition to its industrial activities, it carries out, mostly through RAIZ Forest and Paper Institute, extensive research on Eucalyptus breeding and genetics, generating genotypic and phenotypic data on over 300,000 specimens across a range of sites and covering up to 4 generations of pedigree. This wealth of data makes The Navigator Company a prime candidate for a pilot knowledge-transfer project to enable it to draw value from and contribute to ELIXIR’s plant data infrastructure.

The goals of this project are (1) to transfer knowledge on standards for FAIR plant data access and publication (particularly BrAPI) from the ELIXIR Plant Sciences Community to The Navigator Company; (2) to collaborate with The Navigator Company in organizing its data on eucalyptus breeding according to the MIAPPE and BrAPI standards; and (3) to establish an access protocol for The Navigator Company to submit its datasets to the ELIXIR-PT BrAPI end-point in bulk.

The accomplishment of these goals will lead to the expansion of the plant datasets provided by the ELIXIR-PT BrAPI end-point to the Plant Sciences community, and more importantly, will bring a key industrial partner into the fold of FAIR plant data publication. Furthermore, due to the prominent role of The Navigator Company in Europe, we expect this project to play a key outreach role and pave the way to further collaborations with other partners in the industry.

ELIXIR Portugal

The aim of this Implementation Study is to determine the requirements for validation with ELIXIR partners, to build prototype open validation services for archetype archival databases and knowledge bases, in particular:

  • Content validation according to minimum information checklists.
  • Syntactic format validation according to a standard format in conjunction with the GA4GH file formats team as part of the Large Scale Genomics Workstream.
  • Syntactic format validation for Phenotyping data.
  • Semantic validation according to a publicly available ontology.
ELIXIR Belgium, ELIXIR France, EMBL-EBI, ELIXIR UK
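
The content and semantic validation steps listed above can be sketched as two simple checks on a metadata record: every field required by a checklist must be present and non-empty, and every ontology-annotated field must carry a term from a known vocabulary. This is only an illustrative sketch and not the prototype services of the study. The checklist, field names and term identifiers below are hypothetical placeholders, and a real service would load the checklist and ontology from their published sources.

```python
def validate_record(record, required_fields, ontology_fields, ontology_terms):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []

    # Content validation: all checklist fields must be filled in.
    for field in required_fields:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            errors.append(f"missing required field: {field}")

    # Semantic validation: annotated fields must use known ontology terms.
    for field in ontology_fields:
        value = record.get(field)
        if value and value not in ontology_terms:
            errors.append(f"unknown ontology term in {field}: {value}")

    return errors


# Hypothetical checklist and vocabulary, for illustration only.
REQUIRED = ["sample_id", "organism", "collection_date"]
ONTOLOGY_FIELDS = ["tissue"]
KNOWN_TERMS = {"EX:0001", "EX:0002"}

record = {"sample_id": "S1", "organism": "Malus domestica", "tissue": "EX:9999"}
problems = validate_record(record, REQUIRED, ONTOLOGY_FIELDS, KNOWN_TERMS)
```

Returning a list of messages instead of raising on the first failure lets a submitter see every problem with a record in one pass.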
ELIXIR Belgium, ELIXIR Cyprus, ELIXIR Czech Republic, ELIXIR Denmark, ELIXIR Estonia, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Hungary, ELIXIR Ireland, ELIXIR Israel, ELIXIR Italy, ELIXIR Luxembourg, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI
ELIXIR Belgium, ELIXIR Cyprus, ELIXIR Czech Republic, ELIXIR Denmark, ELIXIR Finland, ELIXIR France, ELIXIR Germany, ELIXIR Greece, ELIXIR Hungary, ELIXIR Israel, ELIXIR Italy, ELIXIR Netherlands, ELIXIR Norway, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR Spain, ELIXIR Sweden, ELIXIR Switzerland, ELIXIR UK, EMBL-EBI
ELIXIR Belgium, ELIXIR Netherlands

ELIXIR is about integration of diverse resources including tools, training materials and technical services. Within EXCELERATE, ELIXIR is building portals to collate information on tools and data services (bio.tools), training events and material (TeSS, WP11 e-learning environment), compute resources (WP4 technical service registry) and cross-linked policy, standards and databases (FAIRsharing, WP4). A focus of EXCELERATE is to set up these portals such that they can interoperate.

Currently, a scientist can use TeSS to find training events and materials and then, in a separate search, use bio.tools to find relevant tools, and FAIRsharing to find standards and databases. At the moment these ELIXIR portals provide a useful but fragmented service. Ideally, linking TeSS and bio.tools to ELIXIR’s compute resources via common workflow diagrams would enable end-users to discover and learn about the prevalent bioinformatics workflows. In this implementation study, we want to achieve the first step and link TeSS and bio.tools via the most prevalent bioinformatics workflows, and lay the foundation to later incorporate other ELIXIR platforms, such as the compute resources, to provide an even more useful service for the researcher.

The goal of this implementation study is to provide the life-scientist end-user with a powerful tool to find and use ELIXIR resources - across the spectrum - based on intuitive graphical diagrams of the most prevalent scientific workflows.

ELIXIR UK, ELIXIR Estonia, ELIXIR Belgium, ELIXIR Denmark, ELIXIR Switzerland, EMBL-EBI, ELIXIR Norway, ELIXIR France

Over the past four years, the ELIXIR Plant Sciences Community has been making large strides towards enabling FAIR plant phenotyping data: the ELIXIR plant data search service FAIDARE (https://urgi.versailles.inra.fr/faidare/) addresses findability by integrating the various BrAPI (https://brapi.org) end-points of ELIXIR Nodes, which address accessibility and ensure compliance with the MIAPPE metadata standard (https://www.miappe.org/) and therefore interoperability.

Combined, these resources represent a fully FAIR-compliant data management framework. However, there is one final critical hurdle impeding its broad adoption by plant scientists: there is no standardized user-friendly way to submit a dataset to a BrAPI end-point (or more precisely the database underlying it).

The goal of this project is to develop a web interface for MIAPPE-compliant data submission that can be deployed by any plant phenotyping database.

This interface will be modular, including an interactive web form for metadata entry mirroring the organization of MIAPPE, web services for key functionalities such as ontology lookup and validation of MIAPPE compliance, and modules for database entry that upload the data to a database. For the latter, we will develop a module that makes use of BrAPI PUT calls to upload data directly through a BrAPI endpoint, but also a module for uploading the data through FAIRDOM’s SEEK platform, which is already being deployed by some partners.
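
The database-entry module based on BrAPI PUT calls might, in its simplest form, turn the validated form content into an HTTP request against a study resource. The sketch below only constructs such a request and does not send it. The endpoint URL, the identifier, the token handling and the helper name are assumptions made for illustration, and the payload shows only a few study attributes. The exact fields accepted depend on the BrAPI version and on the target server.

```python
import json
from urllib.request import Request


def build_study_put(base_url, study_db_id, study, token=None):
    """Build (but do not send) a PUT request that updates one study.

    base_url is assumed to end with the /brapi/v2 prefix.
    """
    url = f"{base_url.rstrip('/')}/studies/{study_db_id}"
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer-token authentication is an assumption; servers may differ.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(study).encode("utf-8")
    return Request(url, data=body, headers=headers, method="PUT")


# Hypothetical endpoint and study content, for illustration only.
request = build_study_put(
    "https://node-a.example.org/brapi/v2",
    "study-001",
    {"studyName": "Example field trial", "startDate": "2020-03-01"},
)
```

Keeping request construction separate from sending makes it straightforward to test the module offline and to swap in a different transport, such as an upload through SEEK.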

The project is expected to substantially benefit the sustainability and increase the adoption of the data management framework put together by the ELIXIR Plant Sciences Community, which includes key ELIXIR services such as FAIDARE. Furthermore, the project will both build capacity and increase collaboration between ELIXIR Nodes on data management, and enhance interactions between Node experts contributing to the Plant Sciences Community.

ELIXIR Portugal, ELIXIR Netherlands, ELIXIR France, ELIXIR Belgium

Recent progress in sequencing technologies has produced several large scale genotyping data sets for crops. The insights afforded by this data have been published in high profile scientific articles, but the underlying raw genotype data and the associated sample and population metadata have not been routinely submitted to appropriate archives.

The aim of this implementation study, led by the ELIXIR Plant Community and in coordination with the ELIXIR Interoperability Platform and Data Platform, is to provide this wealth of data according to FAIR principles. It will ensure an interoperable link with the phenotypic data stored in distributed institutional repositories, which is crucial for accelerated crop breeding.

We propose to create a sustainable toolbox to submit data to the ELIXIR Deposition Database “European Variation Archive” (EVA) and enrich the data with interoperable metadata regarding plant data standards like “Multi-Crop Passport Descriptor” (MCPD) and “Minimum Information About a Plant Phenotyping Experiment” (MIAPPE).

ELIXIR France, ELIXIR Germany, ELIXIR Belgium, ELIXIR Netherlands, EMBL-EBI
ELIXIR France

The Plant Sciences Community has already implemented some critical elements of its roadmap, but needs some funding to coordinate the next steps.

The first point is about disseminating ELIXIR results through reusable training material and service bundles. The target audience will be biologists, agronomists and bioinformaticians involved in data production and analysis.

The second point is about improving data findability. We propose to specify a European one-stop portal giving access to plant data and tools, in collaboration with the Data Platform. It will leverage existing ELIXIR resources: aggregation databases dedicated to plants (FAIDARE, Ensembl Plants, InterMine) and tools and standards collections (FAIRsharing, bio.tools).

Last, the gathering and formatting of data and metadata increasingly rely on community-driven toolboxes such as FAIRDOM-SEEK, COPO and ISA-Tools. There is an opportunity to improve their interoperability through aligned validation profiles and API use, hence easing submission to ELIXIR Databases.

ELIXIR France, ELIXIR Belgium, ELIXIR Germany, EMBL-EBI, ELIXIR Greece, ELIXIR Italy, ELIXIR Netherlands, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK
ELIXIR France