Deploying Reproducible Containers and Workflows Across Cloud Environments

The study will convene the community to establish a consensus on high-level, community-driven standards:

  • Workflow / Task Orchestration Service: A minimal API specification that will support heterogeneous containerised workflow workloads (e.g. CWL, Galaxy, Nextflow) for secure execution across the ELIXIR Federation of compute/cloud sites.
  • Tool / Workflow Registry Service: A minimal API specification that will provide access to curated, heterogeneous container formats (e.g. Docker, Singularity) and workflow specifications (e.g. CWL, Galaxy, Nextflow) to be used as part of the workflow orchestration service.
  • Data Repository Service: A minimal API specification that will support the discovery of, and secure access to, ELIXIR Core Data Resources and ELIXIR Node-provided datasets as part of the workflow orchestration service.
  • Data and Workflow Security Protocols: Security based on ELIXIR AAI, embedded across all ELIXIR APIs, to ensure secure access to data, tools and workflows so that analyses can be performed on sensitive data.
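
To make the relationship between the four items above more concrete, the following is a minimal, purely illustrative Python sketch of how a registry record, repository records and an access token might be combined into a single run request for an orchestration service. It does not reproduce any ELIXIR or other real API: every class, field, function name and URL here (ToolEntry, DataObject, RunRequest, build_run_request, the example.org addresses) is a hypothetical placeholder, and the token check is a stand-in for whatever an AAI-backed service would actually do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolEntry:
    """Hypothetical registry record: a workflow description and its container format."""
    tool_id: str
    workflow_type: str      # e.g. "CWL", "Galaxy", "Nextflow"
    container_format: str   # e.g. "Docker", "Singularity"
    descriptor_url: str


@dataclass(frozen=True)
class DataObject:
    """Hypothetical repository record: a dataset identifier and where to fetch it."""
    object_id: str
    access_url: str
    sensitive: bool = False


@dataclass
class RunRequest:
    """Hypothetical payload an orchestration service would need to start a run."""
    workflow_type: str
    descriptor_url: str
    container_format: str
    input_urls: list = field(default_factory=list)


def build_run_request(tool, data_objects, token):
    """Combine registry and repository records into one run request.

    `token` stands in for a credential issued by an AAI. This sketch only
    checks that it is non-empty when sensitive inputs are present; a real
    service would validate it properly.
    """
    needs_auth = any(obj.sensitive for obj in data_objects)
    if needs_auth and not token:
        raise PermissionError("sensitive inputs require an access token")
    return RunRequest(
        workflow_type=tool.workflow_type,
        descriptor_url=tool.descriptor_url,
        container_format=tool.container_format,
        input_urls=[obj.access_url for obj in data_objects],
    )


# Placeholder records; none of these identifiers or URLs refer to real resources.
tool = ToolEntry(
    tool_id="example-tool",
    workflow_type="CWL",
    container_format="Docker",
    descriptor_url="https://registry.example.org/tools/example-tool/descriptor",
)
public = DataObject("obj-1", "https://data.example.org/obj-1")
private = DataObject("obj-2", "https://data.example.org/obj-2", sensitive=True)

request = build_run_request(tool, [public, private], token="placeholder-token")
print(request.workflow_type, len(request.input_urls))
```

The point of the sketch is the separation of concerns: the registry says what to run, the repository says what to run it on, and the security layer decides whether the combination is allowed before anything reaches the orchestration service.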

This will be achieved by coordinating the expertise in the ELIXIR Platforms (Compute & Tools) and the work taking place within the Nodes and related projects (e.g. EOSC-Life, EOSC-Hub). The study will be broken down into three work packages:

  1. Leveraging EOSC-Life workflows infrastructure
  2. ELIXIR Infrastructure for Orchestrating Containers and Workflows
  3. Coordinating ELIXIR Data Discovery and Transfer Services

and a number of Community-Led Use Cases:

  1. Human data requiring security protocols provided by AAI, and data transfer services
  2. Single-cell transcriptomics workflow that could be adapted for different organisms
  3. Metabolomics workflow adopted by the PhenoMeNal project
  4. Proteomics workflow aligned with the Implementation Study (IS) on benchmarking workflows, to ensure reproducible research and support potential clinical applications