Notes given in the application form
Eligibility criteria
- Must be an ELIXIR Service (i.e. be part of an existing ELIXIR Node’s Service Delivery Plan, or is ELIXIR commissioned work), or is in the official process/commitment of becoming one. (Required)
- Must have evidence that it supports an interoperability activity, and has been deployed. (Required)
- Must support or be forecast to support FAIR Principles. (Required)
- Should fit into, or be forecast to fit into, the EIP roadmap for data interoperability or other activities relevant to ELIXIR mission.
Additional notes
- Please complete this form by adding information for your Interoperability Resource in the appropriate section below. Consult the Recommended Interoperability Resource (RIR) selection criteria documentation for details on each section.
- Where a panel/question is not relevant to your Interoperability Resource, please leave it blank or mark as “not applicable”, optionally with a brief explanation as to why.
- Word limit guidance is noted for free text fields.
- Please include URLs to external resources, where useful.
- Any questions, contact Sirarat Sarntivijai (sirarat.sarntivijai@elixir-europe.org).
1. Resource facilitation to scientific research
a. Interoperability Resource: Briefly describe the function of the Interoperability Resource
The open source ISA metadata tracking framework facilitates standards-compliant collection, curation, management, publication and reuse of an increasingly diverse set of life science, environmental and biomedical experiments that employ one or a combination of technologies.
Built around the ‘Investigation’ (project context), ‘Study’ (unit of research) and ‘Assay’ (analytical measurement) general-purpose model, ISA Tools help users to provide a rich description of the experimental metadata (i.e. sample characteristics, technology and measurement types, sample-to-data relationships) such that the resulting data, associated methods, code and discoveries are findable, reproducible and reusable. ISA Tools offer many serializations, including tabular, RDF and JSON.
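To make the nesting concrete, the following is a minimal sketch of the Investigation, Study and Assay hierarchy with a JSON view and a tabular view. It is an illustration only: the class and field names (`Assay`, `Study`, `Investigation`, `samples`, `to_json`, `to_rows`) are hypothetical, are not the ISA-API, and the output does not follow the official ISA-JSON or ISA-Tab specifications.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


# Hypothetical, simplified classes for illustration only. They are NOT the
# isatools API, and the output is NOT the official ISA-JSON or ISA-Tab format.
@dataclass
class Assay:
    measurement_type: str  # what is measured, e.g. "metabolite profiling"
    technology_type: str   # how it is measured, e.g. "mass spectrometry"
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    identifier: str
    title: str
    # sample name -> characteristics (e.g. organism), an assumed layout
    samples: Dict[str, Dict[str, str]] = field(default_factory=dict)
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    title: str
    studies: List[Study] = field(default_factory=list)


def to_json(investigation: Investigation) -> str:
    """Serialize the whole hierarchy as nested JSON."""
    return json.dumps(asdict(investigation), indent=2, sort_keys=True)


def to_rows(investigation: Investigation) -> List[str]:
    """Flatten the hierarchy to tab-separated rows, one per data file."""
    header = ["study", "measurement_type", "technology_type", "data_file"]
    rows = ["\t".join(header)]
    for study in investigation.studies:
        for assay in study.assays:
            for data_file in assay.data_files:
                rows.append("\t".join([
                    study.identifier,
                    assay.measurement_type,
                    assay.technology_type,
                    data_file,
                ]))
    return rows


def example_investigation() -> Investigation:
    """Build a tiny, made-up investigation with one study and one assay."""
    assay = Assay("metabolite profiling", "mass spectrometry", ["run1.mzML"])
    study = Study(
        "S1",
        "Example study",
        {"sample1": {"organism": "Homo sapiens"}},
        [assay],
    )
    return Investigation("I1", "Example investigation", [study])


if __name__ == "__main__":
    inv = example_investigation()
    print(to_json(inv))
    print("\n".join(to_rows(inv)))
```

The same in-memory description yields both views, which is the idea behind offering several serializations of one model.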
b. Scope statement: describe the scope, and the users of the resource. How is the Interoperability Resource positioned with respect to other similar Interoperability Resources? Include the base URL and, if relevant, the introductory or “about” page URL.
Started in 2003 and first released in 2007, the ISA framework encompasses a number of open source, stand-alone tools developed over time by the Oxford team and collaborators, or contributed directly by partners.
The components help users to:
1. Collect and curate, following standards: describe the experimental steps using community-defined minimum reporting requirements and ontologies, where possible.
2. Store and browse, locally or publicly: create your own repository to search and browse the experimental description and associated data, hosted openly or privately.
3. Submit to public repositories: when required, reformat experiments for submission to supported public repositories (such as ArrayExpress, PRIDE and ENA) or directly export to those already using ISA formats (such as MetaboLights).
4. Analyse with existing tools: upload experimental descriptions and associated data to a growing number of well-known analysis systems that ISA formats connect with.
5. Release, reason and nanopublish: explore and reason over your experiments, open them to the linked data universe, or publish nano-statements of your discoveries.
6. Publish data alongside your article: directly export your experiments to a new generation of data journals that are accepting submissions in ISA formats.
c. Resource url
http://isa-tools.org/index.html
d. Inter-organisational recognition: does the Interoperability Resource have community recognition? (e.g. demonstrated through a collaboration, geographical diversity in the source of the submissions, international diversity of delivery partners and/or funders)
To foster continued collaboration and ensure that the model, its serializations and the tools are elements of a larger infrastructure network, the ISA Commons was launched in 2012 (Sansone, Rocca-Serra et al., Toward interoperable bioscience data, Nature Genetics, 2012): https://www.isacommons.org. It represents the growing international community of over 40 groups, projects, and organisations that use and contribute to the development of components of the ISA metadata tracking framework - to facilitate standards-compliant collection, curation, management and reuse of datasets. The ISA Commons community encompasses a variety of domains, including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology. This successful network has produced an ecosystem of public and internal resources powered by one or more components of the ISA framework.
Grassroots standards groups have also leveraged the ISA model, illustrating the flexibility of the format and the versatility of the software components. An example is ISA-Tab-Nano, an extension to the ISA tabular version by the Nanotechnology Working Group (Nano WG) of the US NIH National Cancer Informatics Program (Baker et al., Standardizing data, Nature Nanotechnology, 2013); this is now a formal ASTM standard: https://www.astm.org/Standards/E2909.htm
Under EXCELERATE Interoperability WP5, the ISA framework is already recognized by and connected with the ELIXIR EBI and NL Nodes, as well as adopted by the Metabolomics and Plant Use Case communities (for the MIAPPE standard and BrAPI).
2. Community
a. Community impact: If applicable, provide documented evidence of community impact (e.g., publication citations, API calls, projects using the resource, etc.)
Measures of the community impact of the ISA framework are the number and the caliber of the resources that use the ISA model and/or tools; these encompass: (i) local, institute-based (in academia and industry), (ii) project, consortium-based, as well as (iii) global, international repositories listed at: http://www.isacommons.org. Exemplars are: the resource of biology and space-related datasets by the National Aeronautics and Space Administration (NASA, USA)’s GeneLab team, and the MetaboLights repository for metabolomics data at the EMBL-EBI.
The international ISA Commons user base ranges from hundreds to thousands of researchers from increasingly diverse domains, and goes beyond researchers, curators, other resource developers and service providers, to also include journals. For example, ISA is used by Oxford University Press’s GigaScience and underpins Springer Nature’s Scientific Data data journal, supporting intelligent data sharing and credit. Specifically, ISA is used to describe the experiment and to provide browse and search functionality for Scientific Data’s content (http://scientificdata.isa-explorer.org).
The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects awarded to Prof. Sansone. Relevant to ELIXIR are: the Wellcome Trust-funded project to interconnect ISA with InterMine, another ELIXIR-UK resource, to reward researchers for annotating and publishing FAIR data; the BBSRC-funded COPO infrastructure for plant science, embedded in the ELIXIR Plant community; and the EU-funded H2020 PhenoMeNal project, a European network on large-scale computing for medical metabolomics, embedded in the ELIXIR Metabolomics community.
b. Potential usage: Describe other systems that could use this candidate resource, but currently do not.
As ISA is open source, other tools and systems could reuse, extend and/or interconnect to one or more elements of the ISA framework.
For example, most major research projects rely on a range of techniques and data acquisition modalities to phenotype living systems and obtain insights into biological processes. Therefore, the need for extensible systems that remain compatible with existing formats is crucial. With the recent uptake of ISA-Galaxy tools (https://github.com/ISA-tools/isatools-galaxy) and integration with the Galaxy Framework, ISA has reached a major milestone by showcasing how prospective data management can be done, demonstrating a full deposition workflow to the EMBL-EBI MetaboLights repository.
c. Outreach & support: Provide resource support publication(s)/user documentation(s) describing the Interoperability Resource (e.g. scientific journal publications, community preprints, resource user’s documentations etc.), resource dissemination plan (e.g. workshops, conference presentations), and other equal-opportunity research support (if applicable).
Outreach and support are provided by the operational team in Oxford, in collaboration with the ISA Working Group, which includes many international researchers and developers (https://www.isacommons.org), via GitHub, the isaforum@googlegroups.com email, and the @isatools Twitter account.
ISA is regularly presented by the members of the ISA operational team, and major news and updates are also presented on the ISA blog: http://isa-tools.org/blog/index.html
It is not possible to list all the peer-reviewed publications on the different components of ISA and the resources it powers, due to their high number and the space limit in this form. However, we want to highlight “The future of metabolomics in ELIXIR” (F1000Res. 2017) by members of the ELIXIR Metabolomics Community.
In addition, Jupyter notebooks (https://github.com/ISA-tools/dtp-isa-exercises) have been developed as teaching material to showcase the use of the ISA-API in various contexts. The notebooks have been successfully used in several undergraduate and postgraduate courses on data readiness in Oxford, as well as at data hackathons organized and run by the ISA operational team.
The ISA format will become a native Galaxy data type as part of the September release of the Galaxy tool. Support in the Galaxy community is being developed, in the form of a “Galaxy tour” https://github.com/ISA-tools/isatools-galaxy/blob/develop/tours/isacrea… (Rocca-Serra P et al. ISAcreate Galaxy tool for prospective data management with ISA format support – application to metabolomics datasets: 10.7490/f1000research.1115757.1)
d. Dependency of other resources: How is this resource critical to the user(s)? Do other resources depend on the resource described here to provide downstream service? Please list, or provide a link to a diagram.
A number of ISA-powered systems depend on one or more ISA components. Listing them all is not possible; here we highlight key exemplars close to the ELIXIR community:
- The EMBL-EBI MetaboLights repository relies on the ISA model, ISA serializations and the ISA API for a number of critical services in its production environment. The new web-based submission interface to this European repository of metabolomics data relies on the ISA-JSON format for building web components and on the ISA-API to validate and convert experiments represented as ISA objects;
- The BBSRC-funded COPO infrastructure, part of the ELIXIR-UK Node, relies on the ISA API, the ISA-JSON serialization and the ISA configurations to support plant-based molecular profiling experiments. It also uses the ISAconverter to deposit to the EMBL-EBI European Nucleotide Archive;
- ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use ISA Galaxy Tools, the ISA-API and the ISA validator, as part of their work in the UK Phenome Centre, both to collect data prospectively and to organise public deposition to repositories;
- The ELIXIR Plant Community’s MIAPPE standards and BrAPI rely on availability of ISA parsers and validation tools in the context of data validation programs, also part of an ELIXIR Implementation Study.
3. Quality of resource
a. Uptime: Average percentage uptime/month during the last 12 months, response time of the resource. In case of ontology/standards production, interval of update/release, adaptability of ontology design patterns to evolving data. Provide information where applicable: uptime of resource, software release cycle (please state week/month etc), update frequency.
ISA, since its inception, has been developed as an open source project, taking full advantage of the GitHub code repository infrastructure. From the core code sharing functionality, to website hosting and documentation serving, ISA has demonstrated robustness and operational stability by relying on industry-strength GitHub infrastructure.
From the software engineering aspect, ISA developers have chosen a ‘release early, release often’ approach, thus encouraging early evaluation and code reviews by interested partners, leading to rapid identification of bottlenecks, improvements and bug fixes. The bug tracking (issue tracker), continuous integration frameworks, release schedule and pacing, as well as communication among developers and contributors, have been facilitated by integration with plugins and extensions such as waffle.io, Slack and, more recently, gitter.im.
b. Accessibility: what are resource retrieval mechanisms? Does the resource provide web-based user interface, application programmable interface (API), containers, and/or other channels? Please list resource access mechanism, provide URLs as applicable.
The ISA resource is a code base and a set of software components, accessed via a dedicated GitHub code repository: https://github.com/ISA-tools.
For some components, the ISA operational team maintains services to allow evaluations. For example, an instance of the ISA REST API (https://github.com/ISA-tools/isa-rest-service) is available from the University of Oxford’s Oxford e-Research Centre hosting services (http://frog.oerc.ox.ac.uk/api/spec.html#!/spec).
ISA components “in action” can be accessed from partner projects such as the EMBL-EBI MetaboLights or Springer Nature’s Scientific Data ISA-explorer.
c. Maintenance quality: Is there a maintenance SOP or plan, reflecting sustainability and scalability? Does it align with guidelines for sustainable software development? Please include a resource commitment statement (description text or URL).
Since a key member of the ISA operational team (Dr. Alejandra Gonzalez-Beltran) is also a fellow of the Software Sustainability Institute, best practices (code available via public GitHub repository, continuous integration via Travis CI, monthly releases, unit tests, distribution as bioconda container and Python package) have been applied throughout.
Furthermore, several ISA components have also passed User Acceptance Testing carried out by industrial users and collaborators at Janssen Research, The Novartis Institutes for BioMedical Research and the FDA’s Center for Bioinformatics at NCTR.
d. Support quality: Please list support mechanisms (e.g., point of contact, request ticketing, resource’s response time where a solution is identified, etc.), and methods to collect user feedback. If available, list tutorial documentations or tutorial materials and format, including linking on the ELIXIR’s Training Portal (TeSS) (or other training platforms) where applicable.
The ISA website, the related GitHub repository, the email list (archived on Google Groups), the YouTube channel, and the Twitter account serve as points of contact and support. The response time to questions and issues is usually a day or less.
Some ISA-powered resources have material deposited in TeSS, e.g.: https://tess.elixir-europe.org/materials/metabolights-quick-tour#home.&…;
Here we highlight some other key support material:
- Read the Docs sites for two main components (ISA API and ISA model specifications): https://isatools.readthedocs.io/en/latest; https://isa-specs.readthedocs.io/en/latest/
- Python package: https://pypi.org/project/isatools/#history
- Distribution as bioconda container: https://anaconda.org/bioconda/isatools
4. Legal framework, funding, and governance
a. Legal framework: What are the resource’s license/terms of use? Can the license facilitate Open Science? Please include the url for the license the resource uses.
License: Common Public Attribution License Version 1.0 (CPAL): https://raw.githubusercontent.com/ISA-tools/isa-api/master/LICENSE.txt. It encourages Open Science by ensuring credit is given, and is comparable in intent to the CC-BY-SA license.
b. Privacy/Ethics policy: If applicable, is there a publicly available privacy policy in which use and security around personal data are described (e.g. the EU General Data Protection Regulation (GDPR), ELIXIR Ethics Policy, other relevant ELIXIR Policies)? Please include the url of the privacy/ethics policy, if applicable.
Not directly applicable.
However, under the EU H2020 PhenoMeNal project, ISA configurations and ISA tools have been created to include GA4GH-developed controlled vocabularies to enable tracking of consent and data use requirements associated with patient-based studies. The work has been performed in collaboration with UK Phenome Centres and other PhenoMeNal partners. The Galaxy tool ISA-create implements such recommendations.
Functions in the ISA API allowing encryption have also been developed. However, such duties should be the responsibility of the institution hosting and running the ISA-API.
c. Funding & sustainability plan: List of funding sources supporting the resource, and sustainability plan.
The work in and around the ISA framework has been embedded and sustained by a portfolio of research and infrastructure projects awarded to Prof. Sansone, including: BBSRC COPO Collaborative Open Plant Omics infrastructure (BB/L024101/1), EC PhenoMeNal phenome and metabolome analysis infrastructure (H2020-EU.1.4.1.3, 654241), EC MultiMot cell migration infrastructure (H2020-EU.3.1, 634107), EXCELERATE (H2020-EU.1.4.1.1, 676559), IMI IMPRiND (IMI 116060), NIH bioCADDIE Data Discovery Index (NIH 1U24AI117966-01), NIH Data Commons (NIH 1OT3OD025459-01, NIH 1OT3OD025467-01, NIH 1OT3OD025462-01). Past projects that have co-funded BioSharing activities include, but are not limited to: NIH CEDAR Centre for Extended Data Annotation and Retrieval (U54 NIH BD2K AI117925), IMI eTRIKS translational research infrastructure (IMI 115446), NERC Bioinformatics Centre partnership funds, BBSRC Omics Standards (BB/E025080/1), BBSRC ISA (BB/I000917/1), BBSRC MGportal (BB/I025840/1), EC COSMOS (FP7 312941).
d. Governance: Describe the Resource’s QA/QC plan that guarantees similar quality governance to that of ELIXIR. Please link SAB members, if applicable.
The sustainability and maintenance of the ISA framework are guided by the ISA Working Group, which includes many international researchers and developers (https://www.isacommons.org), among them members of the ELIXIR Nodes and of the Metabolomics and Plant User Communities, such as Rob Davey (Earlham Institute), Ralf Weber (University of Birmingham) and Cyril Pommier (Unité de Recherche Génomique Info).