Notes given in the application form
Eligibility criteria
- Must be an ELIXIR Service (i.e. part of an existing ELIXIR Node’s Service Delivery Plan, or is an ELIXIR Commissioned Service), or is in the official process/commitment of becoming one. (Required)
- Must have evidence that it supports an interoperability activity, and has been deployed. (Required)
- Must indicate how it supports the FAIR Principles. (Required)
- Should fit into the EIP service framework in the ELIXIR 2019-2023 Scientific Programme for data interoperability or other activities relevant to the ELIXIR mission.
Additional notes
- Please complete this form by adding information for your Interoperability Resource in the appropriate section below. Consult the Recommended Interoperability Resource (RIR) selection criteria documentation for details on each section.
- Where a panel/question is not relevant to your Interoperability Resource, please leave it blank or mark as “not applicable”, optionally with a brief explanation as to why.
- Word limit guidance is noted for free text fields.
- Please include URLs to external resources, where useful.
- Any questions, contact Sirarat Sarntivijai (sirarat.sarntivijai@elixir-europe.org).
1. Resource facilitation to scientific research
a. Interoperability Resource: Briefly describe the function of the Interoperability Resource
The JSON schema validator provides a library that can be used to validate data in JSON format against a specification written as a JSON schema. This enables domain-independent validation of metadata using modern web standards for data representation.
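As an illustration only, the kind of check described above can be sketched in a few lines of Python. This is not the library's implementation, which is a Node.js package built on AJV. The function name validate, the sample schema, and the fields alias and taxonId are hypothetical, and only a small subset of JSON Schema keywords (type, required, properties, enum) is handled.

```python
import json

# Maps JSON Schema type names to Python types (illustrative subset only).
JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def _matches_type(value, expected):
    """Check a value against a single JSON Schema type name."""
    # bool is a subclass of int in Python, so handle it before numeric types.
    if isinstance(value, bool):
        return expected == "boolean"
    if expected == "integer":
        return isinstance(value, int) or (isinstance(value, float) and value.is_integer())
    return isinstance(value, JSON_TYPES[expected])


def validate(instance, schema, path="$"):
    """Return a list of error messages; an empty list means the instance is valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None and not _matches_type(instance, expected):
        # A type mismatch makes the remaining checks meaningless, so stop here.
        return [f"{path}: expected {expected}"]
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: value not in enum")
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}: missing required property '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema, f"{path}.{name}"))
    return errors


# Hypothetical schema and sample record, for illustration only.
schema = {
    "type": "object",
    "required": ["alias", "taxonId"],
    "properties": {
        "alias": {"type": "string"},
        "taxonId": {"type": "integer"},
    },
}
sample = json.loads('{"alias": "sample-1", "taxonId": "9606"}')
errors = validate(sample, schema)
print(errors)
```

Here the sample is rejected because taxonId is supplied as a string rather than an integer. The same schema could be applied unchanged to records from any domain, which is the sense in which the validation is domain-independent.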
b. Scope statement: Describe the scope and the users of the resource. How is the Interoperability Resource positioned with respect to other similar Interoperability Resources? Include the base URL and, if relevant, the introductory or “about” page URL.
A generic process for metadata validation is an unfulfilled need in the Interoperability Service Reference framework, /platforms/interoperability, required by the current FAIR Service architecture task. Through the ELIXIR Validation implementation study we developed a generic metadata validator, built by the community and addressing specific use cases within and outside of ELIXIR.
Home page URL:
https://www.npmjs.com/package/elixir-jsonschema-validator
c. Resource URL
https://github.com/elixir-europe/json-schema-validator
d. Inter-organisational recognition: does the Interoperability Resource have community recognition? (e.g. demonstrated through a collaboration, geographical diversity in the source of the submissions, international diversity of delivery partners and/or funders)
The OpenEBench benchmarking data model, which is described as a set of JSON Schemas (https://github.com/inab/benchmarking-data-model/tree/1.0.x/json-schemas), depends on extended validation to check all of its restrictions. Its validator (https://github.com/inab/extended-json-schema-validators) has been reused and further extended in the FAIRTracks ecosystem, and its “term” extension (https://github.com/fairtracks/fairtracks_validator/#readme) conceptually resembles the ontological validations implemented in the ELIXIR JSON Schema Validator. An ongoing implementation will adhere to the notation used by the ELIXIR JSON Schema Validator.
FONDUE Implementation study: Submitting genotypic data to ELIXIR Deposition Databases such as EVA, or genomes to ELIXIR Core Data Resources such as ENA, can be lengthy and require a lot of trial and error, especially to ensure a high standard of metadata. Apart from the obvious categories (authors, project, laboratories, technologies used, file format), it is critical to ensure the quality of key attributes, including plant material identification (as proposed in the MCPD and MIAPPE standards), genome assembly version, and the computational tools and environments used to generate the data. This level of data and metadata quality can be achieved with the help of data validation through the JSON schema validator.
2. Community
a. Community impact: If applicable, provide documented evidence of community impact (e.g., publication citations, API calls, projects using the resource, etc.)
Human Cell Atlas: metadata standards are defined natively in JSON schema, and the resource is used by the Data Coordination Platform to ensure conformance to those standards. HCA also contributed the ontology expansion validation capability to the resource.
EBI Data Submission Portal: the resource is being used to consistently validate sample metadata at the point of data submission. ENA checklists have been converted from the legacy XML to JSON schema, and a checklist editor is under development.
OpenEBench: the extended JSON validator has taken some ideas from this library's implementation of ontological term validation. This work helped to clarify the behaviour that needed to be implemented in the “term” extension for the FAIRTracks ecosystem (https://fairtracks.github.io).
Global Alliance for Genomics and Health: used to validate Phenopackets (http://phenopackets.org) against the SchemaBlocks JSON schema (https://github.com/ga4gh-schemablocks/sb-phenopackets).
Plant community: there is a need to improve the genomic/genotyping submission workflow from Breeding API compliant databases to Core Data Resources. To this end, prototypes that take advantage of the JSON schema validator were developed during the 2019 BioHackathon.
To position this validator as the common standard across those resources, we have been calling it internally the “ELIXIR JSON Schema validator”. If this is agreeable, we would like to request official use of the ELIXIR name.
b. Potential usage: Describe other systems that could use this candidate resource, but currently do not.
The library can be adopted by any resource provider to ensure compliance with a specification. This enables the same validation procedures to be used in local data management systems, in domain-specific or generic repositories, and in registries. Many systems that broker submission to such services exist, and more are emerging. The ELIXIR JSON schema validator gives users direct feedback on compliance before they submit, rather than leaving errors to be resolved upon submission.
We are also planning to deploy the tool in the context of the FAIRplus project. This project aims at FAIRifying IMI and EFPIA datasets, which will require validation. In phase 1 of the FAIRplus project we developed a transcriptomics checklist as a JSON schema, which we would like to use in conjunction with the validator. To achieve this goal we will need to agree on a common JSON schema for users to be able to submit their own schemas.
c. Outreach & support: Provide resource support publication(s)/user documentation(s) describing the Interoperability Resource (e.g. scientific journal publications, community preprints, resource user’s documentations etc.), resource dissemination plan (e.g. workshops, conference presentations), and other equal-opportunity research support (if applicable).
- Documentation available at https://github.com/elixir-europe/json-schema-validator#readme
- Publication in preparation
- ELIXIR Webinar, available from /events/elixir-webinar-data-validation
d. Dependency of other resources: How is this resource critical to the user(s)? Do other resources depend on the resource described here to provide downstream service? Please list, or provide a link to a diagram.
To provide an interoperable ecosystem, we need to ensure compliance with standards. Currently, each provider implements validation in a different way because there is no common platform or library to build on. This leads to different interpretations of a specification and therefore to different validation outcomes. This library addresses the problem by offering a common basis for validating against any specification described in JSON schema.
3. Quality of resource
a. Uptime: Average percentage uptime/month during the last 12 months, response time of the resource. In case of ontology/standards production, interval of update/release, adaptability of ontology design patterns to evolving data. Provide information where applicable: uptime of resource, software release cycle (please state week/month etc), update frequency.
This package can be used directly as a library or run as a Node.js server that receives validation requests and returns the results. Validation is performed with the AJV library, version 6.0.0, which fully supports JSON Schema draft-07.
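The server mode described above amounts to a simple request/response exchange. The sketch below, written in Python for brevity, is illustrative only: the payload keys "schema" and "object", the response shape with "valid" and "errors", and the helper names are assumptions rather than the package's documented interface, and check_required is a stand-in for the real AJV-based validation.

```python
import json
from typing import Any, Callable, List

# A validator takes (instance, schema) and returns a list of error messages.
Validator = Callable[[Any, dict], List[str]]


def check_required(instance: Any, schema: dict) -> List[str]:
    """Stand-in validator: only checks the schema's top-level 'required' list."""
    if not isinstance(instance, dict):
        return ["instance is not an object"]
    return [
        f"missing required property '{name}'"
        for name in schema.get("required", [])
        if name not in instance
    ]


def handle_validation_request(body: str, validator: Validator = check_required) -> str:
    """Turn a JSON request body into a JSON response body (hypothetical shapes)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return json.dumps({"valid": False, "errors": [f"malformed JSON: {exc.msg}"]})
    if not isinstance(payload, dict) or "schema" not in payload or "object" not in payload:
        return json.dumps({"valid": False, "errors": ["request needs 'schema' and 'object'"]})
    errors = validator(payload["object"], payload["schema"])
    return json.dumps({"valid": not errors, "errors": errors})


# Example exchange with a hypothetical request body.
request_body = '{"schema": {"required": ["alias"]}, "object": {}}'
print(handle_validation_request(request_body))
```

Keeping the validator as a pluggable function means the same handler could sit behind an HTTP endpoint or be called in-process, mirroring the two modes of use described above.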
b. Accessibility: what are resource retrieval mechanisms? Does the resource provide web-based user interface, application programmable interface (API), containers, and/or other channels? Please list resource access mechanism, provide URLs as applicable.
The resource is a library that others can use directly as part of their code.
https://github.com/elixir-europe/json-schema-validator#readme
c. Maintenance quality: Is there a maintenance SOP or plan, reflecting sustainability and scalability? Does it align with guidelines for sustainable software development? Please include a resource commitment statement (description text or URL).
The resource’s maintenance aligns with the sustainability plan:
(1) Usability: it is maintained as a shared library, used by different projects. It is easy to understand because it is based on established technical standards and is well documented.
(2) Sustainability and maintainability: As described above, we plan to apply for use of the ELIXIR branding, in addition to the existing licence on the code. The community is diverse and is documented in the publication in preparation. As the validator leverages AJV at its core, it is straightforward for others to understand the source and propose contributions.
d. Support quality: Please list support mechanisms (e.g., point of contact, request ticketing, resource’s response time where a solution is identified, etc.), and methods to collect user feedback. If available, list tutorial documentations or tutorial materials and format, including linking on the ELIXIR’s Training Portal (TeSS) (or other training platforms) where applicable.
The library’s public GitHub page can be used to submit support queries:
https://github.com/elixir-europe/json-schema-validator
Any user (usually a developer) with a sufficient understanding of the project can respond to the submitted issues. The same platform can also be used to collect user feedback.
The main page of the repository contains all the documentation necessary for users to get started. All the steps from installing the library to using it are explained in detail.
The library is packaged and published to the following public Node.js package repository:
https://www.npmjs.com/package/elixir-jsonschema-validator
4. Legal framework, funding, and governance
a. Legal framework: What are the resource’s license/terms of use? Can the license facilitate Open Science? Please include the url for the license the resource uses.
Apache 2.0, available from https://github.com/elixir-europe/json-schema-validator/blob/master/LICE…
b. Privacy/Ethics policy: If applicable, is there a publicly available privacy policy in which use and security around personal data are described (e.g. the EU General Data Protection Regulation (GDPR), ELIXIR Ethics Policy, other relevant ELIXIR Policies)? Please include the url of the privacy/ethics policy, if applicable.
Not applicable.
c. Funding & sustainability plan: List of funding sources supporting the resource, and sustainability plan.
The resource was supported by the ELIXIR Validation implementation study. It is now supported by core funding from EMBL-EBI as well as funding from other sources, such as the FONDUE implementation study, or the Human Cell Atlas.
d. Governance: Describe the Resource’s QA/QC plan that guarantees similar quality governance to that of ELIXIR. Please link SAB members, if applicable.
The resource has recently been released, and we have kept the governance process lightweight. Changes can be requested through GitHub, and need to be tested and approved by the current active developers of the resource: the EMBL-EBI Data Submission Portal and the Human Cell Atlas.
The resource is reviewed as part of other products and services, e.g. through the BioSamples SAB.