Identifiers.org

Notes given in the application form

Eligibility criteria

  • Must be an ELIXIR Service (i.e. be part of an existing ELIXIR Node’s Service Delivery Plan, or is ELIXIR commissioned work), or is in the official process/commitment of becoming one. (Required)
  • Must have evidence that it supports an interoperability activity, and has been deployed. (Required)
  • Must support or be forecast to support FAIR Principles. (Required)
  • Should fit into, or be forecast to fit into, the EIP roadmap for data interoperability or other activities relevant to ELIXIR mission.

Additional notes

  • Please complete this form by adding information for your Interoperability Resource in the appropriate section below. Consult the Recommended Interoperability Resource (RIR) selection criteria documentation for details on each section.
  • Where a panel/question is not relevant to your Interoperability Resource, please leave it blank or mark as “not applicable”, optionally with a brief explanation as to why.
  • Word limit guidance is noted for free text fields.
  • Please include URLs to external resources, where useful.
  • For any questions, contact Sirarat Sarntivijai (sirarat.sarntivijai@elixir-europe.org).

1. Resource facilitation to scientific research

a. Interoperability Resource: Briefly describe the function of the Interoperability Resource

URLs of biomolecular resources often change, and heavily used resources such as UniProt may have multiple instances and URLs around the world.

Identifiers.org provides

  1. A Registry that catalogs and stores metadata of data repositories, including up-to-date access URLs
  2. A Resolver that provides stable URLs of the form https://identifiers.org/

This relieves bioinformatics resources of the need to constantly maintain their own mappings to external resources. It also allows Compact Identifiers like uniprot:P12345 to be used to cite biomolecular entities in scientific text; these can then easily be converted to resolvable URLs by prefixing the resolver address https://identifiers.org/
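The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name compact_id_to_url and its validation rule are assumptions made for this example, not part of the Identifiers.org services.

```python
def compact_id_to_url(compact_id: str, resolver: str = "https://identifiers.org/") -> str:
    """Turn a Compact Identifier such as 'uniprot:P12345' into a resolver URL.

    Hypothetical helper for illustration; it only prefixes the resolver
    address and does not check that the prefix exists in the Registry.
    """
    # Split on the first colon only, so accessions that themselves
    # contain a colon are kept intact.
    prefix, sep, accession = compact_id.strip().partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a Compact Identifier: {compact_id!r}")
    return f"{resolver.rstrip('/')}/{prefix}:{accession}"


print(compact_id_to_url("uniprot:P12345"))
# prints https://identifiers.org/uniprot:P12345
```

Whether the resulting URL actually resolves depends on the prefix being registered, which this sketch does not verify.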

b. Scope statement: describe the scope, and the users of the resource. How is the Interoperability Resource positioned with respect to other similar Interoperability Resources? Include the base URL and, if relevant, the introductory or “about” page URL.

The target groups for Identifiers.org are

  1. Bioinformatics service providers. They can either use the identifiers.org Registry to maintain their mappings to external resources, or they can directly and systematically use the Identifiers.org Resolver, referencing all external resources in the form https://identifiers.org/
  2. Journals and other managers of scientific text production. The systematic use of Compact Identifiers like uniprot:P12345 makes references to external biomolecular data objects compact, readable, and easily resolvable.
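As an illustration of the second use case, the sketch below finds Compact Identifiers in a piece of text and lists the corresponding resolver URLs. The prefix allowlist, the accession pattern, and the function name are all assumptions made for this example; a production tool would more likely take its prefixes from the Registry.

```python
import re

# Hypothetical allowlist for this example only.
KNOWN_PREFIXES = ("uniprot", "pmc")

# Assumed accession pattern: letters, digits, and underscores, so that
# trailing sentence punctuation is not captured.
COMPACT_ID = re.compile(r"\b(" + "|".join(KNOWN_PREFIXES) + r"):([A-Za-z0-9_]+)")


def find_resolvable_urls(text: str, resolver: str = "https://identifiers.org/") -> list:
    """Return one resolver URL per Compact Identifier found in the text."""
    return [f"{resolver}{m.group(1)}:{m.group(2)}" for m in COMPACT_ID.finditer(text)]


sentence = "The protein uniprot:P12345 is discussed in pmc:PMC3245029."
for url in find_resolvable_urls(sentence):
    print(url)
```

Restricting matches to known prefixes keeps ordinary text containing a colon, such as "note:this", from being treated as an identifier.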

Identifiers.org is the only manually curated resource of its kind, and is consumed by other, related resources like Prefix Commons and n2t.net.

More detail is available at https://identifiers.org/about

c. Resource url

https://identifiers.org

d. Inter-organisational recognition: does the Interoperability Resource have community recognition? (e.g. demonstrated through a collaboration, geographical diversity in the source of the submissions, international diversity of delivery partners and/or funders)

Identifiers.org was first published in 2011 [1].

It has been developed and maintained with funding from the ELIXIR Preparatory Phase, UK BBSRC, and EMBL core funding. It is currently supported by EU FREYA and the US NIH Data Commons Pilot Phase Consortium (DCPPC).

In a Memorandum of Understanding signed in 2017, EMBL-EBI and the California Digital Library commit to joint support for resolution of Compact Identifiers through identifiers.org and n2t.net, respectively.

[1] Juty N, Le Novère N, Laibe C. Identifiers.org and MIRIAM Registry: community resources to provide persistent identification. Nucleic Acids Res. 2012 Jan;40(Database issue):D580-6. http://identifiers.org/pmc:PMC3245029

2. Community

a. Community impact: If applicable, provide documented evidence of community impact (e.g., publication citations, API calls, projects using the resource, etc.)

The original Identifiers.org publication [1] has 139 citations (Google Scholar).

Bioinformatics resources like EuropePMC, BioModels, Reactome, WikiData, and organisations like Open Targets and the Database Center for Life Science, Japan, are already using Identifiers.org for efficient maintenance of external object references.

Beyond its original Registry/Resolver function, Identifiers.org is in intensive use in the linked data community [EBI RDF, Open Phacts] as a tool to unify references to external data objects.

In the first half of 2018, an average of 10,883 unique hosts per month generated 876,520 requests per month.

Our recent publication on Compact Identifiers [2] from May 2018 already has an Altmetric Score of 40 (as of July 5, 2018). In an associated editorial, the editors state "Scientific Data is changing the way we incorporate links into our data citations. We will now be taking advantage of the resolver services offered by identifiers.org and N2T.net to provide more standardized and predictable links for biomedical datasets that have accession identifiers when they are cited in our publications." [3]

[1] Juty N, Le Novère N, Laibe C. Identifiers.org and MIRIAM Registry: community resources to provide persistent identification. Nucleic Acids Res. 2012 Jan;40(Database issue):D580-6. http://identifiers.org/pmc:PMC3245029
[2] Wimalaratne SM, et al. Uniform resolution of compact identifiers for biomedical data. Sci Data. 2018 May 8;5:180029. http://identifiers.org/pmc:PMC5944906
[3] On the road to robust data citation. Sci Data. 2018 May 8;5:180095. http://identifiers.org/pmc:PMC5944907

b. Potential usage: Describe other systems that could use this candidate resource, but currently do not.

While Identifiers.org has high penetration in its "source community", systems biology, there is still a lot of untapped potential in the wider bioinformatics community: many resources that could use Identifiers.org for more efficient management of external data references are not (yet) using it. With the recent strong interest in and funding for Globally Unique Identifiers (GUIDs), we are starting to see increasing adoption.

As pointed out in the editorial cited in the previous section a), the large potential for simplified referencing of biomolecular entities in the scientific literature through Compact Identifiers, with resolution based on Identifiers.org, is only beginning to be realised.

c. Outreach & support: Provide resource support publication(s)/user documentation(s) describing the Interoperability Resource (e.g. scientific journal publications, community preprints, resource user’s documentations etc.), resource dissemination plan (e.g. workshops, conference presentations), and other equal-opportunity research support (if applicable).

We provide the following current documentation resources:

User documentation

Compact Identifier standard

Webinars

Upcoming course (July 2018):
https://www.force11.org/fsci/workshop-registering-and-using-compact-ide…

Recent conference (January 2018):
https://pidapalooza18.sched.com/event/Cwmn/identifiersorg-compact-ident…

Recent news releases (2018):
https://www.esciencelab.org.uk/announcements/publications/2018/05/08/co…
https://www.nature.com/articles/sdata201895
https://www.ebi.ac.uk/about/news/announcement/global-standards-biomedic…
https://www.infodocket.com/2018/05/08/force-11-introducing-a-new-standa…
/news/identifiersorg-n2tnet

d. Dependency of other resources: How is this resource critical to the user(s)? Do other resources depend on the resource described here to provide downstream service? Please list, or provide a link to a diagram.

The resources listed in a) depend on Identifiers.org for the maintenance of their references to external resources. These could be replaced by in-house maintained mappings to URLs in each of the resources, which would be much less efficient and reliable.

The large-scale linked data resources Open Phacts and the EBI RDF platform depend strongly on Identifiers.org for the unification of external references, and would require significant re-engineering without it.

The cloud deployment of Identifiers.org in the NIH DCPPC context is still in its early stages, but it is a critical component for the implementation of Globally Unique Identifiers (GUIDs) in this project. It avoids "identifier inflation", for example through minting DOIs for billions of objects in existing biomolecular resources like UniProt and Ensembl.

3. Quality of resource

a. Uptime: Average percentage uptime/month during the last 12 months, response time of the resource. In case of ontology/standards production, interval of update/release, adaptability of ontology design patterns to evolving data. Provide information where applicable: uptime of resource, software release cycle (please state week/month etc), update frequency.

Identifiers.org is provided through the EMBL-EBI high performance infrastructure, with two redundant, load-balanced servers in a Tier 3 IT service center in London. It is monitored internally by the Nagios system, and externally by host-tracker.com.

Any edits to Identifiers.org mappings are propagated to the production servers immediately.

b. Accessibility: what are resource retrieval mechanisms? Does the resource provide web-based user interface, application programmable interface (API), containers, and/or other channels? Please list resource access mechanism, provide URLs as applicable.

Identifiers.org provides a web-based user interface for documentation, outreach, curation, and testing, but its main use is through the web services provided by the Registry and Resolver, documented at
https://identifiers.org/documentation
https://identifiers.org/service

The registry content is also consumed and redistributed by collaborating, related resources like n2t.net at the California Digital Library and Prefix Commons.

c. Maintenance quality: Is there a maintenance SOP or plan, reflecting sustainability and scalability? Does it align with guidelines for sustainable software development? Please include a resource commitment statement (description text or URL).

The Identifiers.org code base and documentation are maintained at https://github.com/identifiers-org/.

Internal infrastructure, database and deployment procedures are documented on the EMBL-EBI Confluence website.

While the current Identifiers.org infrastructure, based on two redundant servers, still has plenty of reserve capacity, a highly scalable, cloud-based deployment is currently under development in the context of the US NIH Cloud Pilot project.

d. Support quality: Please list support mechanisms (e.g., point of contact, request ticketing, resource’s response time where a solution is identified, etc.), and methods to collect user feedback. If available, list tutorial documentations or tutorial materials and format, including linking on the ELIXIR’s Training Portal (TeSS) (or other training platforms) where applicable.

Prefixes are requested via https://identifiers.org/request/prefix

Comments and requests can be made via
https://www.ebi.ac.uk/support/identifiers.org
https://github.com/identifiers-org/identifiers-org.github.io/issues
identifiers-org@ebi.ac.uk

All requests receive a response within 48 hours.

4. Legal framework, funding, and governance

a. Legal framework: What are the resource’s license/terms of use? Can the license facilitate Open Science? Please include the url for the license the resource uses.

Identifiers.org uses the CC BY 4.0 licence: https://creativecommons.org/licenses/by/4.0/

The licence is stated on http://identifiers.org/about

b. Privacy/Ethics policy: If applicable, is there a publicly available privacy policy in which use and security around personal data are described (e.g. the EU General Data Protection Regulation (GDPR), ELIXIR Ethics Policy, other relevant ELIXIR Policies)? Please include the url of the privacy/ethics policy, if applicable.

Identifiers.org is compliant with current (7/2018) GDPR regulations and shares the EMBL-EBI privacy notice at https://www.ebi.ac.uk/data-protection/privacy-notice/embl-ebi-public-we….

c. Funding & sustainability plan: List of funding sources supporting the resource, and sustainability plan.

Identifiers.org has been developed and maintained with funding from the ELIXIR Preparatory Phase, UK BBSRC, and EMBL core funding. It is currently supported by EU FREYA (2018-2020), the US NIH Data Commons Pilot Phase Consortium (DCPPC) (2018, extension to 2022 pending), and EMBL core funding.

High performance infrastructure is provided by EMBL-EBI with support from the UK BBSRC, and is expected to be stably provided long term, in line with other EMBL-EBI services. The salary for the responsible team leader is provided by EMBL-EBI. The salary for operation and basic management of curation requests (estimated 0.2 FTE) is provided by EMBL-EBI. Grant funding is required for software development, extension of scope, and adaptation to requirements of the community.

d. Governance: Describe the Resource’s QA/QC plan that guarantees similar quality governance to that of ELIXIR. Please link SAB members, if applicable.

Identifiers.org is embedded into the EMBL-EBI governance structure and, as a comparatively small resource, does not have its own separate SAB.

Changes to the name spaces in the Identifiers.org Registry are determined jointly with the California Digital Library to ensure global uniqueness and full synchronisation between both resolvers.