Feed aggregator

chem-bla-ics : OpenTox Euro 2016: "Data integration with identifiers and ontologies"

Planet Bioclipse - Sun, 2016-11-13 12:45
Results from a project by MSP students. J. Windsor et al. (2016): Volatile Organic Compounds: A Detailed Account of Identity, Origin, Activity and Pathways. Figshare.

A few weeks ago the OpenTox Euro 2016 meeting was held in Rheinfelden at the German/Swiss border (which allowed me a nice stroll across the Rhine into Switzerland and past a nice x-mas countdown clock). The meeting was co-located with eNanoMapper-hosted meetings, where we discussed, among other things, the nanoinformatics roadmaps that outline where research in this area should go.

There were many interesting talks on various data initiatives, adverse outcome pathways (AOPs) and their links to molecular initiating events (MIEs), and ontologies (like the AOP ontology talk by ). In fact, I quite enjoyed the discussion with Chris Grulke about ontologies during the panel discussion. The central question was where the border lies between data and ontological concepts. Some slides are available via Lanyrd.

During the Emerging Methods and Practice session hosted by Ola Spjuth, I presented the work at the BiGCaT department on identifier mapping and the use of ontologies for linking data sets.


Data integration with identifiers and ontologies from Egon Willighagen
The presentation integrates a lot of things I have been working on in the last few years. Please note the second slide, which lists all the people I have worked with on the things presented in these slides.

chem-bla-ics : New paper: "SPLASH, a hashed identifier for mass spectra"

Planet Bioclipse - Fri, 2016-11-11 16:16
I'm excited to have contributed to this important (IMHO) interoperability paper around metabolomics data: "SPLASH, a hashed identifier for mass spectra" (doi:10.1038/nbt.3689, readcube:msZj). A huge thanks to all involved in this great collaborative project! The source code project is fully open source and coordinated by Gert Wohlgemuth, the lead author on this paper. It provides an implementation of the algorithm in various programming languages, and I'm happy that the splash functionality is available in the just-released Bioclipse 2.6.2 (taking advantage of the Java library). An R package by Steffen Neumann is also available.

This new identifier greatly simplifies linking between spectral databases and will in the end contribute to a Linked Data network. Furthermore, journals can start adopting this identifier and list the 'splash' for mass spectra in documents, allowing for simplified dereplication and making it easier to find additional information around spectra.
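
To illustrate the linking and dereplication idea, here is a minimal Python sketch. It does not compute a SPLASH; it only assumes each record already carries one as a string. The record layout, the entry names, and the identifier strings are hypothetical placeholders (not real SPLASH values); only the database names come from this post.

```python
from collections import defaultdict

# Hypothetical records: the identifier strings are made-up placeholders,
# not real SPLASH values, and the entry names are invented for illustration.
records = [
    {"database": "MassBank", "entry": "record-1", "splash": "placeholder-key-A"},
    {"database": "HMDB", "entry": "record-2", "splash": "placeholder-key-A"},
    {"database": "MetaboLights", "entry": "record-3", "splash": "placeholder-key-B"},
]


def group_by_splash(records):
    """Group (database, entry) pairs by their identifier string."""
    groups = defaultdict(list)
    for record in records:
        groups[record["splash"]].append((record["database"], record["entry"]))
    return dict(groups)


def shared_spectra(records):
    """Keep only identifiers that occur in more than one database."""
    return {
        key: entries
        for key, entries in group_by_splash(records).items()
        if len({database for database, _ in entries}) > 1
    }


if __name__ == "__main__":
    for key, entries in shared_spectra(records).items():
        print(key, entries)
```

Because the identifier is a plain string, finding the same spectrum in several resources reduces to an exact string match, with no spectrum-by-spectrum comparison needed.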

There are several databases that have adopted the SPLASH already, such as MassBank, HMDB, MetaboLights, and the OSDB published in JCheminf recently (doi:10.1186/s13321-016-0170-2).


Screenshot snippet of a spectrum in the OSDB.
PS. I personally don't like the idea of ReadCubes (which I may blog about at some point) and how they have been pitched as a "legal" way of sharing papers, but this journal does not have a gold Open Access option, unfortunately.

Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., Schymanski, E. L., Willighagen, E. L., Wilson, M., Wishart, D. S., Arita, M., Dorrestein, P. C., Bandeira, N., Wang, M., Schulze, T., Salek, R. M., Steinbeck, C., Nainala, V. C., Mistrik, R., Nishioka, T., Fiehn, O., Nov. 2016. SPLASH, a hashed identifier for mass spectra. Nature Biotechnology 34 (11), 1099-1101.
http://dx.doi.org/10.1038/nbt.3689