ELIXIR BioHackathon 2021 by ESR15

TranSYS ESR Sonja Katz joined the 4th annual ELIXIR BioHackathon as a participant and reports on her impressions of the event.

Last week, researchers, developers, and passionate hackers from all over the world gathered in Barcelona for the ELIXIR BioHackathon 2021, the 4th edition of the event. This article gives a brief overview of #BioHackEU21 and of what TranSYS ESRs, like myself, can gain from participating in events like this.

What was the ELIXIR BioHackathon 2021?

The one-week event (8th-12th November 2021), hosted by ELIXIR Europe, was the perfect opportunity to engage with people from all areas of Bioinformatics and collaborate on a joint software project. ELIXIR describes itself as an intergovernmental organisation that aims to bring together life science resources from across Europe, such as databases, software tools, and training materials, to form a single infrastructure. This infrastructure makes it easier for researchers to find and share data and to exchange expertise [1].

The BioHackathon unifies these ideas in a single event: it not only provided the chance to network and exchange ideas, but also kick-started novel collaborations through hands-on programming activities. More than 420 participants from all over the world (not only Europe but also the US, Japan, and Australia) took part in the hybrid event and worked to advance a total of 37 different projects, on topics ranging from standardised workflows, ontology tooling, and metadata validation to training and many more.

MOWL: Machine Learning with Ontologies

My scientific interests aligned very well with one project in particular: the application of machine learning to biomedical ontologies [2]. The project was led by Maxat Kulmanov (https://orcid.org/0000-0003-1710-1820), Postdoctoral Research Fellow in the Bio-Ontology Research Group at the King Abdullah University of Science and Technology in Thuwal, Saudi Arabia, and his colleague Fernando Patricio Zhapa Camacho (https://orcid.org/0000-0002-0710-2259), MS/PhD Student.

Ontologies are a method for organising knowledge by defining a set of concepts, categories, and rules that describe subjects and the relations between them. They are at the heart of almost every biological database, and they are increasingly being used in machine learning models. Although these models are still under active development, promising applications include genotype–phenotype association prediction, protein function prediction, drug–target prediction, protein–protein interaction prediction, and gene–disease association prediction (as reviewed by Kulmanov et al. in [3]). As a variety of conceptually different methods exist for applying machine learning to ontologies, the aim of this BioHackathon project was to develop an easy-to-use library and toolkit with which users can apply these methods to their biological ontologies and associated data. The developed library will be available as an open-source project.

Just how broadly applicable these machine learning models are is reflected in the composition of the MOWL hacking group: besides an experienced software developer from Berlin, it included ESRs from three different ITNs (Disc4All – H2020-MSCA-ITN-2020; TranSYS – H2020-MSCA-ITN-2019; PERICO – H2020-MSCA-ITN-2018). After 25 hours of hacking, our group managed not only to implement 6 different methods, but also to curate 2 biological datasets (protein–protein interaction and gene–disease association) for showcasing the MOWL library and to set up a Wiki for documentation.
Our efforts can be found in the official GitHub repository [4] and will be made available as a preprint through BioHackrXiv [5].

Final words

In my opinion, the ELIXIR BioHackathon represents the perfect opportunity for young researchers to start establishing themselves in the Bioinformatics community. Because the event is not meant to be a competition but rather aims to foster an inclusive atmosphere, participants can truly feel that communication, international collaboration, and learning from each other are among the most enjoyable sides of being a researcher. The passion and motivation for the field conveyed during this week will surely fuel my own ambitions for the foreseeable future!

Figure 1: Group photo with all face-to-face (F2F) participants

Many thanks to @ELIXIREurope for coming up with such a great event, and to the participants for making it one. Special thanks to the MOWL project group – hope to see all of you at the next BioHackathon!

For more information, visit:

[1] https://elixir-europe.org/about-us
[2] https://github.com/elixir-europe/bioHackathon-projects-2021/tree/main/projects/27
[3] Kulmanov, M., Smaili, F. Z., Gao, X., & Hoehndorf, R. (2021). Semantic similarity and machine learning with ontologies. Briefings in Bioinformatics, 22(4), bbaa199.
[4] https://github.com/bio-ontology-research-group/mowl/tree/master/mowl
[5] https://biohackrxiv.org/

Author: Sonja Katz

ESR15: Developing and demonstrating data mining and A.I. tools to better understand patient heterogeneity and assist patient stratification