Tutorials & workshops

The [BC]² Basel Computational Biology Conference 2025 will feature a day of tutorials & workshops on Monday 8 September 2025.

Key dates

10 March 2025 - Registrations open
30 May 2025 - Final and detailed schedule due (incl. name of presenters)
8 September 2025 - Tutorials and workshops from 9:00 - 16:15 at Biozentrum Basel (new building)

[BC]² tutorials and workshops provide an informal setting to learn about the latest bioinformatics methods, discuss technical issues, exchange research ideas, and share practical experiences on focused or emerging topics in bioinformatics. They take place on-site only, at the Biozentrum in Basel.

General schedule

Time	Activity
08:15 – 09:00	Registration at Biozentrum
09:00 – 10:30	Tutorials & workshops
10:30 – 10:45	Coffee break
10:45 – 12:15	Tutorials & workshops
12:15 – 13:00	Lunch break
13:00 – 14:30	Tutorials & workshops
14:30 – 14:45	Coffee break
14:45 – 16:15	Tutorials & workshops
17:00	[BC]² Welcome lecture at the Basel congress centre

Please note that tutorials and workshops are not included in the [BC]² registration fee: you need to register for your tutorial or workshop of choice via the online registration system (once registrations are open). Places are limited, so register on time!

Tutorials

Tutorials aim to provide participants with lectures or practical training covering topics relevant to the bioinformatics field. They offer participants an opportunity to learn about new areas of bioinformatics research, to get an introduction to important concepts, or to develop advanced skills in areas they are already familiar with. Each tutorial will be organized as a full-day (9:00 - 16:15) or half-day event (9:00 - 12:15 or 13:00 - 16:15). If you choose a half-day tutorial, you are welcome to register for another half-day tutorial if space is available.

All tutorials take place at the Biozentrum, new building (University of Basel, Spitalstrasse 41, 4056 Basel).

T1 - Analyzing lineage data with BEAST 2: an introduction to TiDeTree (1/2 day - AM)

Overview

artificial intelligence - cellular process - phylogeny - single-cell biology

This tutorial is designed to empower researchers with the skills to analyze genetic lineage tracing data using TiDeTree, a Bayesian phylogenetic framework integrated within the BEAST 2 platform. TiDeTree enables the inference of time-scaled phylogenies and the estimation of population dynamic parameters—such as cell division, death, and differentiation rates—from genetic lineage tracing data with random heritable edits.

Participants will gain hands-on experience in setting up and running TiDeTree analyses, starting with an overview of Bayesian phylogenetic methods and lineage tracing principles. The session will cover essential steps such as preparing input data, configuring TiDeTree editing models, setting up phylodynamic models of cell dynamics and interpreting output files. Practical exercises will allow participants to apply TiDeTree to example datasets, enhancing their understanding of its applications. Additionally, a concluding session will demonstrate how to use the time-scaled phylogenies estimated with TiDeTree for downstream analyses, including integrating additional data sources such as single-cell RNA sequencing data and performing hypothesis testing with phylogenetic comparative models.

By the end of the tutorial, attendees will be equipped to integrate TiDeTree into their research workflows, analyze lineage data effectively, and gain deeper insights into cellular processes such as development and differentiation.

Learning objectives

Understand the basic theory behind using Bayesian inference to estimate time-scaled phylogenies from genetic lineage tracing data.
Gain hands-on experience in preparing input data, configuring TiDeTree models, and running analyses within the BEAST 2 framework.
Learn how to use TiDeTree outputs for integrating additional data sources such as single-cell RNA sequencing data.

Presenters

Sophie Seidel, ETH Zürich, Switzerland
Antoine Zwaans, ETH Zürich, Switzerland

Schedule

Time	Activity
09:00 – 09:45	Introduction and overview of Bayesian phylogenetics: welcome and introduction round; overview of lineage tracing and Bayesian phylogenetics
09:45 – 10:30	Session 1: introduction & running TiDeTree. Overview of the TiDeTree editing model. How to set up a TiDeTree analysis: preparing input data; setting up editing models (e.g., CRISPR-Cas9, recombinase); phylodynamic models for cell dynamics (e.g., birth-death and coalescent models with time-varying parameters); setting priors; setting MCMC options. Exercise: running the analysis
10:30 – 10:45	Coffee break
10:45 – 12:15	Session 2: interpreting outputs and downstream applications. Analyzing the output in Tracer: assessing convergence of the MCMC chain; assessing parameter correlations. Analyzing tree estimates: producing a summary tree using TreeAnnotator; visualizing the posterior distribution of trees using DensiTree; analyzing cell dynamic parameter estimates using skyline plots. Overview of methods to analyze TiDeTree outputs: integrating single-cell RNA sequencing data via discrete or continuous trait evolution models (15 min); hypothesis testing using phylogenetic comparative methods
12:05 – 12:15	Wrap-up and Q&A: recap of key concepts; open Q&A
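
As background for the convergence assessment step in Session 2, one common diagnostic for an MCMC trace is the effective sample size (ESS). The following is a minimal sketch of a simple autocorrelation-based ESS estimate. It is not Tracer's own algorithm, and the AR(1) chains are toy stand-ins for real BEAST 2 log files (an assumption made purely for illustration).

```python
import random


def autocorrelation(trace, lag):
    """Sample autocorrelation of a trace at the given lag."""
    n = len(trace)
    mean = sum(trace) / n
    var = sum((x - mean) ** 2 for x in trace)
    if var == 0:
        return 0.0
    cov = sum((trace[i] - mean) * (trace[i + lag] - mean) for i in range(n - lag))
    return cov / var


def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    a simple and widely used heuristic.
    """
    n = len(trace)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(trace, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed):
    """Toy stand-in for an MCMC trace: AR(1) process with correlation phi."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        chain.append(x)
    return chain


well_mixed = ar1_chain(2000, phi=0.0, seed=1)  # nearly independent samples
sticky = ar1_chain(2000, phi=0.95, seed=1)     # strongly autocorrelated samples
print(round(effective_sample_size(well_mixed)), round(effective_sample_size(sticky)))
```

A strongly autocorrelated chain yields a much lower ESS than a well-mixed chain of the same length, which is the kind of signal used to judge whether a run needs to be longer.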

Audience

Maximum number of participants: 15

The target audience consists of scientists wanting to:

analyze single-cell genetic lineage tracing data
get estimates of parameter uncertainty from their analysis
include a priori information from their experiment or model system
understand more about how to use cell phylogenies in their downstream analyses

Requirements

A computer with BEAST 2 installed and the associated GitHub tutorial downloaded.

Organizer

Sophie Seidel, ETH Zürich, Switzerland

T2 - Bioinformatic analyses and visualization of genetic and functional data in the context of proteins using Genomics 2 Proteins portal resources (1/2 day - PM)

Overview

data visualisation - functional genomics - genetic variation - structural biology

Have you ever tried to investigate your target gene by mapping mutations, together with functional or genomic annotations (domains, scores, etc.), onto a protein 3D structure, and found that it is not so easy? This interactive tutorial will enable you to unlock the potential of the Genomics 2 Proteins (G2P) portal, an integrated and interactive platform to connect genetic screening output to protein sequences and structures (Nature Methods 21, 1947–1957 (2024); DOI: 10.1038/s41592-024-02409-0).

In the era of big data and AI, researchers have unprecedented access to computationally predicted as well as experimentally solved protein structures, genetic variations, clinical and functional annotations of mutations, and rich, curated annotations of proteins. In this tutorial, participants will learn how to connect genetic variants, synthetic mutations, and their functional annotations with protein sequences and structures, efficiently and accurately, through computational methods and visualization techniques. We will dive into hands-on case studies, from mapping disease-associated variants and/or mutagenesis readouts to customizing your protein structure analysis. Whether you are a beginner or an advanced user, this tutorial will equip you with the tools to seamlessly bridge genomics and structural biology.

For computationally advanced users, the tutorial will provide training on using API-based workflows and coding techniques to customize analyses and automate processes. Practical sessions will feature real-world case studies, including mapping pathogenic mutations onto protein structures, integrating functional genomics data (base-editing), and exploring druggable pockets in therapeutic targets. In summary, the tutorial will provide hands-on experience in cutting-edge bioinformatic workflows, enabling attendees to address research questions spanning genetics, structural biology, and functional genomics.

Learning objectives

Understand the data types and principles of linking genetic, functional, and structural data using bioinformatics tools.
Our tutorial will cover major protein-level data (UniProt data, protein-protein interactions, druggable pockets, and post-translational modification data), structural data (Protein Data Bank and AlphaFold), the largest genetic variant datasets (gnomAD, ClinVar, and the Human Gene Mutation Database), and functional data from MaveDB.
Learn to use the Genomics 2 Proteins (G2P) portal for mapping and interpreting variants (genetic or synthetic), scores (quantitative data), and feature annotations (discrete data) onto protein sequences and structures.
Multi-modal data analyses and visualization in the context of protein sequences and structures will be demonstrated, for example, mapping conservation scores, variant effect prediction scores, or mutagenesis readouts (quantitative score data) onto proteins simultaneously with the mutations, to generate hypotheses for interpreting and translating variant data.
Develop skills and gain experience in visualizing and analyzing genomic annotations on protein structures using case studies.
Clinical genetic variant analysts will learn how to analyze new, de novo variants in the context of known variants along with protein features to answer questions such as “Is the de novo variant located in an annotated functional domain of the target protein that also lacks common population variants?”
Structural biologists will be able to identify important, potentially druggable protein regions using genetic data, for example, “Is a cancer mutation located at the small-molecule binding pocket and part of a cluster of known drug-resistance mutations in patients?” or “Is a residue located on the binding interface of an oligomer complex, which is also a phosphorylation site, associated with a high binding free energy change upon mutation, indicating that the mutation perturbs binding?”
Molecular biologists will learn how to map and visualize mutagenesis readouts on a target protein’s sequence and structure to answer questions like “Are two or more topological domains of a protein enriched with mutations associated with distinct phenotypes, indicating multiple disease mechanisms?”
Get trained on advanced capabilities of the G2P portal, including software, API usage, and customization for specific datasets
Bioinformaticians and computational scientists will learn how to generate data and features that can be directly fed to machine learning model training and testing
Going beyond data available in existing databases and gaining experience in analyzing your own data to generate your research hypotheses
Learn how to upload in-house, not-yet-published protein structures or predicted structures from your favorite AI model and analyze mutational and functional data (often useful for therapeutic scientists)
Learn how to upload your own mutational and functional data generated using gene-editing and/or multiplexed approaches and analyze the readouts on protein structures (often a demanding skill for molecular and chemical biologists)
Learn how to download your analysis outputs in interoperable formats to load in PyMOL protein structure viewer for downstream research
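
To make the variant-to-domain question from the objectives above concrete, here is a minimal sketch of the underlying logic: parse a protein-level missense variant, look up which annotated domain it falls in, and check whether that domain also contains population variants. This is not the G2P portal's API. The domain names, coordinates, and variants are hypothetical placeholders chosen for illustration.

```python
import re

# Three-letter amino-acid codes, used to validate the parsed variant.
AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_missense(variant):
    """Return (reference, position, alternate) for a variant like 'p.Arg132His'."""
    match = MISSENSE.match(variant)
    if not match:
        raise ValueError(f"not a simple missense variant: {variant!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if ref not in AMINO_ACIDS or alt not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid in {variant!r}")
    return ref, pos, alt


def domains_at(position, domains):
    """Names of all domains whose inclusive [start, end] range covers position."""
    return [name for name, (start, end) in domains.items() if start <= position <= end]


# Hypothetical domain boundaries and variants, for illustration only.
example_domains = {"domain_A": (10, 120), "domain_B": (200, 310)}
de_novo = "p.Arg45His"
population_variants = ["p.Gly150Ser", "p.Ala205Thr"]

_, pos, _ = parse_missense(de_novo)
hit = domains_at(pos, example_domains)

# Domains that already contain at least one population variant.
crowded = {
    name
    for v in population_variants
    for name in domains_at(parse_missense(v)[1], example_domains)
}
print(hit, [d for d in hit if d not in crowded])
```

The second list printed contains the domains that hold the de novo variant but no population variant, which is the pattern the example question asks about.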

Presenters

Sumaiya Iqbal, Broad Institute of MIT and Harvard
Jordan Safer, Broad Institute of MIT and Harvard

Schedule

Time	Activity
13:00 – 13:30	Introduction: background and rationale; bioinformatic framework of the G2P portal; resources integrated into the G2P portal; the two main modules (gene/protein lookup and interactive mapping)
13:30 – 14:30	Tutorial session I: G2P portal as a human proteome-wide resource for connecting variants to protein sequence and structure. Introduction to the Gene/Protein Lookup module of the portal; connecting gene, transcript, and protein identifiers; visualization of protein feature annotations and genetic variants from databases in the protein sequence and structure viewer; data export for downstream analysis. Case study: mapping pathogenic and population variants of MORC2 onto protein structures
14:30 – 14:45	Coffee break
14:45 – 15:30	Tutorial session II: analyze your annotation data on a publicly available protein sequence and structure. Introduction to the Interactive Mapping module, starting with a gene/protein identifier; how to harmonize user-uploaded annotations on protein sequences with selected structures; visualization of different data types (mutations, scores, features) in the protein sequence and structure viewer. Case study: mapping functional data from MaveDB and druggable pocket annotations onto PDB and AlphaFold structures
15:30 – 16:15	Tutorial session III: analyze your own protein structures with annotations. Introduction to the Interactive Mapping module, starting with your own protein structures; how to leverage in-house protein structures and visualize them with user-customized annotations on the sequences; applicability beyond the human proteome. Case study: mapping DNMT3A base-editing screening output to ESMFold-predicted structures

Audience

Maximum number of participants: 30

The tutorial will be open to participants at all levels, including undergraduates, graduate students, and both academic and industrial professionals, particularly those working in genetics, genomics, or structural biology and functional/clinical genomics who are eager to expand their knowledge across multi-omics fields or leverage large datasets. By focusing on accessible and practical bioinformatic applications, we aim to equip participants with the skills to integrate genetic, functional, and protein structural data more effectively into their research. By offering hands-on training and expert guidance, this tutorial will meet a critical need for bioinformatics education, helping researchers from diverse fields to harness the power of next-generation computational biology tools to accelerate their research.

Some example audiences and applications:

Bioinformaticians and computational scientists will learn how to generate data and features that can be directly fed to machine learning model training and testing.
Clinical genetic variant analysts will learn how to analyze new, de novo variants in the context of known variants along with protein features to answer questions such as “Is the de novo variant located in an annotated functional domain of the target protein that also lacks common population variants?”
Structural biologists will be able to identify important, potentially druggable protein regions using genetic data, for example, “Is a cancer mutation located at the small-molecule binding pocket and part of a cluster of known drug-resistance mutations in patients?” or “Is a residue located on the binding interface of an oligomer complex, which is also a phosphorylation site, associated with a high binding free energy change upon mutation, indicating that the mutation perturbs binding?”
Molecular biologists will learn how to map and visualize mutagenesis readouts on a target protein’s sequence and structure to answer questions like “Are two or more topological domains of a protein enriched with mutations associated with distinct phenotypes, indicating multiple disease mechanisms?”

Example audience levels:

Beginner: No prior bioinformatics experience is required for Sessions I–III (see agenda).
Intermediate/advanced: Prior experience with Python/bash scripting is recommended for Session IV (see agenda).
Preferred background: Familiarity with resources such as UniProt, PDB, AlphaFoldDB, and tools like PyMOL is advantageous but not mandatory.

Requirements

Preferred software: Python, Jupyter Notebook, PyMOL

Organizer

Sumaiya Iqbal, Broad Institute of MIT and Harvard, USA

T3 - Biomedical knowledge graphs meet language models (1/2 day - PM)

Overview

artificial intelligence - ontology - open data - text mining - machine learning

This tutorial focuses on the complementary nature of biomedical knowledge graphs (KGs) and large language models (LLMs), offering a comprehensive exploration of their individual strengths and synergistic potential. Participants will gain a deep understanding of how KGs, with their structured representation of biological knowledge, and LLMs, with their powerful natural language processing capabilities, can address key challenges in the life sciences, supporting evidence-based reasoning and discovery.

The session begins with an introduction to the core concepts, methodologies, and motivations behind the integration of KGs and LLMs. It highlights the natural alignment of KGs with the structure of biological information and the complementary strengths of symbolic and neural approaches in managing and leveraging biomedical data.

The session will then delve into how LLMs can support the lifecycle of biomedical data analysis, interpretation and discovery, motivated by concrete use cases in the areas of antibiotic discovery and oncology (but extensible to other areas). We will describe the key methodological and architectural paradigms at the LLM-KG interface for supporting complex biomedical reasoning, covering knowledge representation, information extraction, querying and inference, all situated within concrete value-delivering use cases. The tutorial will also explore the construction of generative AI agents and their underlying infrastructure (e.g., retrieval augmented generation).

By the end of the tutorial, participants will understand how the coordination between KGs and LLMs can deliver AI-based inference over real biomedical problems. Participants will be equipped with practical methods, showcased with hands-on demonstrations, for leveraging their combined strengths in biomedical applications.
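
As a rough illustration of the KG-plus-LLM pattern described above, the sketch below stores a knowledge graph as subject-predicate-object triples, retrieves the facts reachable from an entity, and formats them as grounding context for a question (the retrieval step of retrieval-augmented generation). It is not the presenters' implementation. The entities and relations are hypothetical, and the call to an actual language model is deliberately left out.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not taken from any real KG.
TRIPLES = [
    ("compound_X", "inhibits", "enzyme_Y"),
    ("enzyme_Y", "is_required_by", "bacterium_Z"),
    ("compound_W", "inhibits", "enzyme_V"),
]


def build_index(triples):
    """Index triples by subject so that outgoing edges can be looked up quickly."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def retrieve_facts(index, entity, max_hops=2):
    """Collect facts reachable from entity within max_hops, breadth-first."""
    facts, frontier, seen = [], [entity], {entity}
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for predicate, obj in index.get(node, []):
                facts.append((node, predicate, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


def build_prompt(question, facts):
    """Format retrieved facts as grounding context ahead of the question."""
    lines = [f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


index = build_index(TRIPLES)
facts = retrieve_facts(index, "compound_X")
prompt = build_prompt("Could compound_X affect bacterium_Z?", facts)
print(prompt)
```

Only facts connected to the queried entity reach the prompt, so the unrelated compound_W triple is excluded. Restricting the model's context to retrieved, structured facts is one way such pipelines try to keep answers tied to evidence.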

Learning objectives

Understand the core concepts of knowledge graphs and language models, the synergy between them, and how these computational paradigms can deliver analytical and discovery value in the biomedical area.
Gain practical knowledge: from the construction of knowledge graphs to complex inferences for knowledge discovery assisted by LLM agents on real-world biomedical use cases.

Presenters

Maxime Delmas, Idiap Research Institute, Switzerland
Andre Freitas, Idiap Research Institute, Switzerland; University of Manchester, UK

Schedule

Time	Activity
13:00 – 13:30	Introduction and motivation: overview of biomedical discovery needs and the role of knowledge graphs (KGs) and large language models (LLMs)
13:30 – 14:30	Building and using biomedical knowledge graphs: exploring how LLMs can support various stages of managing the knowledge graph lifecycle in biomedical contexts
14:30 – 14:45	Coffee break
14:45 – 16:05	AI-based inference and discovery: using KGs and LLMs for reasoning, integrating textual evidence, mitigating hallucinations, and building AI-driven pipelines
16:05 – 16:15	Synthesis & conclusion: recap of the key concepts and practical insights covered during the session, accompanied by actionable take-home messages.

Audience

Maximum number of participants: 40

This tutorial is aimed at any computational biologist interested in the applications of knowledge graphs and language models to biological and clinical problems. No prior experience with knowledge graphs or language models is expected, and the tutorial will serve as an efficient onboarding in this exciting area. Participants should have a general background in machine learning, deep learning, and Python.

Requirements

The tutorial is accessible to a general biomedical audience wanting to understand the principles and capabilities of contemporary LLM, KG, and agent infrastructures. Some parts of the tutorial will demonstrate code-based examples (Python), but a software development background is neither assumed nor required.

Organizer

Maxime Delmas, Idiap Research Institute, Switzerland

T4 - Build a microbial genomic sequence database with Loculus (full day)

Overview

API - data management - infectious disease - services and resources

This tutorial provides an introduction to Loculus, a software package designed to power microbial genomic databases. It is used in projects like Pathoplexus and GenSpectrum. The course is aimed at bioinformaticians who want to streamline data storage, preprocessing, analysis, and sharing, and who are interested in setting up and managing databases for both global platforms and lab-specific use. The session will combine theoretical concepts with hands-on application to help participants understand and implement Loculus effectively.

The tutorial will begin with an overview of Loculus' features, purpose, and architecture, with a focus on its modular design and customizability. Participants will learn how data preprocessing works and how to develop custom pipelines for their needs. In the second part, participants will brainstorm database ideas in small groups and configure prototype Loculus instances on provided servers, gaining practical experience in setup, configuration, and customization.

By the end of the day, participants will have an in-depth understanding of the Loculus system and will have created prototype instances for their own data.
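
To give a feel for what a preprocessing step in a sequence database can involve, here is a generic sketch that validates submitted metadata against a schema and normalises the sequence. It does not use Loculus's actual interfaces or configuration format. The schema, field names, and record layout are all hypothetical.

```python
# Hypothetical schema: field name -> (required, type). Not Loculus's real format.
SCHEMA = {
    "collection_date": (True, str),
    "country": (True, str),
    "host": (False, str),
}

VALID_BASES = set("ACGTN-")


def preprocess(record):
    """Validate one submission and return (processed_record, errors)."""
    errors = []
    metadata = {}
    submitted = record.get("metadata", {})
    for field, (required, expected_type) in SCHEMA.items():
        value = submitted.get(field)
        if value is None:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(value, expected_type):
            errors.append(f"wrong type for field: {field}")
            continue
        metadata[field] = value.strip() if isinstance(value, str) else value

    # Normalise the sequence: drop whitespace and use upper case.
    sequence = "".join(record.get("sequence", "").split()).upper()
    invalid = sorted(set(sequence) - VALID_BASES)
    if not sequence:
        errors.append("empty sequence")
    elif invalid:
        errors.append(f"invalid characters in sequence: {''.join(invalid)}")

    return {"metadata": metadata, "sequence": sequence}, errors


good = {
    "metadata": {"collection_date": "2025-01-15", "country": " Switzerland "},
    "sequence": "acgt\nn",
}
bad = {"metadata": {"collection_date": "2025-01-15"}, "sequence": "ACGX"}

print(preprocess(good))
print(preprocess(bad))
```

Returning the errors alongside the processed record, rather than raising on the first problem, lets a submitter see every issue in one pass.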

Learning objectives

In this tutorial, participants will learn:

How to use Loculus via website and API
The use cases and scope of Loculus, and how to design an instance for your specific case
The modular architecture of Loculus, how data is preprocessed, and how to implement your own preprocessing pipeline
How to deploy a Loculus instance on your own server

Presenters

Chaoran Chen, ETH Zurich, SIB, Switzerland
Anna Parker, ETH Zurich, SIB, Switzerland

Schedule

Time	Activity
09:00 – 09:30	Introduction: overview of Loculus, features, scope, and motivations
09:30 – 09:50	System overview: architecture of Loculus, components and customizability
09:50 – 10:30	Data preprocessing: how data is preprocessed in a Loculus instance and developing preprocessing pipelines
10:30 – 10:45	Coffee break
10:45 – 11:30	Data models and configurations: defining schemas and data modelling approaches
11:30 – 12:15	Brainstorming: group discussion on new database ideas (public and private databases)
12:15 – 13:00	Lunch break
13:00 – 14:30	Group work (part 1/2): participants form groups to design and configure a Loculus instance
14:30 – 14:45	Coffee break
14:45 – 15:30	Group work (part 2/2)
15:30 – 16:15	Group presentations: groups present their results and final discussions

Audience

Maximum number of participants: 25

This tutorial is aimed at bioinformaticians interested in setting up and managing microbial genomic sequence databases, whether for global platforms or lab-specific use.

Requirements

Laptops are needed for participants to configure and set up Loculus, but multiple participants can share a laptop. The only requirement for the laptop is an SSH client to connect to a remote server that will be provided.

Organizer

Chaoran Chen, ETH Zurich, SIB, Switzerland

T5 - From concept to community: building and maintaining scientific resources (1/2 day - AM)

Overview

community - services and resources - sustainability - usability - scalability - FAIR principles

This session will provide participants with the foundational knowledge and strategies needed to conceptualize, build, and maintain impactful scientific resources. Key topics will include identifying target audiences, defining essential features, and creating minimal viable products (MVPs) that prioritise usability and scalability. Participants will learn best practices for designing user-friendly interfaces, fostering community engagement, and building robust APIs that enhance interoperability with other tools.

The session will also explore critical aspects of long-term sustainability, resource management, and security measures to ensure data privacy and compliance. Emphasis will be placed on the importance of data accessibility, transparency, provenance, and reproducibility, with guidance on versioning and FAIR principles. Participants will gain insights into collecting and integrating user feedback, as well as monitoring usage through analytics to continually improve their tools.

The workshop will address common challenges, such as balancing feature sets with project scope, scaling resources, and managing documentation. Interactive discussions and activities will encourage participants to apply these principles to their own projects. By the end of the session, attendees will have a roadmap for building scientific resources that are not only functional but also sustainable, secure, and user-centric.

Learning objectives

Understand the key principles involved in designing, building, and maintaining scientific resources.
Prioritize feature sets and data transparency to enhance usability.
Gain insights into monitoring resource usage, identifying performance issues, fixing bugs, and improving overall functionality.
Learn to develop secure, interoperable APIs and apply FAIR principles for effective data management.
Explore best practices for engaging users and fostering a supportive community around scientific resources.

Presenters

Qingyao Huang, SIB Swiss Institute of Bioinformatics / University of Zurich, Switzerland
Damian Szklarczyk, SIB Swiss Institute of Bioinformatics / University of Zurich, Switzerland

Schedule

Time	Activity
09:00 – 09:30	Introduction to scientific resources
09:30 – 10:30
10:30 – 10:45	Coffee break
10:45 - 11:15
11:15 – 12:00
12:00 – 12:15	Q&A and closing remarks

Audience and requirements

Maximum number of participants: 25

This tutorial is aimed at researchers and developers who are interested in designing, building, and maintaining scientific resources. It is particularly relevant for those who manage data platforms, tools, or services and want to enhance usability, interoperability, and sustainability.

Participants should bring a laptop to take part in hands-on activities. No specific software installation is required, but familiarity with APIs, data management, or software development will be beneficial.

Organizer

Damian Szklarczyk, SIB Swiss Institute of Bioinformatics / University of Zurich, Switzerland

T6 - How to make best use of Cellosaurus (1/2 day - PM)

Overview

cellosaurus - ontology - API - services and resources - CLASTR - data integration

This tutorial will provide an in-depth introduction to Cellosaurus, a comprehensive knowledge resource on cell lines. Participants will explore its content, understand the ontologies it relies on, and learn how to use its API, triple store, and the CLASTR tool. The session will offer practical guidance for both bioinformaticians and life scientists, demonstrating how to retrieve and integrate Cellosaurus data efficiently.

Presenters

Amos Bairoch, Université de Genève and SIB Swiss Institute of Bioinformatics, Switzerland
Pierre-André Michel, SIB Swiss Institute of Bioinformatics, Switzerland

Learning objectives

Bioinformaticians: Learn how to programmatically access and integrate Cellosaurus data into workflows and other resources.
Life scientists: Understand what information Cellosaurus provides and how to retrieve relevant data effectively.

Schedule

Time	Activity
13:00 – 13:15	Introduction – Overview of Cellosaurus and its history
13:15 – 14:15	Exploring Cellosaurus fields, with a focus on ontologies and their strengths and limitations
14:15 – 14:30	CLASTR – Using CLASTR in batch mode and via its API (1/2)
14:30 – 14:45	Coffee break
14:45 – 15:00	CLASTR – Using CLASTR in batch mode and via its API (2/2)
15:00 – 15:45	Cellosaurus API: methods, output format and advanced features (e.g., booleans, sorting)
15:45 – 16:10	Triple store/RDF: examples of complex queries and federated queries
16:10 – 16:15	Conclusion
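
As background for the CLASTR sessions, the sketch below shows how two short tandem repeat (STR) profiles can be compared with a Tanabe-style similarity score: twice the number of shared alleles divided by the total number of alleles in both profiles. This is an illustration under assumptions. The programme does not state which scoring CLASTR applies or how it treats special or missing markers, and the profiles here are hypothetical, not real cell line data.

```python
def tanabe_score(query, reference):
    """Tanabe-style similarity (%) between two STR profiles.

    Each profile maps a marker name to a set of allele strings.
    Only markers present in both profiles are compared.
    """
    shared = query_total = reference_total = 0
    for marker in query.keys() & reference.keys():
        q_alleles, r_alleles = set(query[marker]), set(reference[marker])
        shared += len(q_alleles & r_alleles)
        query_total += len(q_alleles)
        reference_total += len(r_alleles)
    if query_total + reference_total == 0:
        return 0.0
    return 100.0 * 2 * shared / (query_total + reference_total)


# Hypothetical profiles for illustration; not real cell line data.
query_profile = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "12"}}
reference_profile = {"D5S818": {"11", "12"}, "TH01": {"7", "9.3"}, "TPOX": {"8"}}

print(tanabe_score(query_profile, reference_profile))  # 4 shared of 5 + 5 alleles -> 80.0
```

A score of 100 means the compared markers carry identical alleles. The threshold above which two profiles are considered a match is a separate choice that is not set here.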

Target audience and requirements

Maximum number of participants: 20

The tutorial is aimed at bioinformaticians and life scientists who want to make full use of what Cellosaurus has to offer.

Organizer

Amos Bairoch, Université de Genève and SIB Swiss Institute of Bioinformatics, Switzerland

T7 - How to... graphical abstracts (1/2 day - AM)

Overview

outreach - training - workflows

Explanatory visualisations, such as graphical abstracts, are powerful tools for summarizing complex life science concepts. This tutorial introduces key design principles and hands-on techniques for creating clear and engaging graphical abstracts. Participants will learn to identify key messages, select visual elements, and use open-access resources, including BioIcons. The session combines lectures, group discussions, and practical exercises to enhance participants’ ability to communicate research visually.

Presenters

Helena Jambor, FHGR, Chur, Switzerland
Simon Dürr, EPFL, Lausanne, Switzerland

Learning objectives

Participants will:

Understand the principles of effective graphical abstracts.
Learn to identify and integrate key visual elements, such as icons, images, and charts.
Gain hands-on experience with open-access tools for explanatory visualisation.
Apply accessibility best practices for inclusivity.
Engage in peer discussions to refine graphical abstract designs.

Schedule

Time	Activity
09:00 – 09:05	Welcome and ice-breaker: Group exercise.
09:05 – 09:20	Introduction: The role of graphical abstracts in scientific communication
09:20 – 10:00	Design principles: key message, icons, layout, and colours (lecture and group exercise)
10:00 – 10:20	Tools and resources: overview, including BioIcons introduction (lecture)
10:20 – 10:30	Hands-on exercise: defining a key message
10:30 – 10:45	Coffee break
10:45 – 11:35	Hands-on session: prototyping a graphical abstract (individual exercise)
11:35 – 12:00	Peer feedback: group exercise and discussion
12:00 – 12:15	Wrap-up and Q&A

Target audience and requirements

Maximum number of participants: 40

This tutorial is aimed at:

PhD students and postdocs creating visual summaries for manuscripts.
Researchers incorporating graphics in grant applications.
Core facility staff enhancing training with information design principles.

Organizers

Helena Jambor, FHGR, Chur, Switzerland
Simon Dürr, EPFL, Lausanne, Switzerland

T8 - LLMs in genomics: decoding genomic sequences with advanced representations (1/2 day - AM)

Overview

large language models - genomics - machine learning - artificial intelligence - genes and genomes

Machine learning (ML) has been applied to various omics challenges, such as sequence classification in genomics. The effectiveness of ML methods depends greatly on the selection of the data representation, or features, that extract meaningful information from sequences. Recent advances in artificial intelligence have introduced powerful large language models (LLMs) that are contributing to genomic sequence analysis. These models offer an approach to sequence classification by automatically learning complex patterns from raw DNA and RNA sequences, reducing reliance on manually crafted features. In genomics, LLMs have demonstrated the ability to extract both local and global patterns from genomic sequences, enabling the development of highly effective numerical representations. Approaches such as DNABERT, Nucleotide Transformer, and ESM have shown great promise in capturing the complexities and hidden patterns of biological sequences, leveraging their pre-trained knowledge to address genomic challenges with minimal task-specific fine-tuning.

This tutorial focuses on the application of LLMs to genomic sequence prediction tasks. Participants will gain an understanding of how these models work, their training paradigms, and their advantages in genomic research. Through hands-on exercises, attendees will explore the practical use of LLMs for tokenisation, sequence embedding, and fine-tuning for downstream tasks. The learning modules will be based on two LLMs, DNABERT2 and Nucleotide Transformer, serving as examples of how to use different models and compare their results. The goal is to prepare participants with the skills and knowledge to utilize LLMs for genomic sequence analysis, advancing the field of genomics with state-of-the-art AI techniques.
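
To illustrate the tokenisation and sequence embedding steps mentioned above, here is a minimal sketch that splits a DNA sequence into non-overlapping k-mers and averages per-token vectors into one fixed-length representation. The choice of k = 6 and the count-based `toy_embedding` function are assumptions for illustration only. The tokenisers and learned embeddings shipped with DNABERT2 and Nucleotide Transformer may work differently.

```python
def kmer_tokenize(sequence, k=6):
    """Split a nucleotide sequence into non-overlapping k-mers.

    A trailing fragment shorter than k is split into single-nucleotide tokens.
    """
    sequence = sequence.upper()
    cut = len(sequence) - len(sequence) % k
    tokens = [sequence[i:i + k] for i in range(0, cut, k)]
    tokens.extend(sequence[cut:])
    return tokens


def toy_embedding(token):
    """Hypothetical stand-in for a model: A/C/G/T counts as the token vector."""
    return [token.count(base) for base in "ACGT"]


def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-length sequence embedding."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n for d in range(dim)]


tokens = kmer_tokenize("ACGTACGTACGTAC", k=6)
embedding = mean_pool([toy_embedding(t) for t in tokens])
print(tokens)     # ['ACGTAC', 'GTACGT', 'A', 'C']
print(embedding)  # [1.0, 1.0, 0.75, 0.75]
```

In a real pipeline the per-token vectors would come from a pretrained model's hidden states, and the pooled vector would then serve as input features for a downstream classifier.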

Presenters

Aparajita Karmakar, Rosalind Franklin Institute, UK
Alexandre Paschoal, Rosalind Franklin Institute, UK
Fabiana Rodrigues de Goes, Rosalind Franklin Institute, UK

Learning objectives

Participants will learn to analyze and extract numerical representations from genomic sequences (DNA/RNA) using large language models (LLMs). Following the tutorial, they will be able to:

Understand the evolution of genomic sequence analysis methods, with a focus on the transition from traditional feature engineering to LLMs.
Comprehend the fundamental principles behind LLMs, including their architectures, training paradigms, and the biological insights they can capture from genomic sequences.
Explore state-of-the-art genomic sequence embeddings generated by models such as Nucleotide Transformer and DNABERT2 and understand their advantages over hand-crafted features in capturing complex sequence patterns.
Apply machine learning pipelines based on embeddings generated by LLMs, addressing downstream tasks in genomics, such as transcription factor prediction, promoter detection, and splice site detection.
Gain hands-on experience in fine-tuning LLMs and applying them to genomic downstream tasks, leveraging tools and libraries like Hugging Face Transformers.
Evaluate and compare the performance of different LLMs for genomic sequence classification, assessing their effectiveness and scalability for real-world genomic research challenges.

Schedule

Time	Activity
09:00 – 09:45	Lecture: Introduction to Large Language Models (LLMs) and Genomics. Overview of LLMs and their impact on genomics; key concepts (tokenization, embeddings, and the pretraining-finetuning paradigm); genomic LLMs (DNABERT2 and Nucleotide Transformer)
09:45 – 10:30	Hands-on training: Tokenization and representation learning for genomic sequences. Applying tokenisation to genomic sequences; using DNABERT2 and Nucleotide Transformer to extract embeddings; visualizing sequence representations
10:30 – 10:45	Coffee break
10:45 – 11:15	Lecture: Fine-tuning LLMs for genomic downstream tasks. Adjusting pre-trained LLMs for downstream tasks, such as transcription factor prediction, promoter detection, and splice site detection; overview of the fine-tuning pipeline
11:15 – 12:15	Hands-on training: Fine-tuning and model evaluation. Setting up a fine-tuning pipeline using DNABERT2 and Nucleotide Transformer; training on a small dataset; evaluating and saving fine-tuned models

Target audience and requirements

Maximum number of participants: 20

This tutorial is designed for participants at different academic levels, from undergraduate students to early-career researchers, as well as scientists from both academic and industry backgrounds, who have an interest in genomic sequence analysis and machine learning.

For those who wish to engage in hands-on training, we recommend having a basic understanding of a few prerequisites:

Familiarity with running commands in a terminal.
Foundational knowledge of computer programming (preferably in Python).
A basic understanding of machine learning concepts.
Participants are also required to have access to a computer for the training.

Organizers

Aparajita Karmakar, Rosalind Franklin Institute, UK
Alexandre Paschoal, Rosalind Franklin Institute, UK
Fabiana Rodrigues de Goes, Rosalind Franklin Institute, UK

T9 - Network-medicine-based drug repurposing (full day)

Overview

Beginner level - network medicine - drug repurposing - disease mechanisms - protein-protein interactions - biological networks - computational pharmacology

In recent years, the field of network-medicine-based drug repurposing has produced a wealth of high-quality databases, algorithms, and tools designed for defining disease mechanisms and prioritising drug candidates. However, the sheer abundance of resources can be overwhelming for newcomers to this area of research.

In this tutorial on in silico network-medicine-based drug repurposing, we provide an introduction to the theoretical background and teach the hands-on skills required for getting started in this exciting and emerging research field. The tutorial covers essential concepts, data sources, algorithms, and state-of-the-art tools. Participants will be guided through all the major steps of a typical analysis workflow, including retrieving the required data from public resources, defining disease mechanisms, prioritising drug candidates, and assessing the reliability of the obtained results. These steps will be explored through commonly used tools and real-world case studies in hands-on sessions.

The tutorial concludes with a panel discussion addressing the current limitations, challenges, and future directions of the field.

Presenters

Jan Baumbach, University of Hamburg, Germany
David B. Blumenthal, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Hryhorii Chereda, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Fernando M. Delgado Chaves, University of Hamburg, Germany
Lisa Spindler, Technical University of Munich, Germany
Markus List, Technical University of Munich, Germany
Andreas Maier, University of Hamburg, Germany

Learning objectives

The learning objectives include getting an overview of the concepts, available resources, algorithms, and tools for in silico network-medicine-based drug repurposing as well as obtaining the hands-on skills to conduct a typical analysis workflow. This encompasses:

1. Fetching the necessary data, such as disease-associated genes, protein-protein interactions, and drug targets, from publicly available databases through their web interfaces and APIs

2. Working with popular tools for network-medicine-based disease mechanism mining and drug prioritization through Python packages, visualization tools, and an automated Nextflow pipeline

3. Assessing the reliability of the obtained results
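To give a flavour of the drug prioritisation step, here is a minimal sketch of a network proximity score on a toy protein-protein interaction network. It assumes one commonly used "closest distance" formulation (the mean, over a drug's targets, of the shortest-path distance to the nearest disease gene); it is not the implementation used by the tools taught in the tutorial, and the gene names and function names are hypothetical. Published proximity analyses typically also compare the score against randomised gene sets, which is omitted here.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from a list of interaction pairs."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distances_to_set(graph: Graph, sources: Iterable[str]) -> Dict[str, int]:
    """Multi-source BFS: hop distance from every reachable node to its nearest source."""
    dist = {s: 0 for s in sources if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_proximity(graph: Graph, drug_targets: Iterable[str],
                      disease_genes: Iterable[str]) -> float:
    """Mean distance from each reachable drug target to the nearest disease gene."""
    dist = distances_to_set(graph, disease_genes)
    reachable = [dist[t] for t in drug_targets if t in dist]
    if not reachable:
        return float("inf")  # no target is connected to the disease module
    return sum(reachable) / len(reachable)


if __name__ == "__main__":
    # Toy network with hypothetical gene identifiers.
    ppi = build_graph([("G1", "G2"), ("G2", "G3"), ("G3", "G4"),
                       ("G4", "G5"), ("G2", "G6")])
    disease = {"G1", "G2"}
    drugs = {"drugA": {"G3", "G6"}, "drugB": {"G5"}}
    # Smaller scores mean the drug's targets sit closer to the disease genes.
    for name in sorted(drugs, key=lambda d: closest_proximity(ppi, drugs[d], disease)):
        print(name, closest_proximity(ppi, drugs[name], disease))
```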

Schedule

Time	Activity
09:00 – 10:00	Introduction to network-medicine-based drug repurposing
10:00 – 10:30	Databases for drug repurposing: protein-protein interactions (STRING, BioGRID, IntAct, IID); disease-gene associations (DisGeNet, OMIM, OpenTargets); drug targets (DrugBank); integrated knowledge graph (NeDRex)
10:30 – 10:45	Coffee break
10:45 – 11:15	Databases for drug repurposing (hands-on)
11:15 – 12:15	Network-medicine tools: graphical tools (Cytoscape, Drugst.One); disease module evaluation (DIGEST); LLMs (DrugRepoChatter, Academate, ChatDRex); live demonstration
12:15 – 13:00	Lunch break
13:00 – 14:00	Network-medicine algorithms: disease mechanism mining (ROBUST, DOMINO, DIAMOnD); drug prioritization (TrustRank, Proximity); hands-on session
14:00 – 14:30	Nextflow pipeline for drug repurposing
14:30 – 14:45	Coffee break
14:45 – 15:15	Nextflow pipeline for drug repurposing (hands-on)
15:15 – 16:15	Panel discussion and wrap-up

Target audience and requirements

Maximum number of participants: 50

This tutorial is designed for researchers interested in systems biology, biological networks, network science and its algorithms, computational pharmacology, or drug repurposing. The target audience includes bioinformaticians, biologists, pharmacologists, and mathematicians.

The tutorial is tailored towards beginners, but participants should have basic programming experience (preferably in Python).

Participants are expected to have:

A basic understanding of computational biology concepts.
Familiarity with running commands in a terminal.
Basic programming experience, preferably in Python.
Access to a computer for hands-on exercises.

Organizer

Jan Baumbach, University of Hamburg, Germany

T10 - The Bgee suite: leveraging standardized and comparable transcriptomics data across animals (1/2 day - PM)

Overview

data integration - single-cell RNA-seq - cross-species comparison - data management - gene expression - open data - single-cell biology - transcriptomics

Bgee is a curated database for retrieving and comparing gene expression patterns across multiple animal species, including model species such as human, mouse, or drosophila, as well as livestock, primates, and fishes. It provides an intuitive answer to the question "where is a gene expressed?" and supports research in cancer, agriculture, and evolutionary biology.

Bgee integrates information from single-cell and bulk RNA-seq, microarrays, and in situ hybridization, offering a comprehensive view of gene expression patterns down to the cellular level. Recognised as a Global Core Biodata Resource, Bgee provides access to its data through a web interface, an R package, a SPARQL endpoint, and a JSON API.

In addition to data access, Bgee includes tools such as TopAnat for gene expression enrichment analyses and Expression Comparison for comparing gene expression states across species. This tutorial will introduce Bgee’s curated and pre-analyzed gene expression data, with a focus on single-cell and bulk RNA-seq, and demonstrate how to leverage these data in a variety of research scenarios.

Presenters

Frederic Bastian – SIB Swiss Institute of Bioinformatics, Switzerland
Alessandro Brandulas Cammarata – University of Lausanne, Switzerland
Marc Robinson-Rechavi – University of Lausanne, Switzerland
Julien Wollbrett – SIB Swiss Institute of Bioinformatics, Switzerland

Learning objectives

By the end of this tutorial, participants will:

Learn how to detect active genes from bulk and single-cell RNA-seq.
Understand the gene expression information provided by Bgee.
Appreciate the added value of Bgee’s data curation and analysis.
Learn how to use Bgee’s web tools to support research.

Schedule

Time	Activity
13:00 – 13:10	Welcome and introduction
13:10 – 13:50	Overview of Bgee: introduction to the Bgee database and its data sources; how gene expression data are curated and processed; methods to access and retrieve data
13:50 – 14:30	Detecting active gene expression (1/2): understand how gene expression data are treated in Bgee to detect the signal of active gene expression over biological and technical noise, from single-cell and bulk RNA-seq data. This was already an important question for bulk RNA-seq data, where various arbitrary thresholds were used; it is even more important for single-cell data, which are known to be noisy and to have a high rate of drop-outs.
14:30 – 14:45	Coffee break
14:45 – 15:05	Detecting active gene expression (2/2)
15:05 – 15:45	Data integration: understand how data are standardized and made comparable between experiments and species; how to align gene expression states (qualitative information) and gene expression levels (quantitative information) across multiple data types and experiments; and how to represent gene expression information in a knowledge graph to accommodate different animal anatomy and biology.
15:45 – 16:10	Using Bgee tools for analysis: discover how to programmatically access gene expression data in Bgee, and how to produce knowledge from expression information with Bgee analysis tools. This includes unique tools such as TopAnat for detecting enrichment of gene expression localization, and Expression Comparison for comparing expression data between species.
16:10 – 16:15	Wrap-up and Q&A
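The idea of calling a gene "actively expressed" relative to noise, rather than by an arbitrary abundance cut-off, can be sketched as follows. This is a simplified illustration and not Bgee's actual pipeline: it assumes that abundance values are available for a set of background regions presumed not to be expressed, and it calls a gene present when few background regions reach its abundance. All names (`empirical_p_value`, `call_expression`, the gene identifiers) and the `alpha` cut-off are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


def empirical_p_value(value: float, sorted_background: List[float]) -> float:
    """Fraction of background regions with abundance >= value (with +1 correction)."""
    n = len(sorted_background)
    # bisect_left gives the index of the first background value >= value.
    n_at_least = n - bisect_left(sorted_background, value)
    return (1 + n_at_least) / (1 + n)


def call_expression(gene_abundance: Dict[str, float],
                    background: List[float],
                    alpha: float = 0.05) -> Dict[str, str]:
    """Label each gene 'present' or 'absent' relative to the background distribution."""
    sorted_background = sorted(background)
    return {
        gene: "present" if empirical_p_value(value, sorted_background) <= alpha else "absent"
        for gene, value in gene_abundance.items()
    }


if __name__ == "__main__":
    # Toy abundances (e.g. TPM) for 39 background regions and three genes.
    background = [i / 10 for i in range(39)]
    genes = {"geneA": 25.0, "geneB": 1.05, "geneC": 0.0}
    print(call_expression(genes, background))
```

Unlike a fixed threshold, the decision here adapts to each library's own noise level, which is the motivation described in the schedule above.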

Target audience and requirements

Maximum number of participants: 30

This tutorial is designed for researchers working in transcriptomics, evolutionary biology, computational biology, and bioinformatics who wish to leverage gene expression data for their studies. It is particularly relevant for those interested in comparative genomics, functional genomics, and systems biology.

Participants should have basic familiarity with gene expression analysis and a general understanding of biological data integration.

Organizer

Frederic Bastian – SIB Swiss Institute of Bioinformatics, Switzerland

Workshops

Workshops encourage participants to discuss technical issues, exchange research ideas, and share practical experiences on focused or emerging topics in bioinformatics. Each workshop will be organized as a full-day (9:00 - 16:15) or a half-day event (9:00 - 12:15 or 13:00 - 16:15). If you choose a half-day workshop, you are welcome to register for another half-day event if space is available.

All workshops take place at the Biozentrum -the new building- (University of Basel, Spitalstrasse 41, 4056 Basel).

W1 - An introduction to the new Ensembl Genome Browser and custom annotation of variants with the Ensembl Variant Effect Predictor (VEP) (1/2 day - AM)

Overview

Ensembl Genome Browser - genomic data - gene annotation - comparative genomics - genes and genomes - genetic variation

The Ensembl Genome Browser has been an indispensable tool in the field of genomics, aiding scientists around the world in their research for over 20 years. The Ensembl database provides visualization and comprehensive analyses of integrated genomic data, including genes, variants, comparative genomics and gene regulation for over 4,500 eukaryotic and over 30,000 prokaryotic genomes. This tutorial introduces the new Ensembl platform and the Ensembl VEP, providing hands-on experience.

The first part of the tutorial explores the new Ensembl browser, familiarizing participants with the updated interface and functionalities. Attendees will learn to navigate the new website and use integrated apps to analyze various genomic data, including gene annotations, sequence variations, and comparative genomics resources.

The second part of the tutorial covers the command-line version of Ensembl VEP. Participants will learn to annotate genomic variants and interpret their effects on genes and transcripts. Participants will develop proficiency in executing Ensembl VEP commands and integrating results into research workflows through practical exercises.

This tutorial also guides users in locating help and documentation resources on Ensembl Beta and the Ensembl VEP, ensuring independent problem-solving and optimal use. Importantly, the session encourages user feedback. The Ensembl trainer will engage participants, gathering thoughts and experiences with the new website to inform future improvements. By the end, participants will have a comprehensive understanding of Ensembl Beta and VEP, gaining valuable skills for genomic research.

Learning objectives

Demonstrate the ability to navigate through the apps in the Ensembl Beta website, identifying major features and tools.
Access and interpret three different types of genomic data (e.g., gene annotations, sequence variations, and comparative genomics) using Ensembl Beta apps, as evidenced by completing practical exercises.
Execute basic Ensembl VEP commands to annotate a list of genomic variants using the command-line version.
Locate and utilize relevant help or documentation pages for Ensembl Beta and the Ensembl VEP for independent solution-finding.
Provide at least one constructive piece of feedback on the new Ensembl Beta interface, based on their experience during the tutorial.

Presenters

Jorge Batista da Rocha, EMBL’s European Bioinformatics Institute (EMBL-EBI)
Louisse Mirabueno, EMBL’s European Bioinformatics Institute (EMBL-EBI)
Aleena Mushtaq, EMBL’s European Bioinformatics Institute (EMBL-EBI)

Schedule

Time	Activity
09:00 – 09:30	Introduction to Ensembl
09:30 – 10:30	Ensembl Beta: the new Ensembl genome browser
10:30 – 10:45	Coffee break
10:45 – 11:15	Variation data in Ensembl
11:15 - 12:15	The Ensembl Variant Effect Predictor (VEP)
12:15 – 13:00	Lunch break

Target audience and requirements

Maximum number of participants: 30

The primary target audience is researchers, clinicians, and/or postgraduate students working in the life sciences who are interested in genomics and variant interpretation. This tutorial caters to both new users who are just beginning to explore Ensembl's capabilities and existing users who are already familiar with the database but are looking to expand their knowledge of Ensembl. While the first part of the workshop is accessible to all, the Ensembl VEP section requires a basic understanding of shell/bash scripting. This understanding is essential for participants to engage fully with and benefit from the Ensembl VEP tutorial.

By accommodating various skill levels and providing clear prerequisites, the workshop ensures that participants can make the most of the learning experience, regardless of their prior exposure to Ensembl.

Organizers

Jorge Batista da Rocha, EMBL’s European Bioinformatics Institute (EMBL-EBI)
Louisse Mirabueno, EMBL’s European Bioinformatics Institute (EMBL-EBI)
Aleena Mushtaq, EMBL’s European Bioinformatics Institute (EMBL-EBI)

Comment

Prior to the tutorial, all registrants will receive a pre-course survey. This survey is designed to gather information about participants' backgrounds and research interests, allowing the Ensembl trainer to tailor the course content accordingly. By customizing the tutorial based on this feedback, we aim to provide a more relevant and engaging learning experience for all participants.

For any additional queries or concerns not addressed in the survey, participants are encouraged to reach out directly to the Ensembl training team via email at helpdesk@ensembl.org.

W2 - FAIR conceptual and technical infrastructures facilitating high quality health data access for research (full day)

Overview

health data - FAIR principles - clinical - data management - semantic web format - standards - guidelines

In the last decade, many European countries have started to invest in developing national infrastructures facilitating the access to health data for research. These initiatives focus on several components: conceptual frameworks, technical implementations and legal foundations, guided by the FAIR (Findable, Accessible, Interoperable, Reusable) principles and developed in collaboration with the relevant stakeholders. Examples of such initiatives include the Swiss Personalized Health Network (SPHN) in Switzerland or the Medical Informatics Initiative (MII) in Germany.

Still, health data remain challenging to integrate and share, even within single countries. Diverse systems, standards, regulations, and languages – especially in multilingual contexts such as Switzerland – add layers of complexity. Moreover, the sensitive nature of health data requires robust and secure environments for data management and sharing with stakeholders. Each initiative has adopted different solutions, sometimes with overlapping components, such as data formats (e.g. terminologies and standards), conceptual modeling (Semantic Web, FHIR, OMOP), and secure infrastructures that are tailored to their national legislation and framework conditions. While the implementations vary, these initiatives share the common goal of making good-quality health data accessible for research.

In this workshop, we will present the different approaches, with a central focus on the conceptual decisions and technical solutions implemented. We will highlight the success stories of knowledge graphs within these IT (Information Technology) ecosystems, but also discuss their limitations and shared challenges. Some of the key obstacles include semantic discrepancies in data capture, varying levels of data maturity across data sources, and inconsistent provenance documentation. Addressing these issues requires more than just building technical solutions; it needs a fundamental rethinking and restructuring of data practices at their very core.

We will furthermore emphasize the critical importance of capturing high-quality data at the source rather than retrofitting provenance and quality information later. This should enhance data trustworthiness, reduce loss of information, ease (international) interoperability, and ensure a better alignment with the FAIR principles. For instance, the MIRAPIE (MInimal Requirements for Automated Provenance Information Enrichment) community, involving MII and SPHN members, has established guidelines for recording provenance information for biomedical data to support these efforts.

Presenters

Caroline Bönisch | Stralsund University of Applied Sciences, Germany
Sabine Österle | SIB Swiss Institute of Bioinformatics, Switzerland
Vasundra Touré | SIB Swiss Institute of Bioinformatics, Switzerland
Judith Wodke | Greifswald University Hospital, Germany
Wouter Franke | the Hyve, Netherlands
Núria Queralt Rosinach | Leiden University Medical Center, Netherlands
Alban Gaignard | Unité de Recherche de l'Institut du Thorax, France

Tentative agenda

Time	Activity
09:00 – 09:10	Welcome and introduction
09:10 – 10:30	Session 1 – National strategies for building digital infrastructures for FAIR health data: The Swiss Personalized Health Network in Switzerland (Sabine Österle); Health research data infrastructures in Germany (Caroline Bönisch / Judith Wodke); Navigating data initiatives in the Netherlands (Wouter Franke)
10:30 – 10:45	Coffee break
10:45 – 12:05	Session 2 – Technical implementations: knowledge graphs vs. FHIR vs. other “models”: The SPHN Semantic Interoperability Framework (Vasundra Touré); Knowledge graph-based discovery for translational research (Núria Queralt Rosinach); From FHIR towards Knowledge Graphs (Judith Wodke)
12:15 – 13:00	Lunch break
13:00 - 14:40	Session 3 – Data provenance and the MIRAPIE guidelines: Data provenance for biomedical data: introduction and the MIRAPIE guidelines (Judith Wodke); SPHN use case of MIRAPIE (Vasundra Touré); French use case of MIRAPIE (Alban Gaignard)
14:40 - 14:55	Coffee break
14:55 - 16:25	Session 4 – Open panel discussion (moderator: Vasundra Touré): Making clinical data work for research: challenges, misconceptions and solutions. Panelists: Sabine Österle, Judith Wodke, Wouter Franke
16:25 – 16:35	Discussions and closing remarks

Target audience and requirements

Maximum number of participants: 50

This workshop addresses clinical researchers and data scientists.

Organizer

Sabine Österle, SIB Swiss Institute of Bioinformatics, Switzerland

W3 - Functional homology and molecular similarity: prospects and pitfalls in data-driven protein & drug discovery (full day)

Overview

database curation - drug design - machine learning - protein engineering

Proteins are often compared based on their sequence similarity or identity, a measure of how closely related two protein sequences are. While high sequence identity often suggests similar structure and function, even proteins with low sequence identity can display structural homology. This phenomenon arises because protein structures are more conserved than their sequences, reflecting evolutionary pressures to maintain three-dimensional folds critical for function. However, structural homology alone does not guarantee functional homology, which depends on the topology and physicochemical properties of a protein’s functional sites.

This relationship between sequence, structure, and function creates both opportunities and challenges in data-driven protein and drug discovery. On the opportunity side, functionally homologous proteins with distinct properties can be identified—such as process-stable enzymes with similar binding pockets that retain comparable activities, making them suitable for diverse applications. Conversely, the risks are significant: overlooking structural and functional homologs can result in "contaminated" datasets for machine learning. Similarity-based data leakage in training and test datasets may distort model performance metrics, undermining the robustness and reliability of predictive tools.

Molecular similarity of ligands further complicates this landscape. Ligands with similar chemical structures often exhibit comparable biological activities, but small differences in their properties can lead to divergent interactions, specificities, or off-target effects. Properly interpreting and leveraging molecular similarity is essential for designing robust workflows in data-driven discovery.

Learning objectives

In this workshop, we will explore how functional homology and molecular similarity can be harnessed effectively while minimizing pitfalls.

Through case studies and hands-on tasks, participants will learn to:

Discriminate between structural and functional homology
Browse for structurally and functionally homologous proteins
Evaluate the risk of protein homology & ligand similarity for ML
Identify data leakage between train and test datasets

By the end of the workshop, you will be equipped to critically assess and apply concepts of functional homology and molecular similarity in your research, empowering you to navigate the complexities of data-driven protein and drug discovery with greater confidence.
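As a hedged illustration of what identifying similarity-induced leakage between train and test datasets can look like, the sketch below flags test sequences that are suspiciously similar to a training sequence. It uses Jaccard similarity over k-mer sets as a crude stand-in; real workflows would more likely rely on alignment-based sequence identity, structural or binding-site similarity, and ligand similarity measures. The function names, sequence identifiers, toy sequences and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def kmer_set(sequence: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_leaks(train: Dict[str, str], test: Dict[str, str],
               k: int = 3, threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """List (test_id, train_id, similarity) for pairs at or above the threshold."""
    train_kmers = {tid: kmer_set(seq, k) for tid, seq in train.items()}
    leaks = []
    for test_id, test_seq in test.items():
        test_kmers = kmer_set(test_seq, k)
        for train_id, kmers in train_kmers.items():
            similarity = jaccard(test_kmers, kmers)
            if similarity >= threshold:
                leaks.append((test_id, train_id, similarity))
    return leaks


if __name__ == "__main__":
    # Toy protein fragments; test_1 differs from train_1 by a single residue.
    train = {"train_1": "MKTAYIAKQR", "train_2": "GSHMLEDPVA"}
    test = {"test_1": "MKTAYIAKQK", "test_2": "WWPLCNNFYT"}
    for test_id, train_id, similarity in find_leaks(train, test):
        print(f"{test_id} resembles {train_id} (Jaccard {similarity:.2f})")
```

Test entries flagged this way would either be removed or moved so that similar items end up on the same side of the split, which keeps reported performance closer to what can be expected on genuinely new proteins or ligands.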

Presenters

David Graber | ETH Zürich, Switzerland
Peter Stockinger | ZHAW Zurich University of Applied Sciences, Switzerland

Schedule

Time	Activity
09:00 – 10:30	Introduction to the concept and relevance of structural & functional homology (talk & demo by Peter Stockinger)
10:30 – 10:45	Coffee break
10:45 – 12:15	Similarity-induced data leakage results in overestimated ML model performance (talk by David Graber) & introduction to the practical hands-on session
12:15 - 13:00	Lunch break
13:00 - 14:30	Hands-on session in subgroups
14:30 - 14:45	Coffee break
14:45 - 16:15	Group presentations, wrap-up, outlook

Target audience and requirements

Maximum number of participants: 15

This workshop is designed for bioinformaticians and data scientists aiming to advance their discovery methodologies and improve the generalizability of their models.

Participants should bring personal laptops equipped with a functional Python 3 environment, ideally set up with Anaconda3 or Miniconda3.

The necessary datasets and code for the hands-on session will be made available in advance via a GitHub repository. Participants are expected to contribute their hands-on session results to this repository.

Additionally, there is an optional goal to compile all insights and findings into a whitepaper for publication on bioRxiv, with participants invited to join as co-authors.

Organizers

Rebecca Buller, ZHAW Zurich University of Applied Sciences, Switzerland
David Graber, ETH Zürich, Switzerland
Peter Stockinger, ZHAW Zurich University of Applied Sciences, Switzerland

W4 - Measuring what matters: quantitative comparisons of institutional equality, diversity, and inclusion actions (1/2 day - PM)

Overview

equality - diversity - inclusion - EDI - institutional governance - community - training

The integration of Equality, Diversity, and Inclusion (EDI) policies and best practices into institutional governance structures is increasingly recognised as essential for community cohesion and belonging. However, some actions are more effective and more visible than others, and different organisations apply different approaches to integrating EDI practices.

Taking advantage of being a federated institution that connects members from many different organisations across Switzerland, the SIB Swiss Institute of Bioinformatics EDI Focus Group has carried out a quantitative participatory comparison of equality, diversity, and inclusion actions across its member organisations. The workshop session aims to present a summary of the methodology applied and the key results from mapping the status of SIB-affiliates' activities in the EDI domain. This overview will then provide the basis for an interactive live assessment of the organisations to which the workshop participants are affiliated. The results from the Swiss level and those emerging from the workshop exercises will lead to discussions that foster the cross-pollination of ideas on EDI actions and perceptions amongst participants.

Learning objectives

Instead of a typical bioinformatics workshop where the learning objectives might be focused on understanding new data types, new software, or new databases, here the focus is on learning through sharing experiences across different research organisations. The aim of presenting the methodology is to expose participants to a simple yet effective model that could be replicated and/or adapted in other organisational settings. Through the workshop activities, participants are expected to develop a clearer picture of how they might enhance the current status of EDI integration efforts at their own institution.

Presenter

Robert Waterhouse, SIB Swiss Institute of Bioinformatics, Switzerland

Schedule

Time	Activity
13:00 – 13:30	Presentation of the methodology and results of the SIB assessment
13:30 – 14:30	Interactive live assessment of the organisations represented
14:30 – 14:45	Coffee break
14:45 – 15:30	Collective review of the results from the live assessment
15:30 – 15:55	Roundtable discussions on participants' perspectives and actions
15:55 - 16:15	Conclusion

Target audience and requirements

Maximum number of participants: 30

This workshop is aimed at researchers, including students, who are currently involved in or wish to initiate actions related to EDI at their institutions, regardless of prior experience. It is also relevant to researchers in leadership roles seeking to learn from the comparative analysis of EDI practices across different organisations.

Organiser

Robert Waterhouse, SIB Swiss Institute of Bioinformatics, Switzerland

Evaluation committee

Diana Marek, Training Manager, SIB Training Group | SIB Swiss Institute of Bioinformatics, Switzerland
Charlyne Bürki, PhD Student at the Computational Evolution lab, ETH D-BSSE | EPFL SV alumni, ETH Zurich
Valentyn Bezshapkin, PhD Student in Sunagawa Lab | ETH Zürich

The [BC]² tutorial and workshop session is coordinated by the SIB Training Group. For more information please contact bc2@sib.swiss.

Rules and responsibilities

On-site participation and presentation

All tutorials and workshops will take place onsite in Basel. To foster exchanges and interactivity during the sessions, participants, organizers and presenters must be present in person.
In limited cases, the evaluation committee can approve the virtual participation of a tutorial/workshop presenter who is unable to travel or attend onsite. Please indicate the reason for virtual attendance when submitting.

[BC]² registration fees - travel and accommodation costs

Each tutorial/workshop will receive up to two free conference registrations for organizers or speakers.
[BC]² will not cover travel and accommodation costs of tutorial/workshop organizers or presenters (see below).

[BC]² will be responsible for:

Providing a meeting venue with necessary technical equipment and catering services during coffee breaks
Providing staff to help with the on-site/online organization
Announcing the detailed schedule of the tutorial/workshop on the conference website
Advertising on [BC]² social media (based on material/info received from organizers). A social media banner template will be provided to organizers to ensure consistency with the visual identity of the conference
Updating organizers, on a monthly basis, about the number of registered participants
[BC]² and the SIB Training Group reserve the right to cancel a scheduled tutorial/workshop if fewer than 10 participants have registered one month before the conference

Tutorial / workshop organizers will be responsible for:

Finding financial support for the organization of the tutorial/workshop. Tutorial/workshop organizers are highly encouraged to seek independent funding for travel and accommodation of their speakers / presenters.
Advertising the tutorial/workshop and distributing its call for papers/participation (if applicable). This includes promoting their event to relevant newsgroups and mailing lists, and especially to potential audiences from outside the core [BC]² conference community.
Finalising the programme/detailed schedule (incl. name of speakers) and providing material by the specific conference deadlines (see key dates)
Compiling and distributing material to the participants (if applicable). Note that you will be able to upload material as pdf on the conference website upon request and that no photocopying of handouts will be managed by the conference.
Leading the event at [BC]².

Open and FAIR

Note that organizers are responsible for ensuring that the tutorial and workshop materials are legally used, and that appropriate copyright permissions have been arranged. Lecturers will guarantee that tutorial and workshop materials are as open and FAIR as possible. They agree that their material may be made available in any form by the conference to [BC]² conference participants.