IXXI, Scydolise and BioSyL, three Lyon federations in complex systems, systems biology and machine learning, have decided to join forces to organize a one-day online workshop on AI - Bio on October 1st.
This will be a Zoom-based meeting. Please register here and we will send you the connection instructions just before the start of the meeting.
The following program has been drafted by Franck PICARD (ENS/CNRS/Scydolise/IXXI) and Olivier GANDRILLON (ENS/CNRS/BioSyL/IXXI):
9h00 - 9h45 Jean-Philippe Vert (France)
Google Brain/Mines ParisTech
Machine learning for single-cell omics data
9h45 - 10h30 Stefan Bonn (Germany)
The Medical Center Hamburg-Eppendorf (UKE)
De-noising and deconvolution of omics data using deep learning
Technological advances allow us to scrutinize gene and protein expression from single cells, even in spatial and temporal resolution. These advances come at a cost, as the analysis of such complex data requires novel algorithmic approaches to extract the biological information within. In my talk I will highlight pertinent problems and our recent advances in de-noising, integrating and deconvolving biomedical omics data.
10h30 - 11h00 Break
11h00 – 11h45 Laure Blanc-Féraud (France)
CNRS Sophia Antipolis
Some advances in fluorescence microscopy super-resolution.
The resolution of optical fluorescence microscopes is limited by light diffraction. This physical barrier restricts the resolution of microscopy images to roughly 200-250 nm in the lateral plane and 400 nm along the optical axis. Over the last fifteen years, different super-resolution techniques have been developed, using specific fluorophores and/or specific illumination, together with adapted digital processing, to reconstruct super-resolved images that bypass the diffraction limit. This poses difficult numerical reconstruction problems, for which we have developed parsimonious optimization methods. We will present some acquisition systems and associated reconstruction algorithms, with results on biological data.
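As a rough illustration of the diffraction barrier mentioned in this abstract, the lateral limit is commonly estimated with Abbe's formula, d = wavelength / (2 x numerical aperture). The sketch below is not taken from the talk; the function name and the example wavelength and numerical aperture values are assumptions chosen for illustration.

```python
def abbe_lateral_limit_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Abbe estimate of the smallest resolvable lateral distance, in nanometres."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)


# Hypothetical settings: green emission light and a high-NA objective.
print(abbe_lateral_limit_nm(500.0, 1.25))  # 200.0
```

With these assumed values the estimate falls at the lower end of the 200-250 nm range quoted above; a lower numerical aperture or a longer wavelength gives a coarser limit.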
11h45 – 12h30 Vera Pancaldi (France)
CRCT - Toulouse
Data integration and cellular network approaches to characterise the tumour microenvironment
The complex interrelations of cells in the tumour micro-environment (TME) remain hidden in population bulk datasets. Computational cell type deconvolution approaches can be used to list the types and amounts of cells present in each tumour sample, either based purely on transcriptomics, or on chromatin level information [1]. However, these approaches remain approximate and new single-cell methods are bound to make population-based approaches obsolete.
Single-cell techniques can identify even the rarest cell types, but they remain extremely complex, expensive – hence impractical to date in clinical settings – and generally fail to capture important TME spatial characteristics.
A powerful complementary approach to these sequencing-based approaches is given by imaging. Using specific antibodies in immunofluorescence multiplex imaging, proteins are detected in single cells in tissues, defining cell identity and phenotype in a spatial context.
We have recently been working on a detailed characterization of tumour-infiltrating cells in solid cancers (pancreatic and lung cancer), as well as on ex-vivo and in-vitro simplified models of the tumour microenvironment. We have developed tools to process biological images, identify different cell types through cell surface markers, and construct tissue network models in which each node is a cell, annotated with type and phenotypic state, and edges represent cell contacts [2].
The ultimate goal remains that of identifying specific features of patient samples that allow us to predict response to immunotherapies, which can be very successful at arresting tumour growth but only work in a relatively small subset of patients.
References:
[1] Xie et al. GEM-DeCan: Improved tumor immune microenvironment profiling through novel gene expression and DNA methylation signatures predicts immunotherapy response. bioRxiv, 2021. https://doi.org/10.1101/2021.04.09.439207
[2] Coullomb A. and Pancaldi V. Tysserand - Fast and accurate reconstruction of spatial networks from bioimages. bioRxiv, 2020. https://doi.org/10.1101/2020.11.16.385377
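The tissue network idea in this abstract (one node per cell, annotated with its type, and edges for cell contacts) can be sketched in a few lines. This is a minimal illustration and not the authors' tool: the cell records, the `build_contact_network` helper and the use of a simple distance threshold as a proxy for contact are all assumptions.

```python
import math
from itertools import combinations

# Hypothetical segmented cells: an id, an annotated type, and a centroid (in micrometres).
cells = [
    {"id": 0, "type": "tumour", "xy": (0.0, 0.0)},
    {"id": 1, "type": "tumour", "xy": (8.0, 0.0)},
    {"id": 2, "type": "T cell", "xy": (8.0, 6.0)},
    {"id": 3, "type": "macrophage", "xy": (30.0, 30.0)},
]


def build_contact_network(cells, max_distance):
    """Link every pair of cells whose centroids lie within max_distance of each other."""
    edges = []
    for a, b in combinations(cells, 2):
        if math.dist(a["xy"], b["xy"]) <= max_distance:
            edges.append((a["id"], b["id"]))
    return edges


def count_heterotypic_contacts(cells, edges):
    """Count edges that join two cells of different annotated types."""
    cell_type = {c["id"]: c["type"] for c in cells}
    return sum(1 for i, j in edges if cell_type[i] != cell_type[j])


edges = build_contact_network(cells, max_distance=12.0)
print(edges)
print(count_heterotypic_contacts(cells, edges))
```

A distance threshold is only one way to define neighbours; other constructions (for example triangulation-based ones) would change which contacts are reported.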
12h30 – 14h00 Break
14h00 - 14h45 Mathieu Carrière (France)
Inria Sophia Antipolis
Topology identifies emerging adaptive mutations in SARS-CoV-2
The COVID-19 pandemic has led to a worldwide effort to characterize its evolution through the mapping of mutations in the genome of the coronavirus SARS-CoV-2. Ideally, one would like to quickly identify new mutations that could confer adaptive advantages (e.g. higher infectivity or immune evasion) by leveraging the large number of genomes. One way of identifying adaptive mutations is by looking at convergent mutations, that is, mutations in the same genomic position that occur independently. However, the large number of currently available genomes precludes the efficient use of phylogeny-based techniques. Here, we establish a fast and scalable Topological Data Analysis approach for the early warning and surveillance of emerging adaptive mutations based on persistent homology. It identifies convergent events merely by their topological footprint and thus overcomes limitations of current phylogenetic inference techniques. This allows for an unbiased and rapid analysis of large viral datasets, which can help to develop alert systems to monitor mutations of concern and guide experimentalists to focus their studies on specific circulating variants.
14h45 - 15h30 Nelle Varoquaux (France)
TIMC - Grenoble
Inference of genome 3D architecture by modeling overdispersion of Hi-C data
The spatial and temporal organization of the 3D structure of chromosomes is thought to have an important role in genomic function, but is poorly understood. For example, there is a relative paucity of specific transcription factors, and an abundance of chromatin remodeling enzymes in the deadly human parasite P. falciparum. This points towards the involvement of global and local chromatin structure to control gene expression. Advances in chromosome conformation capture (3C) technologies, initially developed to assess interactions between specific pairs of loci, allow one to simultaneously measure multiple contacts on a genome scale, paving the way for more systematic and genome-wide analysis of the 3D architecture of the genome. These new Hi-C techniques result in a genome-wide contact map, a matrix indicating the contact frequency between pairs of loci.
We address the challenge of inferring a consensus 3D model of genome architecture from Hi-C data. Existing approaches most often rely on a two-step algorithm: first convert the contact counts into distances, then optimize an objective function akin to multidimensional scaling (MDS) to infer a 3D model. Other approaches use maximum likelihood, modeling the contact counts between two loci as a Poisson random variable whose intensity is a decreasing function of the distance between them. However, a Poisson model of contact counts implies that the variance of the data is equal to the mean, a relationship that is often too restrictive to properly model count data. We propose a new model, called Pastis-NB, where we replace the Poisson model of contact counts by a negative binomial one, which is parametrized by a mean and a separate dispersion parameter. The dispersion parameter allows the variance to be adjusted independently from the mean, thus better modeling overdispersed data.
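The difference between the two count models can be made concrete with a small numerical sketch. It is not the Pastis-NB implementation: it assumes one common negative binomial parametrization, with mean mu and dispersion r so that the variance is mu + mu^2 / r, and it uses made-up counts and a simple method-of-moments estimate of r.

```python
import math
import statistics


def poisson_logpmf(k, mu):
    """Log-probability of count k under a Poisson with mean mu (variance = mu)."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def negbin_variance(mu, r):
    """Variance under the assumed parametrization: mu + mu^2 / r."""
    return mu + mu * mu / r


def negbin_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu, dispersion r."""
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )


# Hypothetical overdispersed contact counts for loci pairs at a similar distance.
counts = [0, 1, 2, 3, 5, 8, 20, 41]
mu = statistics.mean(counts)
var = statistics.variance(counts)
r = mu * mu / (var - mu)  # method-of-moments estimate, valid only when var > mu

poisson_ll = sum(poisson_logpmf(k, mu) for k in counts)
negbin_ll = sum(negbin_logpmf(k, mu, r) for k in counts)
print(f"mean={mu:.1f} variance={var:.1f}")
print(f"Poisson log-likelihood: {poisson_ll:.2f}")
print(f"Negative binomial log-likelihood: {negbin_ll:.2f}")
```

On these assumed counts the sample variance is far larger than the mean, so the negative binomial fits better; as r grows, the negative binomial approaches the Poisson model.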
15h30 – 16h00 Break
16h00 – 16h45 Hugo Lavenant (Italy)
Bocconi University - Milan
Using optimal transport for trajectory inference
I will present work in which we devise a theoretical framework and a numerical method to infer trajectories of a stochastic process from samples of its temporal marginals. This problem arises in the analysis of single-cell RNA-sequencing data, which provide high-dimensional measurements of cell states but cannot track the trajectories of the cells over time. The central idea is to use optimal transport to couple different temporal marginals. We provide guarantees about when this method can recover the ground truth, in particular in a regime of sparse data where each marginal is poorly known. From a practical point of view, our method, (Global) Waddington-OT, boils down to a smooth convex optimization problem which can be solved efficiently. This work is done in collaboration with Stephen Zhang, Young-Heon Kim and Geoffrey Schiebinger, from UBC.
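The idea of coupling two temporal marginals with optimal transport can be illustrated with entropy-regularized transport solved by Sinkhorn iterations. This is a swapped-in, simplified technique and not the (Global) Waddington-OT method itself: it couples only two time points, and the 1-D cell states, the `sinkhorn_coupling` helper, the squared-distance cost and the regularization value are all assumptions.

```python
import math


def sinkhorn_coupling(xs, ys, epsilon=0.5, n_iter=500):
    """Entropic optimal transport plan between uniform measures on points xs and ys."""
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # marginal at the earlier time point
    b = [1.0 / m] * m  # marginal at the later time point
    # Gibbs kernel built from a squared-distance cost.
    kernel = [[math.exp(-((x - y) ** 2) / epsilon) for y in ys] for x in xs]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical 1-D cell states sampled at two time points (different cells each time).
early = [0.0, 1.0, 5.0]
late = [0.2, 1.1, 5.3]
plan = sinkhorn_coupling(early, late)
for row in plan:
    print([round(p, 3) for p in row])
```

Each row of the plan says how the mass of one early cell is distributed over the late cells, which is the sense in which the coupling suggests likely trajectories without tracking any individual cell.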
16h45 – 17h30 Patrick Stumpf (Netherlands)
RWTH Aachen University
Illuminating the dynamics of stem cell differentiation and ageing
The molecular mechanisms underpinning stem cell self-renewal and differentiation are complex and hard to dissect using conventional statistical approaches. Machine learning can reduce this complexity and help extract information on cellular dynamics from single-cell data. I will present examples of how such information can be used to understand stem cell ageing and the regulatory control of cellular differentiation.