Kim-Anh Lê Cao
Associate Professor, Statistical Genomics
School of Mathematics and Statistics, The University of Melbourne, Australia
Multivariate integration of multi-omics data
Abstract:
Technological improvements have allowed for the collection of data from different molecular compartments (e.g. gene expression, protein abundance), resulting in multiple ‘omics data sets from the same set of biospecimens or individuals (e.g. transcriptomics, proteomics). We propose to adopt a holistic systems biology approach by statistically integrating these multi-omics data. Such an approach provides improved biological insights compared with traditional single-omics analyses, as it takes into account interactions between omics layers.
Integrating data poses numerous challenges: the data sets are complex and large, each with few samples (< 50) and many molecules (> 10,000), and they are generated using different technologies. We have developed a comprehensive multivariate dimension reduction framework to address some of these challenges in the R package mixOmics. I will give a broad overview of the different methods implemented in the package, and of how we define statistical data integration in this context. I will then illustrate how we applied these approaches to the analysis of different multi-omics studies, ranging from a human newborn study to multi-omics microbiome studies, as well as some work in single-cell multi-omics. Across all these studies, our main goal is to identify a signature composed of biological markers of different types that characterises a specific phenotype or disease status, and thus to better understand the underlying molecular mechanisms of a biological system.
Key references:
Rohart F, Gautier B, Singh A, Lê Cao K-A (2017). mixOmics: an R package for ‘omics feature selection and multiple data integration. PLoS Comput Biol 13(11): e1005752.
Lê Cao K-A, Welham Z (2021). Multivariate Data Integration Using R: Methods and Applications with the mixOmics Package. Chapman & Hall/CRC.
Biography:
A/Prof Kim-Anh Lê Cao develops computational methods, software and tools to interpret big biological data and answer research questions efficiently. Kim-Anh has a mathematical engineering background and graduated with a PhD in statistics from the Université de Toulouse, France. She then moved to Australia to forge her own non-linear career path, first working as a consultant biostatistician at QFAB Bioinformatics, then as a research group leader at the University of Queensland Diamantina Institute, a biomedical research institute. She currently continues her strong research focus at the University of Melbourne. Kim-Anh has secured two consecutive NHMRC fellowships since 2014. In 2019 she received the Australian Academy of Science’s Moran Medal for her contributions to Applied Statistics. She was selected for the international HomewardBound leadership program for women in STEMM, culminating in a trip to Antarctica in 2019, and for the Superstars of STEM program from Science & Technology Australia.