TranSYS ESR 05 Andrew Walakira, at the University of Ljubljana, announces latest publication on ‘Detecting gene–gene interactions from GWAS using diffusion kernel principal components’
Genome-wide association studies (GWAS) yield genotype (SNP) data. Analysing this type of data is challenging for several reasons: the data are high-dimensional and non-linear, SNPs depend on one another, the multiplicity correction burden is heavy, and the results are difficult to interpret.
The main focus of this work is to present a new approach to GWAS data analysis. Instead of looking at SNP-SNP interactions, we aggregate information across SNPs within each gene using kernel PCA with a diffusion kernel. This captures the non-linearity in the data while reducing its dimensionality without filtering out SNPs, which in turn reduces both the computational burden and the multiplicity correction burden. Furthermore, using diffusion kernels on trait-informed graphs for each gene leads to “stronger” gene modules that serve as new units of analysis for epistasis analysis or trait prediction.
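To make the idea of a per-gene summary concrete, here is a minimal sketch of kernel PCA with a diffusion kernel for a single gene. It is an illustration under stated assumptions, not the implementation from the publication: the function names (`diffusion_kernel`, `gene_kernel_pcs`), the parameter `beta`, the toy data, and the way the diffusion kernel over SNPs is turned into a kernel between individuals are all hypothetical. In particular, the SNP graph here is weighted by absolute pairwise SNP correlation as a stand-in, whereas the paper uses trait-informed graphs.

```python
import numpy as np


def diffusion_kernel(adjacency, beta=1.0):
    """Return exp(-beta * L) for the graph Laplacian L = D - A (A symmetric)."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # L is symmetric, so the matrix exponential follows from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


def gene_kernel_pcs(genotypes, beta=1.0, n_components=1):
    """Summarise one gene's SNPs (n individuals x p SNPs) as kernel PC scores."""
    centered = genotypes - genotypes.mean(axis=0)

    # Assumption: SNP graph weighted by absolute correlation between SNPs.
    # (The paper builds trait-informed graphs instead.)
    adjacency = np.abs(np.nan_to_num(np.corrcoef(genotypes, rowvar=False)))
    np.fill_diagonal(adjacency, 0.0)

    # Assumption: kernel between individuals = genotypes smoothed over the SNP graph.
    k = centered @ diffusion_kernel(adjacency, beta) @ centered.T

    # Standard kernel PCA: double-centre the kernel, then eigendecompose.
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    eigvals, eigvecs = np.linalg.eigh(h @ k @ h)
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy data: 50 individuals, 8 SNPs in one hypothetical gene, coded 0/1/2.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(50, 8)).astype(float)
gene_scores = gene_kernel_pcs(toy_genotypes, beta=0.5, n_components=2)
print(gene_scores.shape)
```

Each gene then contributes a few score columns instead of all of its SNP columns. Because the number of pairwise interaction tests among m units is m(m-1)/2, testing pairs of genes rather than pairs of SNPs shrinks the number of tests to correct for.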
Our approach is computationally efficient, takes SNP data as input but yields gene-level results that are easy to interpret, and produces gene summaries that can be used as integrative modules of analysis. The full publication is available at: https://doi.org/10.1186/s12859-022-04580-7.
About the author: