Zero-Shot Foundation Model for a Universal Gene Expression Atlas of Human Tissue: Unveiling Clinically Relevant Cell States and Disease-Specific Spatial Niches

Statistical Bioinformatics Seminar
Dr Xiaomeng Wan, HKUST

March 3 @ 1:00 pm - 2:00 pm

The rapid accumulation of single-cell datasets from diverse organs and tissues presents significant opportunities for understanding complex diseases, yet challenges remain in analyzing this wealth of information effectively and in extending it to other data types, including spatial transcriptomics (ST) and bulk RNA-seq datasets. Here, we introduce UniGeneX, a generative single-cell foundation model designed to reconstruct a universal gene expression profile from extensive transcriptomic data. UniGeneX minimizes batch effects while preserving biological variability, enabling the identification of shared gene programs across tumor samples. By providing consistent cell type labels and leveraging biological patterns from training data, UniGeneX facilitates the discovery of disease-specific cell niches in ST data and of key cell states associated with clinical outcomes. Our model addresses limitations of current single-cell foundation models by focusing on a universal gene expression framework rather than merely learning embeddings for downstream tasks. We demonstrate the effectiveness of UniGeneX in characterizing disease-relevant cell states in glioma and idiopathic pulmonary fibrosis (IPF), ultimately advancing our understanding of the mechanisms underlying complex diseases.

Find out more about the Statistical Bioinformatics seminar series

Dr Xiaomeng Wan

Dr Xiaomeng Wan is currently a Postdoctoral Associate in the Department of Mathematics at the Hong Kong University of Science and Technology (HKUST), working under the guidance of Prof Can Yang, under whose mentorship she also earned her PhD at HKUST. Her research centres on statistical machine learning and deep learning, particularly their applications to the analysis of transcriptomics datasets.