Single-cell RNA sequencing (scRNA-seq) enables high-resolution analysis of
cellular heterogeneity, but its complexity, marked by high
dimensionality, sparsity, and batch effects, poses major computational
challenges. Transformer-based models have made significant advances in this
domain but are often limited by their quadratic complexity and suboptimal
handling of long-range dependencies. In this work, we introduce GeneMamba, a
scalable and efficient foundation model for single-cell transcriptomics built
on state space modeling. Leveraging the Bi-Mamba architecture, GeneMamba
captures bidirectional gene context with linear-time complexity, offering
substantial computational gains over transformer baselines. The model is
pretrained on nearly 30 million cells and incorporates biologically informed
objectives, including pathway-aware contrastive loss and rank-based gene
encoding. We evaluate GeneMamba across diverse tasks, including multi-batch
integration, cell type annotation, and gene-gene correlation, demonstrating
strong performance, interpretability, and robustness. These results position
GeneMamba as a practical and powerful alternative to transformer-based methods,
advancing the development of biologically grounded, scalable tools for
large-scale single-cell data analysis.
Download PDF:
2504.16956v1