Towards multimodal foundation models in molecular cell biology

Cui, Haotian; Tejada-Lapuerta, Alejandro; Brbić, Maria; Saez-Rodriguez, Julio; Cristea, Simona; Goodarzi, Hani; Lotfollahi, Mohammad; Theis, Fabian J.; Wang, Bo

doi:10.1038/s41586-025-08710-y

Perspective
Published: 16 April 2025

Nature volume 640, pages 623–633 (2025)

Abstract

The rapid advent of high-throughput omics technologies has driven exponential growth in biological data, often outpacing our ability to derive molecular insights. Large language models have shown a way out of this data deluge in natural language processing by integrating massive datasets into a joint model with manifold downstream use cases. Here we envision developing multimodal foundation models, pretrained on diverse omics datasets, including genomics, transcriptomics, epigenomics, proteomics, metabolomics and spatial profiling. These models are expected to exhibit unprecedented potential for characterizing the molecular states of cells across a broad continuum, thereby facilitating the creation of holistic maps of cells, genes and tissues. Context-specific transfer learning of the foundation models can empower diverse applications from novel cell-type recognition, biomarker discovery and gene regulation inference, to in silico perturbations. This new paradigm could launch an era of artificial intelligence-empowered analyses, one that promises to unravel the intricate complexities of molecular cell biology, to support experimental design and, more broadly, to profoundly extend our understanding of life sciences.

**Fig. 1: Multimodal analytical technologies and their applications.**

**Fig. 2: Diverse data context in pretraining and iterative improvement by lab-in-the-loop.**

**Fig. 3: Computational components of multimodal foundation models.**

**Fig. 4: Potential training tasks and challenges.**

Acknowledgements

We thank our collaborators across institutions for their support and insightful contributions throughout the development of the manuscript.

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Haotian Cui & Bo Wang
Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
Haotian Cui & Bo Wang
Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada
Haotian Cui & Bo Wang
Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
Alejandro Tejada-Lapuerta & Fabian J. Theis
School of Computing, Information and Technology, Technical University of Munich, Munich, Germany
Alejandro Tejada-Lapuerta & Fabian J. Theis
School of Computer and Communication Sciences, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
Maria Brbić
School of Life Sciences, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
Maria Brbić
Swiss Institute of Bioinformatics, Lausanne, Switzerland
Maria Brbić
Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Heidelberg, Germany
Julio Saez-Rodriguez
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
Julio Saez-Rodriguez
Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
Simona Cristea
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Simona Cristea
Arc Institute, Palo Alto, CA, USA
Hani Goodarzi
Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA
Hani Goodarzi
Wellcome Sanger Institute, Cambridge, UK
Mohammad Lotfollahi
Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
Mohammad Lotfollahi
TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
Fabian J. Theis
Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Bo Wang

Contributions

H.C., B.W. and F.J.T. conceptualized the study. H.C. led the development of the manuscript and drafted the initial version. H.C., A.T.-L., M.B., J.S.-R., S.C., H.G., M.L. and F.J.T. contributed to refining the concepts and methodology. A.T.-L., M.B. and H.G. provided substantial input on the sections related to single-cell data and opportunities. J.S.-R. and S.C. contributed to discussions on gene function prediction and regulatory network reconstruction. F.J.T. and B.W. supervised the overall research direction and advised on critical revisions. All authors reviewed, edited and approved the final manuscript.

Corresponding authors

Correspondence to Fabian J. Theis or Bo Wang.

Ethics declarations

Competing interests

F.J.T. consults for Immunai, CytoReason, Cellarity, BioTuring and Genbio.AI, and has an ownership interest in Dermagnostix GmbH and Cellarity. B.W. serves as a scientific advisor to Shift Bioscience, Deep Genomics and Vevo Therapeutics, and acts as a consultant for Arsenal Bioscience. H.G. has an ownership interest in Vevo Therapeutics, and is an advisor to Verge Genomics and Deep Forest Biosciences. J.S.-R. reports funding from GSK, Pfizer and Sanofi, and fees and/or honoraria from Travere Therapeutics, Stadapharm, Astex, Owkin, Pfizer, Grunenthal, Moderna and Tempus. M.L. owns interests in Relation Therapeutics and AIVIVO, and is a scientific cofounder and part-time employee at AIVIVO. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Marinka Zitnik and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cui, H., Tejada-Lapuerta, A., Brbić, M. et al. Towards multimodal foundation models in molecular cell biology. Nature 640, 623–633 (2025). https://doi.org/10.1038/s41586-025-08710-y

Received: 17 October 2023
Accepted: 29 January 2025
Published: 16 April 2025
Issue Date: 17 April 2025
DOI: https://doi.org/10.1038/s41586-025-08710-y
