HuBMAP Data Portal: A Resource for Multi-Modal Spatial and Single-Cell Data of Healthy Human Tissues

The HuBMAP Data Portal is the public face of the Human BioMolecular Atlas Program — the place where the data actually lands and where the broader research community can access it. This preprint describes the portal’s architecture, capabilities, and current scale.

As of October 2025, the portal holds 5,032 datasets spanning 22 data types across 27 organ classes from 310 donors. That’s not a static archive: it’s a queryable, visualizable, analysis-ready resource.

What the Portal Does

The portal goes well beyond file storage. Key capabilities:

Integrated Jupyter workspaces — run analysis directly against portal data without downloading it
Interactive visualization for over 1,500 datasets
Standardized processing pipelines — all data processed through consistent workflows so results are comparable across labs and technologies
Metadata-driven search — find datasets by organ, donor demographics, assay type, or molecular target
Community contributions — datasets from external labs can be deposited and made publicly available
Bulk download for large-scale computational studies
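
To make the metadata-driven search idea concrete, here is a minimal sketch of filtering dataset records by organ, assay type, and donor age. It is illustrative only: the `DatasetRecord` fields, the `search` function, and the toy records are all hypothetical and do not reflect the portal's actual schema or API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical, simplified metadata record (not the portal's schema)."""
    dataset_id: str
    organ: str
    assay_type: str
    donor_age: int


def search(
    records: Iterable[DatasetRecord],
    organ: Optional[str] = None,
    assay_type: Optional[str] = None,
    min_donor_age: Optional[int] = None,
    max_donor_age: Optional[int] = None,
) -> List[DatasetRecord]:
    """Return records matching every filter that was supplied.

    Filters left as None are ignored; string filters are case-insensitive.
    """
    matches = []
    for rec in records:
        if organ is not None and rec.organ.lower() != organ.lower():
            continue
        if assay_type is not None and rec.assay_type.lower() != assay_type.lower():
            continue
        if min_donor_age is not None and rec.donor_age < min_donor_age:
            continue
        if max_donor_age is not None and rec.donor_age > max_donor_age:
            continue
        matches.append(rec)
    return matches


# Made-up records for illustration only; these are not real portal datasets.
TOY_RECORDS = [
    DatasetRecord("toy-1", "kidney", "bulk RNA-seq", 34),
    DatasetRecord("toy-2", "spleen", "multiplexed imaging", 52),
    DatasetRecord("toy-3", "kidney", "spatial transcriptomics", 61),
]

if __name__ == "__main__":
    # Kidney datasets from donors aged 40 or older.
    for rec in search(TOY_RECORDS, organ="Kidney", min_donor_age=40):
        print(rec.dataset_id, rec.assay_type)
```

Combining filters with a logical AND mirrors how faceted search narrows results: each additional facet can only shrink the result set. A real deployment would push these filters down to a search index rather than scanning records in memory.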

What Makes This Hard

Building a data portal for a program like HuBMAP is not a straightforward engineering problem. The data is heterogeneous: 22 distinct data types, from bulk RNA-seq to multiplexed imaging to spatial transcriptomics. The scale is also substantial, at more than 5,000 datasets from 310 donors. Because datasets are deposited by different labs using different instruments and protocols, keeping them consistently processed and interoperable requires continuous pipeline development and infrastructure maintenance.

My work at the Pittsburgh Supercomputing Center contributes directly to that infrastructure. PSC is a core node of the HuBMAP consortium, and the computational resources and engineering that PSC provides underpin the portal’s ability to store, process, and serve data at this scale.

