Whole-brain microscopy datasets are enormous — often terabytes per image — and the field has been generating thousands of them. The bottleneck is no longer acquisition; it’s storage, sharing, and making all of that data usable by researchers who weren’t part of the original experiment.
The Brain Image Library (BIL) was built to solve that problem.
What BIL Provides
BIL is a public, persistent repository for brain microscopy data, hosted at the Pittsburgh Supercomputing Center (PSC) and serving the broader neuroscience community. Rather than requiring researchers to download multi-terabyte datasets before they can work with them, BIL provides integrated analysis and visualization tools that let users explore data in place, directly through the repository.
Key capabilities:
- Centralized, persistent storage for whole-brain imaging datasets at micron scale
- In-browser visualization without requiring local downloads
- Integrated analysis tools accessible alongside the data
- Community contribution model — datasets deposited by labs worldwide are findable and reusable by anyone
The Scale of the Problem
Efforts like the BRAIN Initiative Cell Census Network (BICCN) have produced thousands of whole-brain imaging datasets aimed at tracing neuronal circuitry and classifying cell types. Each dataset is large enough that storing, moving, and processing it is a computational task in its own right. Historically, the size and heterogeneity of this data made broad sharing impractical: the infrastructure simply didn’t exist to host it at community scale.
BIL is that infrastructure.
My Contribution
At PSC, I contributed to the development and data engineering that support BIL. Getting a large-scale scientific data repository to actually work (reliably, at scale, and for a diverse user community) requires as much engineering effort as any research project. This paper documents that work and makes the case for community-contributed repositories as a model for how big neuroscience data should be managed.