GeneVault sources the research cohorts others can't, then turns multimodal genomic, clinical, imaging, and biospecimen-linked data into discovery-ready assets: through custom cohort curation today, and through the Global BioIntelligence Platform, now in development.
The entire rest of the world is just 4, down from 11 only five years ago, and still falling.
Everything GeneVault delivers draws on one growing data network that combines large, diverse population cohorts with specialist disease-specific datasets. It is the shared data source behind both our custom cohort curation and our AI models. It is the asset a competitor cannot simply buy, and it is what lets us source the right data for almost any study, wherever that data lives.
Large, diverse, multimodal cohorts across Africa, Asia, the Middle East, and Latin America, regions that hold some of the world's most genetically diverse yet least-studied populations.
Rare-disease and therapeutic-area-specific datasets curated wherever they exist, including uniquely sourced cohorts from the US and Europe, so a study can reach the exact population, condition, or modality it needs.
Whole-genome and whole-exome sequencing across diverse ancestries.
Proteomics, transcriptomics, and further omics layers in phased rollout.
Linked electronic health records, treatment timelines, and outcomes over time.
Whole-slide images, radiology, and ophthalmic imaging with matched metadata.
Sample-backed cohorts that enable prospective assays where a study needs them.
Rich clinical phenotyping to power stratification and target validation.
GeneVault is part of the Count Us In™ movement, working to bring under-represented data out of silos and into global science.
GeneVault designs and delivers custom research cohorts for biopharma, biotech, AI, and academic teams. Drawing on our Global BioIntelligence network, we build each cohort to your protocol's specific inclusion, modality, and endpoint requirements. That work spans populations, diseases, and therapeutic areas, from large, diverse cohorts in under-represented regions to specialist datasets uniquely sourced in the US and Europe. Where a programme requires data that does not yet exist, we can commission new sample collection and prospective assays through the network.
Representative examples of the work we do
The Global BioIntelligence™ Platform is the AI built on the Network: it extends GeneVault's ability to transform representative biology into discovery-ready intelligence. Powered by a foundation model trained across diverse populations, diseases, and multimodal datasets, it is being designed to uncover biological insights, validate discoveries, and help researchers build a more complete understanding of human health.
The model is trained across the network using federated learning. Institutions keep stewardship of their data, and only model gradients move between sites. This is how GeneVault builds representative intelligence without asking communities to give up control of their data.
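To make the idea concrete, here is a minimal sketch of one common form of federated training, in which each site computes a gradient on its own data and a coordinator averages those gradients. It is an illustration only, not GeneVault's implementation: the function names (`local_gradient`, `aggregate`, `federated_round`, `train`), the toy linear model, and the example data are all assumptions made for this sketch, and a production system would typically add protections such as secure aggregation on top.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, float]]  # (features, target) rows held by one site


def local_gradient(weights: Vector, data: Dataset) -> Vector:
    """Mean squared-error gradient computed on one site's private rows."""
    grad = [0.0] * len(weights)
    for features, target in data:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for j, x in enumerate(features):
            grad[j] += 2.0 * error * x / len(data)
    return grad


def aggregate(gradients: Sequence[Vector], sizes: Sequence[int]) -> Vector:
    """Average the site gradients, weighted by each site's row count."""
    total = sum(sizes)
    return [
        sum(g[j] * n for g, n in zip(gradients, sizes)) / total
        for j in range(len(gradients[0]))
    ]


def federated_round(weights: Vector, sites: Sequence[Dataset], lr: float) -> Vector:
    """One round: only gradients leave the sites, never the raw rows."""
    grads = [local_gradient(weights, site) for site in sites]
    avg = aggregate(grads, [len(site) for site in sites])
    return [w - lr * g for w, g in zip(weights, avg)]


def train(sites: Sequence[Dataset], dims: int, rounds: int = 500, lr: float = 0.05) -> Vector:
    """Run several federated rounds starting from zero weights."""
    weights = [0.0] * dims
    for _ in range(rounds):
        weights = federated_round(weights, sites, lr)
    return weights


if __name__ == "__main__":
    # Hypothetical toy data following y = 1 + 2x, split across two sites.
    # The first feature is a constant 1.0 that acts as the intercept term.
    site_a = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0)]
    site_b = [([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0), ([1.0, -1.0], -1.0)]
    print(train([site_a, site_b], dims=2))
```

In this toy setup the coordinator only ever sees the per-site gradient vectors, and the learned weights move toward the intercept and slope that generated the data. Weighting by row count makes the averaged gradient match what a single pooled dataset would have produced, which is one design choice among several.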
When biomedical data reflects only a narrow slice of humanity, and only the conditions that slice happens to have, therapies and diagnostics fail to generalise. GeneVault sources for representation across every axis that matters.
Diverse genetic backgrounds where most reference data reflects a single dominant population.
Balanced cohorts in areas like cardiometabolic and autoimmune disease, where trials have historically skewed.
Hard-to-reach conditions, from lysosomal storage disorders to inherited kidney disease, too often left out of large datasets.
Deep, condition-specific cohorts spanning oncology subtypes, immunology, and beyond.
Models built on narrow data do not generalise. Representative data fixes that at the source.
Diverse genomes and under-studied conditions carry unique signals, protective variants, and novel targets that remain invisible in narrow datasets.
Diversity, governance, and community participation compound into a true discovery platform, not just data assets.
Aligned with leading data standards, and compliant by design
GeneVault® transforms representative biology into discovery-ready assets through custom cohort curation, AI-powered discovery, and global data partnerships. Powered by a growing network spanning 54M+ patient lives across 50+ countries, Global BioIntelligence helps bring more of the world's biology into biomedical discovery.
Cambridge Innovation Center
1 Broadway, Kendall Square
Cambridge, MA 02142