Model card
What the model is, what it is for, what it is not, how it is measured, and the places it is weak. Written to be checked, not admired.
Research and educational use only. Provenance-1 is not a medical device. It has not been reviewed or cleared by any regulator, is not CLIA or CAP validated, and must not be used to make decisions about a real patient. Every output is an illustrative artifact of a model trained on retrospective public cohorts, not a medical finding. A confident site prediction is not a cancer diagnosis.
Provenance-1 reads one bulk tumor RNA-seq expression profile and estimates the body site the tumor came from, across 25 anatomical sites. It is built to be honest about uncertainty: every call carries a calibrated probability, a candidate set that lets the model abstain when the evidence is ambiguous, and an out-of-distribution check that flags inputs unlike anything it was trained on.
Reflects the model deployed as of June 2026. Numbers trace to the internal model card; see the Validation page for methodology.
Research and education: exploring what a tumor's transcriptome reveals about its tissue of origin, studying calibrated uncertainty, and generating hypotheses on cohorts you already have. It reads expression values only, never identifiable patient data.
Not for clinical, diagnostic, prognostic, or treatment-selection use. It predicts an anatomical site, not a histological diagnosis, stage, or grade. Inputs from other assays (single-cell, microarray, targeted panels), other normalizations, or non-tumor tissue are out of distribution and should not be trusted.
A profile is harmonized to a common reference so different sequencing pipelines are comparable, mapped onto the fixed 3,882-gene panel, and scored by a calibrated ensemble. The raw scores are then turned into a calibrated probability, a conformal candidate set, and a novelty check. The methodology is documented in Docs; the internals are private.
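To make the term "conformal candidate set" concrete, here is a minimal sketch of generic split conformal prediction. It is not Provenance-1's implementation, which is private. The function names, the choice of nonconformity score (one minus the probability of the true site), the `alpha` values, and the commit-or-abstain rule are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold from a held-out calibration set.

    cal_probs:  list of dicts mapping site name -> predicted probability
    cal_labels: list of true site names, same length as cal_probs
    alpha:      target miscoverage rate (0.1 aims for about 90% coverage)
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Nonconformity score (an assumed choice): 1 - probability of the true site.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return 1.0  # too few calibration points: every site stays in the set
    return scores[rank - 1]


def candidate_set(probs, threshold):
    """Every site whose nonconformity score does not exceed the threshold."""
    return sorted(site for site, p in probs.items() if 1.0 - p <= threshold)


def commit_or_abstain(probs, threshold):
    """Hypothetical rule: commit only when exactly one site survives."""
    sites = candidate_set(probs, threshold)
    return sites[0] if len(sites) == 1 else None


# Toy calibration data: two made-up sites, true site always "A".
toy_cal = [{"A": p, "B": 1.0 - p} for p in (0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3)]
toy_threshold = conformal_threshold(toy_cal, ["A"] * len(toy_cal), alpha=0.2)
print(candidate_set({"Lung": 0.55, "Breast": 0.41, "Skin": 0.04}, toy_threshold))
print(commit_or_abstain({"Lung": 0.97, "Breast": 0.02, "Skin": 0.01}, toy_threshold))
```

An ambiguous profile produces a set with several sites, and under the assumed rule the sketch abstains. The coverage guarantee of this construction applies only when new inputs come from the same distribution as the calibration set.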
We lead with the number measured on real held-out patients, not the prettiest one. A higher figure exists on a mixed test set that includes easier external samples; we do not quote it as the headline.
A pool of 17,410 retrospective tumor profiles from open genomic cohorts: primarily GDC (including TCGA), augmented with deduplicated pediatric samples from Treehouse, and with independent cohorts from cBioPortal added so the model generalizes across sequencing platforms. All cohorts are public and retrospective, and contain expression values, not protected health information. The exact source breakdown within the pool is not enumerated in a single artifact, and the pool figure should not be read as a single held-out test set.
These limitations are stated with confidence because the validation was deliberately adversarial. Read them as part of the model, not as a disclaimer.
On real mesothelioma cases the model scores 0% recall and confidently misroutes them. An earlier per-site figure for Pleura and Mediastinum turned out to reflect one external batch's signature, not the biology, and we corrected it. Treat any Pleura and Mediastinum output as unreliable.
Pleura and Mediastinum, Thymus, Esophagus, Skin, and Eye have very few examples (on the order of seventeen each in the relevant held-out evaluation), so their per-site metrics are statistically fragile. The available samples are close to the entire public supply for these tissues, so the data is exhausted at source, and stronger models do not move the number.
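To see why metrics on roughly seventeen examples are fragile, a Wilson score interval gives a sense of the uncertainty. This is a sketch only: the count of 12 correct out of 17 is hypothetical and is not a reported Provenance-1 result, and the 95% level (`z = 1.96`) is an assumed choice.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical example: 12 of 17 held-out cases for one rare site called correctly.
low, high = wilson_interval(12, 17)
print(f"recall 12/17 = {12 / 17:.2f}, 95% interval about ({low:.2f}, {high:.2f})")
```

With seventeen cases the interval spans roughly forty percentage points, so a single misclassified sample can shift the point estimate by about six points without indicating any real change in the model.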
A single tumor sequenced on a different pipeline is the hardest case. Rather than guess, the model abstains on most of those and commits only when it is confident. Inputs that skip the harmonization step will be misclassified.
Calibration and conformal coverage are measured on a held-out split from the same distribution. They do not hold under platform or batch shift. The novelty and input-validity checks exist precisely because of this, and are themselves reference-only, not a validated clinical detector.
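For readers who want to check the two quantities named above on their own held-out split, here is a minimal sketch of how empirical conformal coverage and expected calibration error are commonly computed. The function names, the ten-bin default, and the toy inputs are assumptions for illustration, not the evaluation code behind this card.

```python
def empirical_coverage(candidate_sets, true_sites):
    """Fraction of samples whose true site appears in the candidate set."""
    if not true_sites or len(candidate_sets) != len(true_sites):
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum(1 for sites, truth in zip(candidate_sets, true_sites) if truth in sites)
    return hits / len(true_sites)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# Toy inputs, not real model output.
toy_sets = [["Lung"], ["Breast", "Lung"], ["Skin"], []]
toy_truth = ["Lung", "Breast", "Eye", "Lung"]
print(empirical_coverage(toy_sets, toy_truth))
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

Running the same two functions separately on an in-distribution split and on a batch from a different platform is one way to observe the degradation under shift that this section describes.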
All evaluation is retrospective on public cohorts. There is no prospective study, no independent clinical-site validation, and no subgroup-equity audit. The training cohorts are not characterized here for demographic balance, and performance on underrepresented groups is unmeasured and may be worse.
Provenance-1 is invite-only during the research phase, and access is granted under a confidentiality agreement. See how it is validated, read the safety page, then apply for access.