Task Objective¶
This task focuses on predicting Bacillus Calmette-Guérin (BCG) response subtypes (BRS1, BRS2, BRS3) from H&E-stained histopathology slides of patients with high-risk non-muscle-invasive bladder cancer (HR-NMIBC). These subtypes are defined using a validated biomarker signature [1] and are associated with response to BCG therapy. Final evaluation will focus on the binary classification BRS3 vs. BRS1/2, as this distinction is clinically relevant for treatment decision-making.
The schematic for this task illustrates a multi-modal deep learning pipeline.
Evaluation Metric¶
The F1 score will be used to evaluate classification performance at the image level against the reference labels derived from molecular analysis; it combines precision and recall into a single metric. In addition, the AUROC will be used to assess the model's overall discriminative power across decision thresholds. Together, these metrics provide a robust evaluation framework for classifying BCG response subtypes (BRS). The pipeline will be openly available.
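As a rough illustration of the two metrics, the sketch below computes F1 and AUROC for the binary BRS3 vs. BRS1/2 task in plain Python. The function names, the encoding of BRS3 as the positive class, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration; the official evaluation code may differ.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # also avoids 0/0 when nothing is predicted or labeled positive
    return 2 * tp / (2 * tp + fp + fn)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = BRS3, 0 = BRS1/2 (assumed encoding).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
preds = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

print(f"F1 = {f1_score(labels, preds):.3f}, AUROC = {auroc(labels, scores):.3f}")
```

Note that F1 depends on the chosen threshold, whereas AUROC is computed from the raw scores and is threshold-independent.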
Data Details¶
Training Data¶
• 🧠 Histopathology: A single H&E-stained whole slide image (WSI) per patient, with a pixel spacing of 0.25 µm/pixel at its highest resolution. Note that this WSI shows either an adjacent section of the H&E slide used for bulk RNA-seq, the same H&E slide with a punched cavity in the tissue section, or an H&E slide of another tumor from the same patient.
• 🧠 Histopathology: Binary tissue mask outlining the tissue section
• 📋 Clinical Data: Clinical history, tumor characteristics, treatment details, and BRS subtype labels derived from bulk RNA-sequencing using a validated gene expression signature [1]
Feature | Type / Values | Description |
---|---|---|
age | Integer (years) | Age of the patient in years |
sex | Male / Female | Biological sex of the patient |
smoking | Yes / No | Smoking history |
tumor | Primary / Recurrence | Indicates whether the tumor is primary or recurrent |
stage | TaHG / T1HG / T2HG | Tumor stage: Ta (inner lining), T1 (connective tissue), T2 (muscle invasion); all high-grade |
substage | T1m / T1e | T1m: ≤ 0.5 mm invasion; T1e: > 0.5 mm invasion |
grade | G2 / G3 | G2: moderately differentiated; G3: poorly differentiated |
reTUR | Yes / No | Re-transurethral resection (TUR) performed before BCG induction |
LVI | Yes / No | Lymphovascular invasion observed on H&E slide |
variant | UCC / UCC + Variant | Urothelial carcinoma alone or with variant histology |
EORTC | High risk / Highest risk | European Organization for Research and Treatment of Cancer (EORTC) risk classification |
no_instillations | Integer | Total number of BCG instillations. "-1" indicates missing data. |
Reference Standard | ||
BRS | BRS1 / BRS2 / BRS3 | Biomarker-derived BCG response subtype from RNA-seq |
Additional information (not used in evaluation/test) | ||
progression | 0 / 1 | Progression to advanced disease (1 = true, 0 = false) |
HG_recur_BCG_failure | 0 / 1 | BCG failure (1 = true, 0 = false) |
time_to_prog_or_FUend | Float (months) | Time to progression or end of follow-up in months |
time_to_HG_recur_or_FUend | Float (months) | Time to high-grade recurrence or end of follow-up in months |
time_to_FUend | Float (months) | Time to end of follow-up in months |
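To make the table concrete, here is a minimal sketch of turning one clinical record into model-ready values. It assumes each `_CD.json` file is a flat JSON object whose keys match the feature names above, which this page does not confirm. `binary_brs_label` and `parse_clinical_record` are hypothetical helpers, and the example record is made up.

```python
import json
from typing import Any, Dict, Optional


def binary_brs_label(brs: str) -> int:
    """Map the three-class reference standard to the binary target: BRS3 -> 1, BRS1/BRS2 -> 0."""
    if brs not in {"BRS1", "BRS2", "BRS3"}:
        raise ValueError(f"unexpected BRS value: {brs!r}")
    return int(brs == "BRS3")


def parse_clinical_record(text: str) -> Dict[str, Any]:
    """Extract a few fields from a clinical record (assumed flat JSON with the table's keys)."""
    record = json.loads(text)
    # The table states that "-1" marks a missing instillation count.
    raw_count = int(record["no_instillations"])
    n_instillations: Optional[int] = None if raw_count == -1 else raw_count
    return {
        "age": int(record["age"]),
        "no_instillations": n_instillations,
        "label": binary_brs_label(record["BRS"]),
    }


# Made-up record for illustration only.
example = '{"age": 71, "sex": "Male", "no_instillations": -1, "BRS": "BRS3"}'
parsed = parse_clinical_record(example)
print(parsed)
```

Treating "-1" as an explicit missing value, rather than as a number, keeps it from being mistaken for a real instillation count during training. The follow-up fields listed under "Additional information" are not available at test time and should not be used as model inputs.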
Data versions¶
v1¶
- 132 paired multimodal training data (_HE.tif, _HE_mask.tif, and _CD.json)
- Contains incorrect histopathology slides and/or tissue masks (_HE.tif, _HE_mask.tif):
  - Corrupted: 2A_024
  - Incorrect spacings: 2A_017, 2A_031, 2A_042, 2A_050, 2A_141, 2A_143, and 2A_157
  - Scan is out of focus, resulting in failed tissue segmentation: 2A_025 (blank mask)
v2¶
- 182 paired multimodal training data (_HE.tif, _HE_mask.tif, and _CD.json). Note that all the clinical data (_CD.json) files have been updated to reflect the features used in validation and test.
- These slides are fixed: 2A_024, 2A_017, 2A_031, 2A_042, 2A_050, 2A_141, 2A_143, and 2A_157. The 2A_025 slide, which is out of focus, however, cannot be fixed.
- Added extra materials to support model training:
Download Training Data¶
Instruction (latest version)¶
- Install the AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
- Bucket name: s3://chimera-challenge/v2/task2/
- Command line (the --recursive flag is required for aws s3 cp to copy every object under the prefix): aws s3 cp --no-sign-request --recursive s3://chimera-challenge/v2/task2/ <destination_path>
Bucket structure (latest version)¶
v2/
  task2/
    data/
      task2_quality_control.csv
      {patient_id}/
        {patient_id_CD.json}
        {patient_id_HE.tif}
        {patient_id_HE_mask.tif}
    features/
      coordinates/
        {patient_id_HE.npy}
        {patient_id_HE.npy}
        {patient_id_HE.npy}
      features/
        {patient_id_HE.pt}
        {patient_id_HE.pt}
        {patient_id_HE.pt}
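After downloading, a quick consistency check can confirm that each patient folder holds the three expected files. The sketch below assumes the local copy mirrors the data/ part of the bucket and that files are named `<patient_id>_CD.json`, `<patient_id>_HE.tif`, and `<patient_id>_HE_mask.tif`. `find_complete_cases` is a hypothetical helper, and excluding 2A_025 by default is an optional choice based on the out-of-focus note in the version history, not an official requirement.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed file-name suffixes, following the bucket structure listed above.
EXPECTED_SUFFIXES = ("_CD.json", "_HE.tif", "_HE_mask.tif")


def find_complete_cases(data_dir: Path, exclude: Iterable[str] = ("2A_025",)) -> List[str]:
    """Return sorted patient IDs whose folder contains all three expected files."""
    excluded = set(exclude)
    cases: List[str] = []
    # Only sub-directories are patient folders; loose files such as the QC CSV are skipped.
    for patient_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        pid = patient_dir.name
        if pid in excluded:
            continue
        if all((patient_dir / f"{pid}{suffix}").is_file() for suffix in EXPECTED_SUFFIXES):
            cases.append(pid)
    return cases
```

A typical call would be `find_complete_cases(Path("<destination_path>") / "data")`, where `<destination_path>` is the download target used in the AWS command; pass `exclude=()` to keep every case.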