Task Objective

This task focuses on predicting Bacillus Calmette-Guérin (BCG) response subtypes (BRS1, BRS2, BRS3) from H&E-stained histopathology slides of patients with high-risk non-muscle-invasive bladder cancer (HR-NMIBC). These subtypes are defined using a validated biomarker signature [1] and are associated with response to BCG therapy. Final evaluation will focus on binary classification, BRS3 vs. BRS1/2, as this distinction is clinically relevant for treatment decision-making. The accompanying schematic illustrates a multimodal deep learning pipeline.
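
As a minimal sketch of the binary target described above, the three subtypes can be collapsed into two classes. Treating BRS3 as the positive class (1) is an assumption; the task description only states that evaluation is BRS3 vs. BRS1/2. The names `BRS_TO_BINARY` and `binarize_brs` are illustrative.

```python
# Collapse the three BRS subtypes into the binary evaluation target.
# Assumption: BRS3 is the positive class (1), BRS1 and BRS2 the negative class (0).
BRS_TO_BINARY = {"BRS1": 0, "BRS2": 0, "BRS3": 1}


def binarize_brs(label: str) -> int:
    """Map a BRS subtype string to 0 (BRS1/2) or 1 (BRS3)."""
    key = label.strip().upper()
    if key not in BRS_TO_BINARY:
        raise ValueError(f"Unexpected BRS label: {label!r}")
    return BRS_TO_BINARY[key]


labels = ["BRS1", "BRS3", "BRS2", "BRS3"]
binary = [binarize_brs(x) for x in labels]
```

Raising on unknown labels, rather than defaulting to 0, avoids silently mislabeling a case if a clinical file contains an unexpected value.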


Evaluation Metric

The F1 score will be used to evaluate classification performance at the image level against labels derived from molecular analysis; it combines precision and recall into a single metric. In addition, the AUROC is used to assess the model's overall discriminative power across decision thresholds. Together, these metrics provide a robust evaluation framework for classifying BCG response subtypes (BRS). The pipeline will be openly available.
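
To make the two metrics concrete, here is a small dependency-free sketch. It is not the challenge's official evaluation code: the 0.5 decision threshold, the choice of BRS3 as the positive class, and the toy labels and scores are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = BRS3, 0 = BRS1/2 (illustrative values only).
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
auc = auroc(y_true, y_score)
```

F1 depends on the chosen threshold, whereas AUROC is computed from the ranking of scores alone, which is why the two are reported together.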



Data Details

Training Data

• 🧠 Histopathology: A single H&E-stained whole slide image (WSI) per patient, with a spacing of 0.25 µm/pixel at its highest resolution level. Note that this WSI shows either a section adjacent to the H&E slide used for bulk RNA-seq, that same H&E slide with a punched cavity in the tissue section, or an H&E slide of another tumor from the same patient.

• 🧠 Histopathology: Binary tissue mask outlining the tissue section

• 📋 Clinical Data: Containing clinical history, tumor characteristics, treatment details and BRS subtype labels derived from bulk RNA-sequencing using a validated gene expression signature [1]

| Feature | Type / Values | Description |
| --- | --- | --- |
| age | Integer (years) | Age of the patient in years |
| sex | Male / Female | Biological sex of the patient |
| smoking | Yes / No | Smoking history |
| tumor | Primary / Recurrence | Indicates whether the tumor is primary or recurrent |
| stage | TaHG / T1HG / T2HG | Tumor stage: Ta (inner lining), T1 (connective tissue), T2 (muscle invasion); all high-grade |
| substage | T1m / T1e | T1m: ≤ 0.5 mm invasion; T1e: > 0.5 mm invasion |
| grade | G2 / G3 | G2: moderately differentiated; G3: poorly differentiated |
| reTUR | Yes / No | Re-transurethral resection (TUR) performed before BCG induction |
| LVI | Yes / No | Lymphovascular invasion observed on H&E slide |
| variant | UCC / UCC + Variant | Urothelial carcinoma alone or with variant histology |
| EORTC | High risk / Highest risk | European Organization for Research and Treatment of Cancer (EORTC) risk classification |
| no_instillations | Integer | Total number of BCG instillations; "-1" indicates missing data |

Reference Standard

| Feature | Type / Values | Description |
| --- | --- | --- |
| BRS | BRS1 / BRS2 / BRS3 | Biomarker-derived BCG response subtype from RNA-seq |

Additional information (not used in evaluation/test)

| Feature | Type / Values | Description |
| --- | --- | --- |
| progression | 0 / 1 | Progression to advanced disease (1 = true, 0 = false) |
| HG_recur_BCG_failure | 0 / 1 | BCG failure (1 = true, 0 = false) |
| time_to_prog_or_FUend | Float (months) | Time to progression or end of follow-up in months |
| time_to_HG_recur_or_FUend | Float (months) | Time to high-grade recurrence or end of follow-up in months |
| time_to_FUend | Float (months) | Time to end of follow-up in months |
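
The clinical features above can be turned into numeric inputs along the following lines. This is a sketch under stated assumptions: it assumes each _CD.json file is a flat JSON object keyed by the feature names in the table, and the example record is invented for illustration and is not real patient data. Only a subset of features is encoded; `encode_clinical` is a hypothetical helper name.

```python
import json

# Yes/No features from the table that map directly to 1.0 / 0.0.
YES_NO_FEATURES = ("smoking", "reTUR", "LVI")


def encode_clinical(record: dict) -> dict:
    """Encode a subset of clinical features numerically; missing values become None."""
    features = {"age": float(record["age"])}
    features["sex_male"] = 1.0 if record["sex"] == "Male" else 0.0
    features["tumor_recurrence"] = 1.0 if record["tumor"] == "Recurrence" else 0.0
    for key in YES_NO_FEATURES:
        features[key] = 1.0 if record[key] == "Yes" else 0.0
    # "-1" marks a missing instillation count; accept it as int or string.
    n = record["no_instillations"]
    features["no_instillations"] = None if n in (-1, "-1") else float(n)
    return features


# Invented record for illustration only (assumed flat key/value layout).
example_json = (
    '{"age": 71, "sex": "Male", "smoking": "Yes", "tumor": "Primary", '
    '"reTUR": "No", "LVI": "No", "no_instillations": -1, "BRS": "BRS3"}'
)
encoded = encode_clinical(json.loads(example_json))
```

Keeping the missing instillation count as `None` instead of -1 prevents a model from interpreting the sentinel as a real count; how to impute it is left open here.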

Data versions

v1
  • 132 paired multimodal training cases (_HE.tif, _HE_mask.tif, and _CD.json)
  • Contains incorrect histopathology slides and/or tissue masks (_HE.tif, _HE_mask.tif):
    • Corrupted: 2A_024
    • Incorrect spacings: 2A_017, 2A_031, 2A_042, 2A_050, 2A_141, 2A_143, and 2A_157
    • Out-of-focus scan, resulting in failed tissue segmentation: 2A_025 (blank mask)
v2
  • 182 paired multimodal training cases (_HE.tif, _HE_mask.tif, and _CD.json). Note that all clinical data files (_CD.json) have been updated to reflect the features used in validation and test.
  • These slides have been fixed: 2A_024, 2A_017, 2A_031, 2A_042, 2A_050, 2A_141, 2A_143, and 2A_157. The 2A_025 slide, which is out of focus, cannot be fixed.
  • Added extra materials to support model training:
    • Feature embeddings (*.pt) and
    • Patch coordinates (*.npy), where both are
      • extracted from the _HE.tif and _HE_mask.tif of each case,
      • using UNI at 0.25 µm/pixel (mpp) resolution with a 224×224 patch size, and
      • using slide2vec


Download Training Data

Instruction (latest version)

  1. Install AWS CLI https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
  2. Bucket name: s3://chimera-challenge/v2/task2/
  3. Command line: aws s3 cp --no-sign-request --recursive s3://chimera-challenge/v2/task2/ <destination_path>
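
For scripted downloads, the command from step 3 can be wrapped as sketched below. The function names and the idea of restricting the download with filters are illustrative assumptions, not part of the official instructions. `--recursive` is included because the source is a prefix rather than a single object, and `--exclude` / `--include` are standard `aws s3 cp` filter options.

```python
import subprocess

BUCKET_PREFIX = "s3://chimera-challenge/v2/task2/"


def build_download_command(destination, include_patterns=None):
    """Build the AWS CLI call; optionally restrict it to files matching the patterns."""
    cmd = ["aws", "s3", "cp", "--no-sign-request", "--recursive",
           BUCKET_PREFIX, str(destination)]
    if include_patterns:
        # Filters apply in order: exclude everything, then re-include the patterns.
        cmd += ["--exclude", "*"]
        for pattern in include_patterns:
            cmd += ["--include", pattern]
    return cmd


def download(destination, include_patterns=None):
    """Run the download; requires the AWS CLI on PATH. Raises if the command fails."""
    subprocess.run(build_download_command(destination, include_patterns), check=True)


# Example: fetch only the clinical JSON files, skipping the large WSIs.
cmd = build_download_command("chimera_task2", ["*_CD.json"])
```

Calling `download("chimera_task2")` without patterns mirrors the full download; passing patterns such as `["*_CD.json"]` is useful for a quick look at the clinical data before committing to the full image set.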

Bucket structure (latest version)

v2/
  task2/
    data/ 
      task2_quality_control.csv 
      {patient_id}/
        {patient_id_CD.json}
        {patient_id_HE.tif}
        {patient_id_HE_mask.tif}
    features/
      coordinates/
        {patient_id_HE.npy}
        {patient_id_HE.npy}
        {patient_id_HE.npy}
      features/
        {patient_id_HE.pt}
        {patient_id_HE.pt}
        {patient_id_HE.pt}
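
Given the layout above, the precomputed features for one case could be located and loaded roughly as follows. Several points are assumptions: file names are taken to follow the pattern `<patient_id>_HE.pt` / `<patient_id>_HE.npy`, the `.pt` file is assumed to hold a single tensor of shape (n_patches, embedding_dim), and the layout of the coordinate array is not specified in the source. The helper names are hypothetical.

```python
from pathlib import Path


def feature_paths(root, patient_id):
    """Return (embedding_path, coordinate_path) for one case under the v2 layout."""
    base = Path(root) / "features"
    return (
        base / "features" / f"{patient_id}_HE.pt",
        base / "coordinates" / f"{patient_id}_HE.npy",
    )


def load_case(root, patient_id):
    """Load patch embeddings and coordinates for one case."""
    # Imported lazily so feature_paths works without these packages installed.
    import numpy as np
    import torch

    emb_path, coord_path = feature_paths(root, patient_id)
    embeddings = torch.load(emb_path, map_location="cpu")  # assumed: one tensor
    coordinates = np.load(coord_path)  # array layout is an assumption
    return embeddings, coordinates


def mean_pool(embeddings):
    """Average patch embeddings (assumed n_patches x dim) into one slide-level vector."""
    return embeddings.mean(dim=0)


emb_path, coord_path = feature_paths("v2/task2", "2A_017")
```

Mean pooling is only the simplest way to obtain a slide-level representation; attention-based aggregation over the patch embeddings is a common alternative and is not prescribed by the task.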


Reference

  1. de Jong FC, Laajala TD, Hoedemaeker RF, Jordan KR, van der Made AC, Boevé ER, van der Schoot DK, Nieuwkamer B, Janssen EA, Mahmoudi T, Boormans JL. Non–muscle-invasive bladder cancer molecular subtypes predict differential response to intravesical Bacillus Calmette-Guérin. Science Translational Medicine. 2023 May 24;15(697):eabn4118.