This dataset is currently in development, so check back soon for updates!
The sample of the Emory Prostate Imaging Project (also known as EPIP) contains 844 prostate MRI exams for 840 patients over the period of 1.5 years.
A full discussion of screening prostate MRI is beyond the scope of this documentation, but we provide a brief primer here. Prostate cancer is uncontrolled growth of cells in the prostate gland. It is the most common non-skin cancer in men in the United States, and the second most common cancer in men worldwide. Individualized strategies for screening and management of localized prostate cancer are needed to prevent progression while also avoiding overtreatment.
In the United States, males are recommended Prostate-Specific Antigen (PSA) testing every 2 to 4 years between the ages of 50 and 69, with individualized strategies for screening for high-risk patients. Decisions on how to move forward with testing, imaging, biopsy and treatment are made in conjunction with clinical decision making.
Following a screening discussion, which may cover family history of prostate cancer, symptoms, current medications, and other risk factors, as well as a digital rectal exam (DRE), a patient may be recommended for PSA testing. A patient with an elevated PSA result returns for retesting of PSA levels after 4-8 weeks. If levels remain elevated, the patient proceeds to Magnetic Resonance Imaging (MRI). If a suspicious lesion (often PI-RADS 3-5) is identified on MRI, a biopsy is often performed. The biopsy may be guided by ultrasound, MRI, or a fusion of both methods. The biopsy results categorize the lesions as benign or grade them by how aggressive the prostate cancer cells appear under a microscope (International Society of Urological Pathology grade). After cancer is confirmed and graded, PSMA PET can be used for prostate cancer staging.
Note: PSMA PET may be used as complementary imaging when MRI does not give a clear result.
Images in the dataset are structured as /patient_ID/study_ID/series_ID/DICOM
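As a minimal sketch of how the folder layout above could be consumed, the snippet below splits a file path into its patient, study, and series components. The helper name `parse_dicom_path`, the `DicomLocation` type, and the example IDs are hypothetical; only the `/patient_ID/study_ID/series_ID/DICOM` ordering comes from this documentation.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class DicomLocation(NamedTuple):
    """Components of a path laid out as /patient_ID/study_ID/series_ID/DICOM."""
    patient_id: str
    study_id: str
    series_id: str
    filename: str


def parse_dicom_path(path: str) -> DicomLocation:
    """Return the last four path components as patient, study, series, file.

    Hypothetical helper: any leading directories (for example a local
    download root) are ignored, so only the trailing four parts are used.
    """
    # Drop the root marker so "/a/b/c" is not mistaken for four components.
    parts = [p for p in PurePosixPath(path).parts if p != "/"]
    if len(parts) < 4:
        raise ValueError(f"expected patient/study/series/file, got: {path!r}")
    patient_id, study_id, series_id, filename = parts[-4:]
    return DicomLocation(patient_id, study_id, series_id, filename)


# Illustrative path with made-up identifiers.
location = parse_dicom_path("/P001/ST01/SE03/000001.dcm")
print(location.patient_id, location.study_id, location.series_id)
```

Taking the trailing four components keeps the helper usable regardless of where the dataset root is mounted locally.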
EPIP consists of one image modality and six primary tables that are used to assign clinical factors and image metadata.
Exam contains all clinical and demographic data collected for an exam. Each row contains information about one exam.
| Feature name | Type | Description |
|---|---|---|
| empi_anon | string | Unique anonymized patient ID, all exams for a patient will have the same ID |
| accession_num | string | Unique accession number of imaging exam |
| pat_age_at_exam | integer | Patient age at exam time |
| race_desc | string | Patient race |
| ethnic_group_desc | string | Patient ethnicity |
| begin_exam_dttm | datetime | Exam start timestamp |
| proc_name | string | Name of imaging procedure |
| ord_clin_ind | string | Clinical indication text from order |
| first_dx | string | First prostate cancer diagnosis noted in EMR |
| diagnosis | string | Primary diagnosis for this exam |
| region | string | Imaging location or region |
| has_cancer_during_exam | boolean | Cancer present/confirmed during exam |
| prior_psma_date | date | Prior PSMA imaging date |
| prior_psma_proc_name | string | Prior PSMA imaging procedure name |
| prior_psma_impression | string | Prior PSMA report impression |
| following_psma_date | date | Next PSMA imaging date |
| following_psma_proc_name | string | Next PSMA imaging procedure name |
| following_psma_impression | string | Next PSMA imaging report impression |
| volume | float | Prostate gland volume (cc) |
| psa_density | float | Prostate-specific antigen level |
| clinical_indication_score | string | Clinical indication score |
| history | string | Clinical history |
| extracted_lesion_no | integer | Number of lesions detected |
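Because `empi_anon` repeats across all exams of the same patient while `accession_num` is unique per exam, the two columns can be used to find patients with more than one exam. The sketch below uses only the standard library and a small inline sample; the sample values are invented for illustration, and the assumption that the Exam table is available as CSV is ours.

```python
import csv
import io
from collections import Counter

# Invented sample rows; only the column names come from the Exam table.
SAMPLE_EXAM_CSV = """empi_anon,accession_num,pat_age_at_exam
1001,A1,63
1001,A2,64
1002,A3,58
"""


def exams_per_patient(rows):
    """Count exam rows per anonymized patient ID."""
    return Counter(row["empi_anon"] for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXAM_CSV)))
counts = exams_per_patient(rows)

# Patients who appear in more than one exam row.
repeat_patients = sorted(p for p, n in counts.items() if n > 1)
print(len(rows), "exams for", len(counts), "patients; repeats:", repeat_patients)
```

When splitting data for model development, grouping by `empi_anon` in this way helps keep all exams of a patient in the same split.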
Lesion contains all lesion-level data collected during an exam. In this file, each row represents a single finding, so there can be several rows per exam. PI-RADS scores are assigned on a per-lesion basis. These data are entered into a report by the radiologist at the time of interpretation, and extracted in a structured format by the HITI lab.
| Feature name | Type | Description |
|---|---|---|
| pirads | integer | PI-RADS score |
| lesion_contents | string | Lesion description |
| extracted_lesion_size | float | Lesion size (cm) |
| extracted_lesion_volume | float | Lesion volume (cc) |
| extracted_location_base | boolean | Base level presence |
| extracted_location_midgland | boolean | Midgland level presence |
| extracted_location_apex | boolean | Apex level presence |
| extracted_later | string | Laterality |
| extracted_PZ | boolean | Lesion zone (peripheral) |
| extracted_TZ | boolean | Lesion zone (transition) |
| extracted_T2 | string | T2-specific qualitative features |
| extracted_DWI | string | DWI-specific qualitative features |
| extracted_ADC | string | ADC-specific qualitative features |
| extracted_DCE | string | DCE-specific qualitative features |
| max_extracted_lesion_size | float | Largest lesion dimension (cm) |
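Since the Lesion table holds one row per finding, exam-level summaries require an aggregation step. The sketch below computes the highest PI-RADS score per exam and flags exams with at least one PI-RADS 3-5 lesion. Note the assumption: the table above does not list an exam identifier, so the `accession_num` key used here is hypothetical, as are the sample rows and the function name.

```python
def max_pirads_per_exam(lesion_rows, exam_key="accession_num"):
    """Return {exam_id: highest PI-RADS score among that exam's lesions}.

    `exam_key` is an assumed linking column; rows without a PI-RADS
    score are skipped.
    """
    best = {}
    for row in lesion_rows:
        score = row.get("pirads")
        if score is None:
            continue
        exam_id = row[exam_key]
        best[exam_id] = max(best.get(exam_id, 0), score)
    return best


# Invented lesion rows for illustration.
lesions = [
    {"accession_num": "A1", "pirads": 2},
    {"accession_num": "A1", "pirads": 4},
    {"accession_num": "A2", "pirads": 2},
    {"accession_num": "A3", "pirads": None},
]

highest = max_pirads_per_exam(lesions)

# PI-RADS 3-5 is the range described as often prompting biopsy.
suspicious_exams = sorted(e for e, s in highest.items() if s >= 3)
print(highest, suspicious_exams)
```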
Pathology contains pathology and biopsy results.
| Feature name | Type | Description |
|---|---|---|
| specimen_collect | date | Date specimen collected |
| path_res | string | Pathology result (core biopsy) |
| gleason_res | string | Gleason score from pathology |
| gleason_res_surgery | string | Gleason from surgical pathology |
| gleason_res_group | integer | ISUP grade group |
| benign_path_res | boolean | Benign status indicator |
| diagnosis_after | string | Diagnosis after pathology integration |
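To illustrate how a Gleason score relates to an ISUP grade group, the sketch below applies the standard mapping (3+3 is group 1, 3+4 is group 2, 4+3 is group 3, a sum of 8 is group 4, and 9-10 is group 5). It assumes the Gleason string contains a "primary+secondary" pattern such as `3+4`; the actual format of `gleason_res` is not documented here, so the parser and function names are hypothetical, and this is not a description of how `gleason_res_group` was derived.

```python
import re

# Assumed text format: two patterns in the 3-5 range joined by "+".
_GLEASON_PATTERN = re.compile(r"([3-5])\s*\+\s*([3-5])")


def isup_grade_group(primary: int, secondary: int) -> int:
    """Map Gleason primary and secondary patterns (3-5) to ISUP grade group."""
    if not (3 <= primary <= 5 and 3 <= secondary <= 5):
        raise ValueError("Gleason patterns are expected to be between 3 and 5")
    total = primary + secondary
    if total == 6:
        return 1
    if total == 7:
        # 3+4 and 4+3 share a sum but fall in different grade groups.
        return 2 if primary == 3 else 3
    if total == 8:
        return 4
    return 5  # totals of 9 or 10


def grade_group_from_text(text: str):
    """Return the grade group for the first 'a+b' pattern found, else None."""
    match = _GLEASON_PATTERN.search(text)
    if match is None:
        return None
    return isup_grade_group(int(match.group(1)), int(match.group(2)))


for sample in ["3+3", "Gleason 3 + 4 = 7", "4+3", "4+5"]:
    print(sample, "->", grade_group_from_text(sample))
```

Returning `None` for unparseable text, rather than raising, makes it easier to apply the function across rows with benign or free-text results.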
Staging_spread contains information on the local tumor staging and invasion.
Treatment contains surgical and therapy information.
This table contains image-level information and is structured as one row per file. Information includes DICOM metadata, series description, and file path.
| Feature name | Description |
|---|---|
| empi_anon | Unique anonymized patient ID. All exams for a patient will have the same empi_anon |
| acc_anon | Unique ID per exam. All rows for an exam will have the same acc_anon (and the same empi_anon) |
| anon_dicom_path | Anonymized file path |
| study_date_anon | Anonymized date of acquisition of the exam |
| StudyDescription | Name of the procedure |
| SeriesDescription | Name of the series for an exam. This can vary depending on view position. |
| View Position | Type of view acquired (Axial, Sagittal or Coronal) |
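Because this table has one row per file, a common first step is to group file paths by exam and, optionally, keep only one view. The sketch below does this with the standard library on invented sample rows; the function name `files_by_exam`, the sample paths, and the CSV representation are assumptions, while the column names are taken from the table above.

```python
import csv
import io
from collections import defaultdict

# Invented rows; paths and IDs are placeholders, not real dataset values.
SAMPLE_METADATA_CSV = """empi_anon,acc_anon,anon_dicom_path,SeriesDescription,View Position
1001,A1,/1001/ST1/SE1/0001.dcm,T2 AX,Axial
1001,A1,/1001/ST1/SE2/0001.dcm,T2 SAG,Sagittal
1001,A1,/1001/ST1/SE1/0002.dcm,T2 AX,Axial
1002,A3,/1002/ST7/SE1/0001.dcm,T2 COR,Coronal
"""


def files_by_exam(rows, view=None):
    """Group file paths by acc_anon, optionally keeping a single view.

    `view` is compared case-insensitively against the 'View Position' column.
    """
    grouped = defaultdict(list)
    for row in rows:
        if view is not None and row["View Position"].strip().lower() != view.lower():
            continue
        grouped[row["acc_anon"]].append(row["anon_dicom_path"])
    return dict(grouped)


metadata_rows = list(csv.DictReader(io.StringIO(SAMPLE_METADATA_CSV)))
axial_files = files_by_exam(metadata_rows, view="axial")
all_files = files_by_exam(metadata_rows)
print(axial_files)
```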