# Title of Dataset: Covid-19 and Cancer Consortium (CCC19) Breast Cancer and Racial Disparities Outcomes Study

## Provenance for this README

- File name: README_Dataset-CCC19Breast_v0.1.0.txt
- Authors: Ali Raza Khaki
- Other contributors: Gayathri Nagaraj, Dimpy Shah
- Date created: 2022-10-14
- Date modified: 2023-02-15

## Dataset Version and Release History

- Current Version:
  - Number: 1.0.0
  - Date: 2023-02-15
  - Persistent identifier: DOI: https://doi.org/10.5281/zenodo.7293199
  - Summary of changes: n/a
- Embargo Provenance: n/a
  - Scope of embargo: n/a
  - Embargo period: n/a

## Dataset Attribution and Usage

- Dataset Title: Data for the article "Clinical Characteristics, Racial Inequities, and Outcomes in Patients with Breast Cancer and COVID-19: A COVID-19 and Cancer Consortium (CCC19) Study"
- Persistent Identifier: https://doi.org/10.5281/zenodo.7293199
- Dataset Contributors:
  - Creators: Gayathri Nagaraj, Ali Raza Khaki, Dimpy Shah
- Date of Issue: 2023-02-15
- Publisher: Covid-19 and Cancer Consortium (CCC19)

## Contact Information

- Name: Ali Raza Khaki
- Affiliations: Division of Oncology, Department of Medicine, Stanford University School of Medicine
- ORCID ID: https://orcid.org/0000-0002-4655-4426
- Email: alikhaki@stanford.edu
- Alternate Email: ali@khaki.org
- Address: e-mail preferred
- Alternative Contact: Co-Author
  - Name: Gayathri Nagaraj
  - Affiliations: Division of Medical Oncology and Hematology, Loma Linda University School of Medicine
  - Email: GNagaraj@llu.edu
- Alternative Contact: Co-Author
  - Name: Dimpy Shah
  - Affiliations: Population Health Sciences, Mays Cancer Center at UTHealth San Antonio MD Anderson
  - Email: ShahDP@uthscsa.edu

## Sharing/access Information

This dataset is not available online at any other source. The COVID-19 and Cancer Project is a collaborative project investigating COVID-19 outcomes among patients with cancer.
See more about the project at www.ccc19.org

------------------------------------------------------------------------

# Additional Dataset Metadata

## Dates and Locations

- Dates of data collection: Data collected between March 2020 and June 2021
- Geographic locations of data collection: Participating sites in CCC19 in the United States of America

------------------------------------------------------------------------

# Methodological Information

- Methods of data collection/generation: see manuscript for details

------------------------------------------------------------------------

# Data and File Overview

## Summary Metrics

- File count: 9
- Total file size: 148 KB
- File formats: .csv

# Description of the Data and file structure

## DATA

- 48 - breast cancer data.RData: data set used for the CCC19 Breast Project analysis
- 48 - Breast cancer - Approved project variables - 2-5-22.pdf: File describing variables in dataset

## CODE

- CCC19_48_PatientSelection.2.8.22: Cohort selection
- CCC19_48_DataMgt.2.8.2: Prepare data for descriptive analyses (Tables 1 and 2)
- CCC19_48_PrimaryAnalysis.Female.11.11.21: Descriptive analyses among female patients (Tables 1 and 2)
- CCC19_48_Tab1_addAll_11.11.21: Descriptive analyses among female patients, adding the total column (Tables 1 and 2)
- CCC19_48_PreImpute_Mgt.8.29.21: Prepare data for imputation
- 48_Imputation_8.29.21: Imputation (for Tables 3 and 3a)
- CCC19_48_OddsRatio 8.29.21: Regression analyses and E-value (Tables 3 and 3a)
- 48_funs.impute.ord.4.30.21: Functions to combine results from multiple imputation (for OddsRatio file)

------------------------------------------------------------------------

# File/Folder Details

## Details for: 48 - breast cancer data.RData

- Description: an RData file containing the data from the CCC19 database used in the study.
- Format(s): .RData
- Size(s): 26.7 KB
- Dimensions: 1383 rows x 54 columns
- Variables (see also "48 - Breast cancer - Approved project variables - 2-5-22.pdf"):
  - der_age_trunc: Age with imputation for categoricals (Years, continuous 18-89; those >89 set to 90)
  - der_AKI_comp: Acute kidney injury (0=No; 1=Yes; 99=Unknown)
  - der_any_cyto: Any cytotoxic cancer treatment in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_any_endo: Any endocrine therapy in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_any_immuno: Any immunotherapy in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_any_local: Any local therapy (surgery or RT) within 3 months of COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_any_other: Any other cancer therapy in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_any_targeted: Any targeted therapy in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_bleeding_comp: Bleeding complication with COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_breast_biomarkers: Breast cancer biomarkers (1=ER+; 2=ER+/HER2+; 3=HER2+; 4=triple negative; 99=Unknown)
  - der_cancer_status_v4: Derived variable indicating cancer status (0=Remission/NED, remote; 1=Remission/NED, recent; 2=Active, responding; 3=Active, stable; 4=Active, progressing; 99=Unknown)
  - der_cancer_tx_timing_v2: Timing of cancer treatment relative to COVID-19 (0=more than 3 months; 1=0-4 weeks; 2=1-3 months (*); 88=never or after COVID-19 diagnosis; 99=unknown)
  - der_cancertr_none: No cancer treatment in the 3 months prior to COVID-19 (0=No; 1=Yes; 99=Unknown)
  - der_card: Cardiovascular comorbidity (CAD, CHF, Afib, arrhythmia NOS, PVD, CVA, cardiac disease NOS) (0=No; 1=Yes; 99=Unknown)
  - der_cdk46i_3m: Any targeted therapy includes a CDK4/6 inhibitor therapy in the 3 months prior to COVID-19 (0=No; 1=Yes)
  - der_coinfection_any: Any co-infection within +/- 2 weeks of COVID-19 dx (0=No; 1=Yes; 99=Unknown)
  - der_CV_event_v2: Derived (any) cardiovascular complication variable (0=No; 1=Yes; 99=Unknown)
  - der_days_fu: Follow-up in days, with some estimation for intervals (days)
  - der_dead30: Derived variable indicating whether patient has died within 30 days of COVID-19 diagnosis (0=No; 1=Yes; 99=Unknown)
  - der_deadbinary: Derived dead/alive variable (0=No; 1=Yes; 99=Unknown)
  - der_dm2: Derived variable indicating whether patient has diabetes mellitus (0=No; 1=Yes; 99=Unknown)
  - der_ecogcat2: Performance status (0=ECOG 0; 1=ECOG 1; 2=ECOG 2+)
  - der_GI_event: Derived gastrointestinal complication variable (0=No; 1=Yes; 99=Unknown)
  - der_hcq: Hydroxychloroquine as COVID-19 treatment ever (0=No; 1=Yes; 99=Unknown)
  - der_heme: Hematologic malignancy indicator (0=No; 1=Yes)
  - der_her2_3m: Any targeted therapy includes an anti-HER2 therapy in the 3 months prior to COVID-19 (0=No; 1=Yes)
  - der_hosp: Derived hospitalized/not hospitalized variable (0=No; 1=Yes; 99=Unknown)
  - der_ICU: Derived variable indicating any time in ICU (0=No; 1=Yes; 99=Unknown)
  - der_insurance: Insurance type (Medicaid alone; Medicare alone; Medicare/Medicaid +/- other; Other government +/- other; Private +/- other; Uninsured; Unknown)
  - der_met_bone: Metastatic breast cancer to bone (0=No; 1=Yes; 99=Unknown)
  - der_met_liver: Metastatic breast cancer to liver (0=No; 1=Yes; 99=Unknown)
  - der_met_lung_v2: Metastatic breast cancer to lung (0=No; 1=Yes; 99=Unknown)
  - der_metastatic: Metastatic breast cancer status (0=No; 1=Yes; 99=Unknown)
  - der_MOF_comp: Multisystem organ failure (0=No; 1=Yes; 99=Unknown)
  - der_mv: Derived variable indicating whether patients required mechanical ventilation (0=No; 1=Yes; 99=Unknown)
  - der_o2_ever: Indicates whether patient has ever required supplemental oxygen (0=No; 1=Yes; 99=Unknown)
  - der_obesity: Binary obesity (BMI>=30 or comorbidity of obesity) (0=No; 1=Yes; 99=Unknown)
  - der_ordinal_v1a: CCC19 ordinal outcome with death at any time (0=not hospitalized; 1=hospitalized; 2=ICU; 3=mechanical ventilation; 4=death at any time)
  - der_other_3m: Any other targeted therapy (not anti-HER2 / CDK4/6 inhibitor) in the 3 months prior to COVID-19 (0=No; 1=Yes)
  - der_other_tx_c19_v2: COVID-19 treatments other than HCQ, steroids or remdesivir (0=No; 1=Yes; 99=Unknown)
  - der_pulm: Derived variable indicating whether patient has pulmonary comorbidities (0=No; 1=Yes; 99=Unknown)
  - der_pulm_event: Derived pulmonary complication variable (0=No; 1=Yes; 99=Unknown)
  - der_race_v2: Race/ethnicity (Hispanic; Non-Hispanic AAPI; Non-Hispanic Black; Non-Hispanic White; Other)
  - der_region_v2: Region of patient residence with ex-US collapsed (Non-US; Other; Undesignated US; US Midwest; US Northeast; US South; US West)
  - der_rem: Remdesivir as treatment for COVID-19 ever (0=No; 1=Yes; 99=Unknown)
  - der_renal: Renal comorbidities (0=No; 1=Yes; 99=Unknown)
  - der_sepsis_comp: Sepsis complication (0=No; 1=Yes; 99=Unknown)
  - der_site_type: Type of healthcare center providing the patient's data (AMC=academic medical center; CP=community practice; TCC=tertiary care center)
  - der_smoking2: Derived smoking status (Never; Current or Former; Unknown)
  - der_steroids_c19: Steroids as COVID-19 treatment ever (0=No; 1=Yes; 99=Unknown)
  - der_tr_intent: Derived treatment intent (Unknown Treatment; Not on Treatment; Palliative; Curative; Missing)
  - der_txline: Most recent line of cancer treatment, including systemic and non-systemic therapies (Untreated in last 12 months; Curative NOS; First line; Non-curative NOS; Other; Second line or greater; Unknown)
  - severity_of_covid_19_v2: Initial severity and course of COVID-19 (1=Mild (no hospitalization required); 2=Moderate (hospitalization indicated); 3=Severe (ICU admission indicated); 99=Unknown)
  - urban_rural: What type of area does the patient primarily reside in (1=Urban (city); 2=Suburban (town, suburbs); 3=Rural (country); 88=Other; 99=Unknown)

------------------------------------------------------------------------

END OF README