This readme.txt file was generated on 2020-07-03 by Marian-Gabriel Hâncean.


GENERAL INFORMATION

1. Title of Dataset: "Data from: Early spread of COVID-19 in Romania"

2. Author Information

   A. Principal Investigator Contact Information
      Name: Hâncean, Marian-Gabriel
      Institution: Department of Sociology, University of Bucharest
      Address: 90-92 Panduri St., 050663, Bucharest, Romania
      Email: gabriel.hancean@sas.unibuc.ro

   B. Associate or Co-investigator Contact Information
      Name: Perc, Matjaž
      Institution: Faculty of Natural Sciences and Mathematics, University of Maribor
      Address: Koroška cesta 160, 2000 Maribor, Slovenia
      Email: matjaz.perc@gmail.com

   C. Alternate Contact Information
      Name: Lerner, Jürgen
      Institution: Department of Computer and Information Science, University of Konstanz
      Address: Universitaetsstrasse 10, 78464, Konstanz, Germany
      Email: juergen.lerner@gmail.com

3. Date of data collection (approximate date): between 2020-03-01 and 2020-05-02

4. Geographic location of data collection: Bucharest, Romania


SHARING/ACCESS INFORMATION

Recommended citation for this dataset:
Hâncean M-G, Perc M, Lerner J. Data from: Early spread of COVID-19 in Romania [Internet]. Dryad Digital Repository; 2020.


DATA & FILE OVERVIEW

1. File List:

   A) "COVID 19 ROMANIA variable description.xlsx" provides a detailed description of all the analyzed variables.
      Sheet included:
      - "COVID_19_Romania_variable_descr" sheet

   B) "COVID 19 ROMANIA anonymized data.xlsx" includes the raw data used in the analysis.
      Sheets included:
      - "147_covid19_cases_database" sheet contains information on the first 147 confirmed COVID-19 cases in Romania.
      - "147_covid19_cases_metadata" sheet presents the variables used to describe the first 147 COVID-19 cases and the data anonymization methods.
      - "147_covid19_cases_data_sources" sheet presents links to public and open data sources.
      - "covid19_networks_attr_database" sheet contains attribute data corresponding to the human-to-human transmission networks.
      - "covid19_networks_attr_metadata" sheet presents the variables used to describe the human-to-human transmission networks.
      - "covid19_networks_ties_dataset" sheet contains network data about the human-to-human transmission networks.
      - "covid19_networks_ties_metadata" sheet includes a methodological note on how the human-to-human transmission networks were built.

2. Relationship between files, if important: Readers are encouraged to first consult the list of variables (file A: "COVID 19 ROMANIA variable description.xlsx") and afterwards the dataset (file B: "COVID 19 ROMANIA anonymized data.xlsx"). Indirect identifiers were masked by numerical codes.

3. Additional related data collected that was not included in the current data package: None

4. Are there multiple versions of the dataset? No


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:

The dataset was collected from the Romanian Ministry of Health communiqués. Data from official statements were supplemented with information reported by Romanian local media. This strategy was deemed to improve the quality and accuracy of the information communicated by the Romanian officials. The online public sources can be consulted for further details on the early outbreak of COVID-19 in Romania. The level of data granularity prevents any disclosure or tracking of the infected persons. The first confirmed COVID-19 case in the dataset dates from February 22, 2020, and the last one from April 2, 2020.

We employed the following case selection method. First, we selected the first patients (index cases) for each Romanian county. Afterwards, we selected all publicly available individual cases officially reported on the territory of Romania. We stopped the data collection procedure when the Romanian authorities restricted public access to information on COVID-19 infected patients.
Human-to-human transmission networks were built by scanning the available official data for infection chains (from February 22, 2020 through March 20, 2020). The process was driven by the condition that both the source and the target of a chain are officially confirmed COVID-19 cases.

2. Methods for processing the data: Descriptive statistics were computed on the attribute data. Network visualizations were built from the network data using the visone software package.

3. People involved with sample collection, processing, analysis and/or submission: Hâncean, Marian-Gabriel; Perc, Matjaž; Lerner, Jürgen


DATA-SPECIFIC INFORMATION FOR: "Data from: Early spread of COVID-19 in Romania"

1. Number of variables:

   (a) Information on the first 147 COVID-19 cases confirmed in Romania ("COVID 19 ROMANIA anonymized data.xlsx"): 13 variables (column vectors). Information on counties, sex, age, citizenship, probable countries and places of infection, index case management, arrival dates, detection dates and time windows was masked following the Dryad human subjects data standards and the recommendations for preparing raw clinical data for publication in Hrynaszkiewicz et al. 2010 (*).

   (b) Attributes of the patients embedded in the first COVID-19 transmission networks ("COVID 19 ROMANIA anonymized data.xlsx"): 12 variables (column vectors). Information on counties, sex, age, citizenship, probable countries and places of infection, detection dates and modes of contracting COVID-19 was masked following the Dryad human subjects data standards and the recommendations for preparing raw clinical data for publication in Hrynaszkiewicz et al. 2010 (*).

   (c) Human-to-human COVID-19 transmission networks ("COVID 19 ROMANIA anonymized data.xlsx"): 1 relational variable (square adjacency matrix).

   (*) Hrynaszkiewicz, I., Norton, M.L., Vickers, A.J., Altman, D.G. Preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.
   BMJ 2010; 340:c181 (doi: 10.1136/bmj.c181)

2. Number of cases/rows:

   (a) Information on the first 147 COVID-19 cases confirmed in Romania ("COVID 19 ROMANIA anonymized data.xlsx"): 147 cases.

   (b) Attributes of the patients embedded in the first COVID-19 transmission networks ("COVID 19 ROMANIA anonymized data.xlsx"): 159 cases.

   (c) Human-to-human COVID-19 transmission networks ("COVID 19 ROMANIA anonymized data.xlsx"): 159 by 159 cases.

3. Variable List: "COVID 19 ROMANIA anonymized data.xlsx"

(a) Information on the first 147 COVID-19 cases confirmed in Romania

No.  Variable label                                    Variable definition                                              Direct identifier  Indirect identifier  Masking method
1    case_number                                       case number                                                      NO                 NO                   No action required
2    case_code                                         case code used to distinguish index cases among the first cases  NO                 NO                   No action required
3    index_case                                        index case status (yes = index case)                             NO                 NO                   No action required
4    Romanian_county_anonymization_code                Romanian county                                                  NO                 YES                  Masked by assigning code numbers
5    sex_anonymisation_code                            Sex                                                              NO                 YES                  Masked by assigning code numbers
6    age                                               Age                                                              NO                 YES                  Masked by assigning age interval
7    citizenship_anonymization_code                    Citizenship                                                      NO                 YES                  Masked by assigning code numbers
8    probable_country_of_infection_anonymization_code  Probable country of infection                                    NO                 YES                  Masked by assigning code numbers
9    probable_place_of_infection_anonymization_code    Probable place of infection                                      NO                 YES                  Masked by assigning code numbers
10   index_case_management_annonymization_code         How authorities managed the index case                           NO                 Possible             Masked by assigning code numbers
11   arrival_date_to_county_anonymization              Date when the index case arrived in the county                   NO                 Possible             Masked by assigning code numbers
12   COVID_19_detection_date_anonymization             Date when the case was confirmed (detected) with COVID-19        NO                 Possible             Masked by assigning code numbers
13   time_window_anonymisation_code                    Number of days between arrival and detection                     NO                 Possible             Masked by assigning code numbers

(b) Attributes of the patients embedded in the first COVID-19 transmission networks

No.  Variable label                                    Variable definition                                              Direct identifier  Indirect identifier  Masking method
1    id                                                case number                                                      NO                 NO                   No action required
2    network_seed                                      network seed status (Yes = the node is a seed in the network)    NO                 NO                   No action required
3    Romanian_county_anonymisation_code                Romanian county                                                  NO                 YES                  Masked by assigning code numbers
4    sex _anonymisation_code                           Sex                                                              NO                 YES                  Masked by assigning code numbers
5    age_interval                                      Age                                                              NO                 YES                  Masked by assigning age interval
6    citizenship_anonymisation_code                    Citizenship                                                      NO                 YES                  Masked by assigning code numbers
7    probable_country_of_infection_anonymisation_code  Probable country of infection                                    NO                 Possible             Masked by assigning code numbers
8    probable_place_of_infection_anonymisation_code    Probable place of infection                                      NO                 Possible             Masked by assigning code numbers
9    detection_anonymisation_code                      Date when the case was confirmed (detected) with COVID-19        NO                 Possible             Masked by assigning code numbers
10   detection_1                                       Number of days from the first COVID-19 case until the last case  NO                 NO                   No action required
11   cluster_type_anonymisation_code                   Mode of contracting COVID-19 (detailed)                          NO                 Possible             Masked by assigning code numbers
12   cluster_class_anonymisation_code                  Mode of contracting COVID-19 (classification)                    NO                 Possible             Masked by assigning code numbers

(c) Human-to-human COVID-19 transmission networks

The COVID-19 transmission networks are stored as a square adjacency matrix (159 x 159). COVID-19 sources are listed on rows, while COVID-19 targets are listed on columns. The networks are directed, i.e. entry (i,j) is not necessarily equal to entry (j,i). For (i,j) = 1, node i infected node j. For (i,j) = 0, node i did not infect node j.

4. Specialized formats or other abbreviations used: The code book is available from Hâncean, Marian-Gabriel (gabriel.hancean@sas.unibuc.ro).
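
Note on reading the adjacency matrix: the row/column convention described under 3(c) (sources on rows, targets on columns, (i,j) = 1 meaning node i infected node j) can be illustrated with a minimal Python sketch. The 4 x 4 matrix below is a hypothetical toy example, not data from this package, and the function names (to_edge_list, out_degrees, in_degrees) are illustrative assumptions rather than part of any tool used by the authors. How the real 159 x 159 matrix is laid out in the "covid19_networks_ties_dataset" sheet (for example, whether it carries header rows or id columns) is not described in this file, so loading it is left out.

```python
# Toy 4 x 4 adjacency matrix (hypothetical values, NOT taken from the dataset).
# Rows are sources, columns are targets:
# toy_matrix[i][j] == 1 means node i infected node j.
toy_matrix = [
    [0, 1, 1, 0],  # node 0 infected nodes 1 and 2
    [0, 0, 0, 1],  # node 1 infected node 3
    [0, 0, 0, 0],  # node 2 infected nobody
    [0, 0, 0, 0],  # node 3 infected nobody
]


def to_edge_list(matrix):
    """Return (source, target) pairs for every entry equal to 1."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("adjacency matrix must be square")
    return [(i, j) for i in range(n) for j in range(n) if matrix[i][j] == 1]


def out_degrees(matrix):
    """Row sums: how many nodes each source infected."""
    return [sum(row) for row in matrix]


def in_degrees(matrix):
    """Column sums: how many sources are recorded for each target."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]


edges = to_edge_list(toy_matrix)
```

Because the networks are directed, the edge (0, 1) appears in the toy edge list while (1, 0) does not. Row sums and column sums therefore answer different questions: the first counts onward transmissions per source, the second counts recorded sources per target.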