This README.txt file was generated on 2022-04-16 by Chai-Ann Ng


GENERAL INFORMATION

1. Title of Dataset: Dataset for A Massively Parallel Assay Accurately Discriminates Between Functionally Normal and Abnormal Variants in a Hotspot Domain of KCNH2

2. Date of data collection: 2019-2021

3. Recommended citation for this dataset:
   (a) Ng et al. A Massively Parallel Assay Accurately Discriminates Between Functionally Normal and Abnormal Variants in a Hotspot Domain of KCNH2. American Journal of Human Genetics (2022);
   (b) Ng et al. Dataset for: A massively parallel assay accurately discriminates between functionally normal and abnormal variants in a hotspot domain of KCNH2, Dryad, Dataset, https://doi.org/10.5061/dryad.zpc866t9x


DATA & FILE OVERVIEW

1. Description of dataset

These data were generated to investigate what proportion of all potential KCNH2 variants in hotspot domains cause loss of function. Here, we have used a massively parallel trafficking assay to characterize all single-nucleotide variants in exon 2 of KCNH2, a known hotspot for variants that cause long QT syndrome type 2 and an increased risk of sudden cardiac death. Forty-two percent of KCNH2 exon 2 variants caused at least a 50% reduction in protein trafficking, and 65% of these trafficking-defective variants exerted a dominant-negative effect when co-expressed with a WT KCNH2 allele, as assessed using a calibrated patch-clamp electrophysiology assay. The massively parallel trafficking assay was more accurate (AUC of 0.94) than bioinformatic prediction tools (REVEL and CardioBoost, AUC of 0.81) in discriminating between functionally normal and abnormal variants. Interestingly, over half of the variants in exon 2 were found to be functionally normal, suggesting that a nuanced interpretation of variants in this ‘hotspot’ domain is necessary. Our massively parallel trafficking assay can provide this information prospectively.
The patch-clamp dataset was collected using the Nanion SyncroPatch 384PE automated patch-clamp system by assessing the electrophysiological parameters of 458 KCNH2 SNVs expressed in the heterozygous state in HEK293 cells. Three voltage protocols were used to interrogate steady-state activation, steady-state deactivation (and recovery from inactivation), and the rates of onset of inactivation.

The parallel trafficking dataset was collected by sequencing flow-sorted cells that express a pool of different homozygous KCNH2 variants. The dataset contains the sequencing files and the associated barcode key for identifying the different KCNH2 variants.

2. File List:

SyncroPatch data files:
   (2.1) Patch_Clamp_dataset_1.zip: 20 electrophysiology datasets collected in 2019
   (2.2) Patch_Clamp_dataset_2.zip: 34 electrophysiology datasets collected between Feb-2020 and Jun-2020
   (2.3) Patch_Clamp_dataset_3.zip: 20 electrophysiology datasets collected between Feb-2020 and Jul-2020
   (2.4) Patch_Clamp_dataset_4.zip: 18 electrophysiology datasets collected between Aug-2020 and May-2021
   (2.5) Patch-Clamp variant location.xlsx: plate ID and column location of the variants investigated in the above electrophysiology datasets

Data-specific information for Patch_Clamp_dataset_1.zip to Patch_Clamp_dataset_4.zip:

After unzipping, each experiment folder (e.g. 03102019_AN) contains 7 folders and 1 zip file. These are the original folders and files generated by the Patch Control software during data acquisition on the SyncroPatch 384PE (Nanion Technologies).

   Option (A) If you have access to the software 'DataControl' (Nanion Technologies), point it to the dataset directory (e.g. Patch_Clamp_dataset_1) to inspect or analyse the data.

   Option (B) If you do not have access to the software 'DataControl', separate CSV files have been exported within each experiment (e.g. 03102019_AN) for each of the three protocols: (a) hERG_ssDeact_3s_AN; (b) hERG_ssAct_1s_AN; (c) hERG_Inact_Onset_AN.
   For example, the CSV files for hERG_ssDeact_3s_AN can be found at "Patch_Clamp_dataset_1/03102019_AN/hERG_ssDeact_3s_AN_13.04.46/hERG_ssDeact_3s_AN_13.04.46/csv files"; the other protocols follow similar paths. These CSV files can be analysed using your preferred electrophysiology software.

Data-specific information for Patch-Clamp variant location.xlsx:

Each SyncroPatch assay plate (e.g. 03102019_AN) contains 24 columns, which correspond to 12 different cell lines (WT, 10 KCNH2 variants and a negative control). Every 2 columns correspond to a specific cell line (e.g. columns 1-2 for WT, columns 3-4 for variant I31M, etc.). This Excel file contains the plate number, the plate ID and the names of the variants, listed by their respective column numbers.

Trafficking data files:
   (2.6) 20200817-4959-BK.zip
   (2.7) 20200828-5125-RU.zip
   (2.8) 5239-RU-Devyn's sorted cells.zip
   (2.9) 20201005-5220-RU.zip
   (2.10) 20201006-5282-RU.zip
   (2.11) 20210115-5636-RU.zip
   (2.12) 20210104-5572-RU-Tile1-redo.zip
   (2.13) 20200121-5764-RU.zip
   (2.14) 20210308-5636-RU.zip
   (2.15) 20210324-6039-RU.zip
   (2.16) 20210126-5763-RU.zip
   (2.17) 20210401-6018-RU.zip
   (2.18) 20210304-5860-RU.zip
   (2.19) 20210521-6326-RU.zip
   (2.20) 20210504-6226-RU.zip
   (2.21) 20210407-6114-RU.zip
   (2.22) 20210422-6210-RU.zip
   (2.23) 20210520-6276-RU.zip
   (2.24) 20211018-7045-RU-5_pLV-363.zip
   (2.25) 20211209-7262-RU_tile1 sorted cells.zip
   (2.26) 20211122-7253-RU-tile1sortedcells.zip
   (2.27) 20210923-6932-RU_Reverse_tile1.zip
   (2.28) 20211018-7045-RU_sorted_cells.zip
   (2.29) 20211129-7181-RU-tile1sortedcells.zip
   (2.30) 20220303-7705-RU_sorted cels pLV363.zip
   (2.31) 20220228-7496-RU_sorted cells tile1.zip
   (2.32) 20220309-7727-RU_sorted cell_pLV376 (1).zip
   (2.33) 20220309-7733-RU sorted cels pLV376.zip

An Excel sheet ('Tile 1 sequencing details.xlsx') describing all of the parallel trafficking experiments is available, together with the barcode key and the list of large FASTQ files from the sorted cells.
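Because the trafficking archives hold large sequencing files, it may be convenient to check what each zip contains before extracting it. The following is a minimal Python sketch, not part of the authors' pipeline. The function name list_fastq_members is hypothetical, and the sketch assumes that the FASTQ files inside the archives end in .fastq, .fq, .fastq.gz or .fq.gz, which this README does not state.

```python
import zipfile


def list_fastq_members(zip_path):
    """Return (member name, uncompressed size in bytes) for FASTQ-like entries.

    The archive is only read, never extracted.
    """
    # Assumption (not confirmed by the README): FASTQ files use one of
    # these common file-name endings.
    suffixes = (".fastq", ".fq", ".fastq.gz", ".fq.gz")
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if info.filename.lower().endswith(suffixes)
        ]
```

Calling list_fastq_members("20200817-4959-BK.zip") on a downloaded archive returns the matching entries, which can then be compared against the file list in 'Tile 1 sequencing details.xlsx'.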
In-house Python and R scripts to associate barcodes and variants across the tile for the parallel trafficking assay are available at https://github.com/kroncke-lab/KCNH2_DMS.
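For the patch-clamp data described above (Option (B) and the plate layout), the following Python sketch shows one possible way to locate the exported CSV folders and to convert a plate column into its cell-line slot. It is an illustration only and is not taken from the repository above. The function names find_csv_dirs and cell_line_slot are hypothetical, and the sketch assumes that every export folder is named "csv files" and sits inside a run folder whose name starts with the protocol name, as in the example path given in this README.

```python
from pathlib import Path

# Protocol names as listed under Option (B) of this README.
PROTOCOLS = ("hERG_ssDeact_3s_AN", "hERG_ssAct_1s_AN", "hERG_Inact_Onset_AN")


def find_csv_dirs(dataset_root):
    """Map each protocol name to the 'csv files' folders found under dataset_root."""
    found = {protocol: [] for protocol in PROTOCOLS}
    # Assumption: export folders are literally named "csv files".
    for csv_dir in sorted(Path(dataset_root).rglob("csv files")):
        if not csv_dir.is_dir():
            continue
        for protocol in PROTOCOLS:
            # Run folders carry a time suffix, e.g. hERG_ssDeact_3s_AN_13.04.46
            if csv_dir.parent.name.startswith(protocol):
                found[protocol].append(csv_dir)
    return found


def cell_line_slot(column):
    """Return the 1-based cell-line slot (1 to 12) for a plate column (1 to 24).

    Every 2 adjacent columns share one cell line: columns 1-2 are slot 1,
    columns 3-4 are slot 2, and so on.
    """
    if not 1 <= column <= 24:
        raise ValueError("column must be between 1 and 24")
    return (column + 1) // 2
```

The slot number only identifies the column pair. The variant expressed in that pair differs from plate to plate and has to be looked up in 'Patch-Clamp variant location.xlsx'.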