Dryad

Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells

Cite this dataset

Autio, Matias et al. (2023). Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells [Dataset]. Dryad. https://doi.org/10.5061/dryad.p8cz8w9ww

Abstract

Selection of the target site is an inherent question for any project aiming for directed transgene integration. Genomic safe harbour (GSH) loci have been proposed as safe sites in the human genome for transgene integration. Although several sites have been characterised for transgene integration in the literature, most of these do not meet criteria set out for a GSH, and the limited set that do have not been characterised extensively. Here, we conducted a computational analysis using publicly available data to identify 25 unique putative GSH loci that reside in active chromosomal compartments. We validated stable transgene expression and minimal disruption of the native transcriptome in three GSH sites in vitro using human embryonic stem cells (hESCs) and their differentiated progeny. Furthermore, for easily targeted transgene expression, we have engineered constitutive landing pad expression constructs into the three validated GSH in hESCs.

README: Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells - HCI data from differentiated cells

https://doi.org/10.5061/dryad.p8cz8w9ww

This dataset contains high content imaging (HCI) data of Pansio-1, Olônne-18, and Keppel-19 H9 clones differentiated into neuronal, hepatic, and cardiac cells. HCI was done using a PerkinElmer Opera Phenix confocal imager with a 20x objective. 61 fields were imaged for each well and the channels Alexa 488, Alexa 594, and HOECHST 33342 were recorded. For each cell type, the clones were seeded in parallel on three columns of a 96-well plate in the order Pansio-1, Olônne-18, and Keppel-19. Six rows of cells were seeded and stained in alternating rows, first with the appropriate antibody and then with the isotype control (three rows each in total).
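The plate layout described above can be sketched as a small lookup from well position to clone and staining condition. This is only an illustration: the README does not state which plate rows and columns were actually used, so the `first_row` and `first_column` parameters (and the function name `describe_well`) are hypothetical and must be set to match the plate in question.

```python
# Clone order across the three seeded columns, as stated in the README.
CLONES = ("Pansio-1", "Olônne-18", "Keppel-19")


def describe_well(row: int, column: int, first_row: int = 1, first_column: int = 1):
    """Return (clone, stain) for a well in the seeded 6-row by 3-column block.

    first_row / first_column are ASSUMPTIONS: the README does not say where
    on the 96-well plate the seeded block starts.
    """
    row_offset = row - first_row
    column_offset = column - first_column
    if not (0 <= row_offset < 6 and 0 <= column_offset < 3):
        raise ValueError(f"well ({row}, {column}) is outside the seeded block")
    # Rows alternate: antibody first, then isotype control (three rows each).
    stain = "antibody" if row_offset % 2 == 0 else "isotype control"
    return CLONES[column_offset], stain
```

For example, with the default offsets, `describe_well(2, 3)` refers to the second seeded row and third seeded column, i.e. Keppel-19 with the isotype control.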

Description of the data and file structure

Individual .tif files for each channel and each field are compressed in the .zip files. Files are named following the pattern 00X00X-X-00100100X. The first six digits give the position of the well as row then column, the digit after the first hyphen gives the field within the well, and the last digit identifies the recorded channel.
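As a rough sketch, the naming pattern can be decoded with a regular expression. A few assumptions are made here that the README does not confirm: the row and column are taken to be three digits each, the field number is allowed to span more than one digit (since 61 fields were imaged per well), and the `.tif` extension is treated as optional. The names `ImageName` and `parse_image_name` are hypothetical helpers, and the mapping from channel number to fluorophore is not given in the README, so it is left undecoded.

```python
import re
from typing import NamedTuple


class ImageName(NamedTuple):
    row: int
    column: int
    field: int
    channel: int


# Pattern 00X00X-X-00100100X: row (3 digits), column (3 digits), field,
# then eight fixed-position digits followed by the channel digit.
# ASSUMPTION: the field may have more than one digit (61 fields per well).
_NAME_PATTERN = re.compile(
    r"^(\d{3})(\d{3})-(\d+)-\d{8}(\d)(?:\.tiff?)?$", re.IGNORECASE
)


def parse_image_name(name: str) -> ImageName:
    """Split an image file name into row, column, field, and channel."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    row, column, field, channel = (int(group) for group in match.groups())
    return ImageName(row, column, field, channel)
```

For instance, `parse_image_name("002003-5-001001002.tif")` would be read as row 2, column 3, field 5, channel 2 under these assumptions.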

Each data set also contains .csv and .xml files that hold parameters for each image recorded by the Opera Phenix imager (e.g. Channel, ImageResolution, MainEmissionWavelength,

Code/Software

Image analysis was conducted in PerkinElmer Columbus software. Details of the analysis pipelines are given in the appendix of the related manuscript.

Methods

This dataset contains high content imaging data of Pansio-1, Olônne-18, and Keppel-19 H9 Clover hESC clones differentiated into neuronal, hepatic, and cardiac cell types and stained for the lineage markers TUJ1, HNF4alpha, and cardiac Troponin-T, respectively. The cells were imaged on CellCarrier Ultra 96-well plates using a PerkinElmer Opera Phenix confocal imager with a 20x objective. 61 fields were imaged for each well and the channels Alexa 488, Alexa 594, and HOECHST 33342 were recorded. Image analysis was conducted in PerkinElmer Columbus software to assess the overlap between Clover transgene expression and positive lineage marker staining.

Funding

Biomedical Research Council, Award: 1610851033

Agency for Science, Technology and Research, Award: 202D8020