Control over Conformational Landscapes of Polypeptoids by Monomer Sequence Patterning
Data files
Mar 06, 2024 version files (1.66 MB total)
- README.md (18.38 KB)
- zipped_files.zip (1.64 MB)
Abstract
This dataset accompanies the article "Control Over Conformational Landscapes of Polypeptoids by Monomer Sequence Patterning" by Audra J. DeStefano, Shawn D. Mengel, Morgan W. Bates, Sally Jiao, M. Scott Shell, Songi Han, and Rachel A. Segalman in Macromolecules in 2024. The article demonstrates how patterning of hydrophobic residues along a polymer backbone can tune the distribution of end-to-end distances and that these effects can be predicted by a simple bead-spring simulation. This dataset contains the necessary experimental data to reproduce the main text and supporting figures. High performance liquid chromatographs, mass spectra, double electron-electron resonance time domain signals, and simulated end-to-end distance distributions are included.
README
This README.txt file was generated on 20230601 by Audra DeStefano and Shawn Mengel
GENERAL INFORMATION
Title of Dataset: Control Over Conformational Landscapes of Polypeptoids by Monomer Sequence Patterning
Author Information
A. Principal Investigator Contact Information
Name: Rachel A. Segalman
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: segalman@ucsb.edu
B. Associate or Co-investigator Contact Information
Name: Audra J. DeStefano
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: adestefano@ucsb.edu
C. Associate or Co-investigator Contact Information
Name: Shawn D. Mengel
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: mengel@ucsb.edu
D. Associate or Co-investigator Contact Information
Name: Morgan W. Bates
Institution: University of California Santa Barbara
Address: California NanoSystems Institute, University of California, Santa Barbara, 93106
Email: morganbates@ucsb.edu
E. Associate or Co-investigator Contact Information
Name: Sally Jiao
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: sjiao@ucsb.edu
F. Associate or Co-investigator Contact Information
Name: M. Scott Shell
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: shell@ucsb.edu
G. Associate or Co-investigator Contact Information
Name: Songi Han
Institution: University of California Santa Barbara
Address: Department of Chemical Engineering, University of California, Santa Barbara, 93106
Email: songi@chem.ucsb.edu
Date of data collection (single date, range, approximate date): 2022-03-01 to 2023-05-25
Geographic location of data collection: Santa Barbara, CA
Information about funding sources that supported the collection of the data: The polymer synthesis and characterization were supported by the National Science Foundation under Grant No. 2203179 (SDM, AJD, RAS) leveraging facilities and expertise from the BioPACIFIC Materials Innovation Platform of the National Science Foundation under Award No. DMR-1933487 (MWB). Development of the DEER technique and computational model was supported by the Center for Materials for Water and Energy Systems (M-WET), an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award #DE-SC0019272. SDM and SJ acknowledge support from the National Science Foundation Graduate Research Fellowship (DGE 2139319, DGE 1650114). AJD acknowledges support from the Department of Defense through the National Defense Science & Engineering Graduate (NDSEG) Fellowship Program.
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: N/A
Links to publications that cite or use the data: https://doi.org/XXXXXXXX
Links to other publicly accessible locations of the data: N/A
Links/relationships to ancillary data sets: N/A
Was data derived from another source? no
Recommended citation for this dataset: DeStefano, Audra et al. (2023), Control Over Conformational Landscapes of Short Polypeptoids by Monomer Sequence Patterning, Dryad, Dataset,
DATA & FILE OVERVIEW
File List: Files are organized into folders based on the figure/table number from the corresponding manuscript. Files are named as "Figure#_Technique_SampleID.csv".
A. Figure2: Model time domain data and end-to-end distance distribution to demonstrate DEER data processing.
a. Figure2_DEER_TimeDomain.csv
1. Variables: time (units=microseconds), background corrected data (units=arbitrary), calculated fit (units=arbitrary), offset (units=arbitrary), raw data (units=arbitrary)
2. Column names: time, Data, Fit, Offset, Raw
b. Figure2_DEER_PofR.csv
1. Variables: end-to-end distance (units=angstrom), end-to-end distance distribution (units=none), lower standard deviation of distribution (units=none), upper standard deviation of distribution (units=none)
2. Column names: Ree_nm, P(Ree), Lower STDEV, Upper STDEV
B. Figure4: End-to-end distance distributions and average end-to-end distances for polypeptoids with increasing hydrophobic content. Files a-g share the same file structure.
a. Figure4_DEER_PofR_0H.csv
1. Variables: end-to-end distance (units=angstrom), end-to-end distance distribution (units=none), lower standard deviation of distribution (units=none), upper standard deviation of distribution (units=none)
2. Column names: Ree_nm, P(Ree), Lower STDEV, Upper STDEV
b. Figure4_DEER_PofR_1H.csv
c. Figure4_DEER_PofR_2H.csv
d. Figure4_DEER_PofR_3H.csv
e. Figure4_DEER_PofR_4H.csv
f. Figure4_DEER_PofR_0H_38.csv
g. Figure4_DEER_PofR_Ends_38.csv
h. Figure4_DEER_Ree.csv
1. Variables: sequence (units=none), hydrophobic fraction at chain ends (units=dimensionless), root-mean-squared end-to-end distance (units=angstrom)
2. Column names: sequence, hydrophobic_fraction, Ree
C. Figure5: End-to-end distance distributions and average end-to-end distances for polypeptoids with varying hydrophobe patterning. Files a-g share the same file structure.
a. Figure5_DEER_PofR_Ends_38.csv
1. Variables: end-to-end distance (units=angstrom), end-to-end distance distribution (units=none), lower standard deviation of distribution (units=none), upper standard deviation of distribution (units=none)
2. Column names: Ree_nm, P(Ree), Lower STDEV, Upper STDEV
b. Figure5_DEER_PofR_Mid_38.csv
c. Figure5_DEER_PofR_2B_38.csv
d. Figure5_DEER_PofR_3B_38.csv
e. Figure5_DEER_PofR_6B_38.csv
f. Figure5_DEER_PofR_12B_38.csv
g. Figure5_DEER_PofR_1p5_38.csv
h. Figure5_DEER_Ree.csv
1. Variables: sequence (units=none), sequence chain length (units=residues), root-mean-squared average end-to-end distance (units=angstrom)
2. Column names: sequence, length, Ree
D. Figure6: Comparison between root-mean-squared end-to-end distance distributions from DEER and bead-spring simulations.
a. Figure6_Simulation_Experiment_Ree.csv
1. Variables: Sequence (units=none), simulated average end-to-end distance (units=sigma), experimental average end-to-end distance (units=angstrom)
2. Column names: sequence, sim_Ree, exp_Ree
E. Figure7: Statistical moments of polypeptoid 38mers.
a. Figure7_38mer_moments.csv
1. Variables: sequence (units=none), root-mean-squared average end-to-end distance (units=angstrom), variance divided by mean squared (units=dimensionless), skewness (units=dimensionless)
2. Column names: sequence, Ree, normalized_variance, skewness
F. FigureS1-19: HPLC-MS chromatograms and mass spectrometry files for peaks of interest. Files a-s have the same structure outlined for FigureS1_HPLC_0H.txt. Files t-al have the same structure outlined for FigureS1_MS_0H.txt.
a. FigureS1_HPLC_0H.txt
1. Variables: retention time (units=minutes), intensity at 214nm (units=arbitrary)
2. Column names: R.Time, Intensity
b. FigureS2_HPLC_1H.txt
c. FigureS3_HPLC_2H.txt
d. FigureS4_HPLC_3H.txt
e. FigureS5_HPLC_4H.txt
f. FigureS6_HPLC_Mid.txt
g. FigureS7_HPLC_AB.txt
h. FigureS8_HPLC_2B.txt
i. FigureS9_HPLC_3B.txt
j. FigureS10_HPLC_6B.txt
k. FigureS11_HPLC_1p5.txt
l. FigureS12_HPLC_0H_38.txt
m. FigureS13_HPLC_Mid_38.txt
n. FigureS14_HPLC_Ends_38.txt
o. FigureS15_HPLC_2B_38.txt
p. FigureS16_HPLC_3B_38.txt
q. FigureS17_HPLC_6B_38.txt
r. FigureS18_HPLC_12B_38.txt
s. FigureS19_HPLC_1p5_38.txt
t. FigureS1_MS_0H.txt
1. Variables: mass to charge ratio (units=atomic mass unit per ion charge), intensity (units=arbitrary)
2. Column names: mz, intensity
u. FigureS2_MS_1H.txt
v. FigureS3_MS_2H.txt
w. FigureS4_MS_3H.txt
x. FigureS5_MS_4H.txt
y. FigureS6_MS_Mid.txt
z. FigureS7_MS_AB.txt
aa. FigureS8_MS_2B.txt
ab. FigureS9_MS_3B.txt
ac. FigureS10_MS_6B.txt
ad. FigureS11_MS_1p5.txt
ae. FigureS12_MS_0H_38.txt
af. FigureS13_MS_Mid_38.txt
ag. FigureS14_MS_Ends_38.txt
ah. FigureS15_MS_2B_38.txt
ai. FigureS16_MS_3B_38.txt
aj. FigureS17_MS_6B_38.txt
ak. FigureS18_MS_12B_38.txt
al. FigureS19_MS_1p5_38.txt
G. FigureS20-38: Time domain DEER. Raw, background corrected, and calculated traces are included. All files have the same structure outlined for FigureS20_DEER_TimeDomain_0H.csv.
a. FigureS20_DEER_TimeDomain_0H.csv
1. Variables: time (units=microseconds), background corrected data (units=arbitrary), calculated fit (units=arbitrary), offset (units=arbitrary), raw data (units=arbitrary)
2. Column names: time, Data, Fit, Offset, Raw
b. FigureS21_DEER_TimeDomain_1H.csv
c. FigureS22_DEER_TimeDomain_2H.csv
d. FigureS23_DEER_TimeDomain_3H.csv
e. FigureS24_DEER_TimeDomain_4H.csv
f. FigureS25_DEER_TimeDomain_Mid.csv
g. FigureS26_DEER_TimeDomain_AB.csv
h. FigureS27_DEER_TimeDomain_2B.csv
i. FigureS28_DEER_TimeDomain_3B.csv
j. FigureS29_DEER_TimeDomain_6B.csv
k. FigureS30_DEER_TimeDomain_1p5.csv
l. FigureS31_DEER_TimeDomain_0H_38.csv
m. FigureS32_DEER_TimeDomain_Mid_38.csv
n. FigureS33_DEER_TimeDomain_Ends_38.csv
o. FigureS34_DEER_TimeDomain_2B_38.csv
p. FigureS35_DEER_TimeDomain_3B_38.csv
q. FigureS36_DEER_TimeDomain_6B_38.csv
r. FigureS37_DEER_TimeDomain_12B_38.csv
s. FigureS38_DEER_TimeDomain_1p5_38.csv
H. FigureS39: Bead-spring molecular dynamics simulation of end-to-end distance distributions at T = 0.75 epsilon/k_B. All files have the same structure outlined for FigureS39_simulation_Mid_20mer.txt.
a. FigureS39_simulation_Mid_20mer.txt
1. Variables: end-to-end distance (units=sigma), probability distribution of the end-to-end distance (units=1/sigma), 2.5% confidence boundary for P(Ree) (units=1/sigma), 97.5% confidence boundary for P(Ree) (units=1/sigma)
2. Column names: Ree, y, error_low, error_high
b. FigureS39_simulation_0H_38mer.txt
c. FigureS39_simulation_1p5_38mer.txt
d. FigureS39_simulation_2B_38mer.txt
e. FigureS39_simulation_3B_38mer.txt
f. FigureS39_simulation_6B_38mer.txt
g. FigureS39_simulation_12B_38mer.txt
h. FigureS39_simulation_Ends_38mer.txt
i. FigureS39_simulation_Mid_38mer.txt
j. FigureS39_simulation_0H_20mer.txt
k. FigureS39_simulation_1H_20mer.txt
l. FigureS39_simulation_1p5_20mer.txt
m. FigureS39_simulation_2B_20mer.txt
n. FigureS39_simulation_2H_20mer.txt
o. FigureS39_simulation_3B_20mer.txt
p. FigureS39_simulation_3H_20mer.txt
q. FigureS39_simulation_4H_20mer.txt
r. FigureS39_simulation_6B_20mer.txt
s. FigureS39_simulation_AB_20mer.txt
I. FigureS40: Comparison of simulated and experimental average end-to-end distances at multiple simulation temperatures.
a. FigureS40_Simulation_Experiment_Ree.csv
1. Variables: Sequence (units=none), Length of spin-labeled sequence (units=residues), Temperature (units=epsilon/k_B), simulated average end-to-end distance (units=sigma), experimental average end-to-end distance (units=angstrom)
2. Column names: sequence, length, temperature, sim_Ree, exp_Ree
J. FigureS41: Model distributions to demonstrate high/low variance and skewness.
a. FigureS41_IdealDistributions_Variance.csv
1. Variables: x-axis (units=none), Normal probability distribution function with low variance (units=none), Normal probability distribution function with high variance (units=none)
2. Column names: x, PDF_low_variance, PDF_high_variance
b. FigureS41_IdealDistributions_Skewness.csv
1. Variables: x-axis (units=none), Lognormal probability distribution function with low positive skewness (units=none), Lognormal probability distribution function with high positive skewness (units=none)
2. Column names: x, PDF_low_skew, PDF_high_skew
K. FigureS42: Statistical moments of simulated end-to-end distance distributions.
a. FigureS42_simulation_moments.csv
1. Variables: sequence (units=none), root-mean-squared average end-to-end distance (units=sigma), variance divided by mean squared (units=none), skewness (units=none)
2. Column names: sequence, average_Ree, normalized_variance, skewness
L. FigureS43: Average end-to-end distances for polypeptoids with varying hydrophobe patterning.
a. FigureS43_DEER_Ree.csv
1. Variables: sequence (units=none), sequence chain length (units=residues), root-mean-squared average end-to-end distance (units=angstrom)
2. Column names: sequence, length, Ree
M. FigureS44: CW-EPR spectra for spin label and polypeptoid.
a. FigureS44_EPR_Spinlabel
1. Variables: index (units=none), magnetic field (units=Gauss), signal intensity (units=a.u.)
2. Column names: index, magnetic field, signal intensity
b. FigureS44_EPR_3H
1. Variables: index (units=none), magnetic field (units=Gauss), signal intensity (units=a.u.)
2. Column names: index, magnetic field, signal intensity
N. TableS2: Sample masses obtained by HPLC-MS.
a. TableS2_HPLC-MS.csv
1. Variables: Label (units=none), chain length (units=number of monomers), sequence (units=none), theoretical molecular weight (grams/mole), experimental mass divided by charge number of ion (units=none), ion type (units=none)
2. Column names: Label, Chain Length, Sequence, Theoretical MW (g/mol), Exp. m/z, Ion type
O. TableS3: Correlation and intercept of the comparison between bead-spring model and DEER measurements.
a. TableS3_correlation_Ree.txt
1. Variables: Simulation temperature (units=epsilon/k_B), Pearson correlation coefficient (units=none), x-intercept (units=angstrom)
2. Column names: temperature, correlation, x_intercept
Relationship between files, if important: N/A
Additional related data collected that was not included in the current data package: N/A
Are there multiple versions of the dataset? no
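As an illustration of how the P(Ree) files relate to the moment tables (Figure7_38mer_moments.csv, FigureS42_simulation_moments.csv), the Python sketch below parses a distribution using the column names listed above and computes a root-mean-squared distance, a variance divided by the squared mean, and a skewness. The function names, the assumption of an evenly spaced distance grid, and the exact moment definitions are illustrative assumptions; the published values may have been computed differently. The inline sample table is made up and is not taken from the dataset.

```python
import csv
import io
import math


def load_distribution(text, x_col="Ree_nm", p_col="P(Ree)"):
    """Parse CSV text into parallel lists of distances and probabilities."""
    rows = list(csv.DictReader(io.StringIO(text)))
    r = [float(row[x_col]) for row in rows]
    p = [float(row[p_col]) for row in rows]
    return r, p


def moments(r, p):
    """Return (rms distance, variance / mean^2, skewness).

    Assumes an evenly spaced distance grid, so plain sums stand in
    for integrals over the distribution.
    """
    norm = sum(p)
    w = [pi / norm for pi in p]  # normalized weights
    mean = sum(wi * ri for wi, ri in zip(w, r))
    mean_sq = sum(wi * ri ** 2 for wi, ri in zip(w, r))
    var = mean_sq - mean ** 2
    third = sum(wi * (ri - mean) ** 3 for wi, ri in zip(w, r))
    return math.sqrt(mean_sq), var / mean ** 2, third / var ** 1.5


# Made-up symmetric distribution, for illustration only.
sample = (
    "Ree_nm,P(Ree),Lower STDEV,Upper STDEV\n"
    "1,0,0,0\n"
    "2,1,0,0\n"
    "3,2,0,0\n"
    "4,1,0,0\n"
    "5,0,0,0\n"
)
r, p = load_distribution(sample)
rms, normalized_variance, skewness = moments(r, p)
print(rms, normalized_variance, skewness)
```

To apply this to a file from the archive, read its contents as text and pass the result to load_distribution, adjusting x_col and p_col if the headers differ from those documented here.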
Experimental context of each technique:
- High-performance liquid chromatography mass spectrometry (HPLC-MS) is used to quantify the purity and molecular weights of the synthesized polypeptoids.
- Double electron-electron resonance (DEER) spectroscopy is an advanced electron paramagnetic resonance technique able to measure ensemble distributions between two nitroxide spin probes. We use this technique to characterize the end-to-end distance ensembles of polypeptoids with patterned hydrophobic sequences.
- Bead-spring model simulations are used to develop computational models that accurately predict the sequence effects measured experimentally.
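TableS3_correlation_Ree.txt reports a Pearson correlation coefficient and an x-intercept for the comparison between simulation and experiment. A minimal sketch of how such quantities could be computed from paired exp_Ree and sim_Ree values follows. It assumes an ordinary least-squares line with the experimental distance on the x-axis, which is consistent with the x-intercept being reported in angstrom, but the README does not state the fitting procedure. The function name and the placeholder numbers are hypothetical and are not dataset values.

```python
import math


def pearson_and_x_intercept(x, y):
    """Return (Pearson r, x-intercept of the least-squares line y = m*x + b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, -intercept / slope


# Placeholder values only (x: experiment in angstrom, y: simulation in sigma).
exp_ree = [10.0, 20.0, 30.0]
sim_ree = [1.0, 3.0, 5.0]
corr, x0 = pearson_and_x_intercept(exp_ree, sim_ree)
print(corr, x0)
```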
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
The mass and purity of the final polypeptoid products were confirmed by analytical HPLC mass spectrometry. Analytical HPLC-MS was performed using a C18 column (XBridge BEH C18, 300 angstroms, 5 micrometers, 4.6 millimeters by 150 millimeters) with a 5-95% solvent A gradient over 30 minutes, where solvent A was water with 0.1% v/v trifluoroacetic acid and solvent B was acetonitrile with 0.1% v/v trifluoroacetic acid.
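The HPLC files (columns R.Time, Intensity) can be used to estimate purity as the fraction of the integrated 214 nm signal that falls within the product peak. The sketch below is one simple way to do this with trapezoidal integration. It is an assumption rather than the authors' documented procedure: the peak window is chosen by hand, no baseline correction is applied, and the function names and example trace are hypothetical.

```python
def trapezoid_area(times, values, t_min=float("-inf"), t_max=float("inf")):
    """Trapezoidal area over the segments lying fully inside [t_min, t_max]."""
    area = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 >= t_min and t1 <= t_max:
            area += 0.5 * (values[i] + values[i + 1]) * (t1 - t0)
    return area


def purity_fraction(times, values, peak_start, peak_end):
    """Area inside the chosen peak window divided by the total area."""
    return trapezoid_area(times, values, peak_start, peak_end) / trapezoid_area(times, values)


# Made-up chromatogram: a small impurity at 1 min and a product peak at 3 min.
retention_time = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
intensity = [0.0, 2.0, 0.0, 10.0, 0.0, 0.0]
purity = purity_fraction(retention_time, intensity, peak_start=2.0, peak_end=4.0)
print(purity)
```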
End-to-end distance distributions were measured for each polypeptoid sequence using double electron-electron resonance (DEER). Samples containing approximately 100 µM of polypeptoid were dissolved in 50/50 v/v D2O/d-THF to ensure full solubility across a range of hydrophobic contents. Solutions were cryo-protected with 30% deuterated glycerol, by volume. Then, approximately 40 µL of solution was loaded into a 3 mm OD, 2 mm ID quartz tube. Samples were flash frozen in liquid nitrogen immediately prior to performing DEER. A Bruker/ColdEdge FlexLine Cryostat (Model ER 4118HV-CF100) maintained the sample temperature at 60 K. All DEER spectra were obtained using a Bruker QT-II resonator with a pulsed Q-band Bruker E580 Elexsys spectrometer with an Applied Systems Engineering, Model 177 Ka 300 W TWT amplifier. The following four-pulse DEER sequence was applied to all samples: πobs/2-τ1-πobs-(t-πpump)-(τ2-t)-πobs-τ2-echo. Nutation experiments were used to determine optimal observer pulses (approximately 10 ns for 90° pulses and 20 ns for 180° pulses). The linear chirp πpump frequency width was set at 80 MHz and its duration at 100 ns, while πobs was 90 MHz above the center of the pump frequency range. τ1 was set at 126 ns and τ2 was set at 6 µs. All DEER experiments were signal averaged over at least 10 averages.
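For orientation, the dipolar evolution time bounds which distances a DEER trace can resolve. The sketch below uses the standard relation for two nitroxide spins, ν_dd ≈ 52.04 MHz nm³ / r³, to compare the dipolar oscillation period at a few distances with the τ2 of 6 µs stated above. The function names and the chosen distances are illustrative assumptions, and the comparison is a rough guide rather than part of the published analysis.

```python
# Dipolar coupling constant for two nitroxide spins (g close to 2.0023).
DIPOLAR_CONSTANT_MHZ_NM3 = 52.04


def dipolar_frequency_mhz(r_nm):
    """Perpendicular dipolar frequency in MHz for a spin-spin distance in nm."""
    return DIPOLAR_CONSTANT_MHZ_NM3 / r_nm ** 3


def dipolar_period_us(r_nm):
    """Period of the dipolar oscillation in microseconds."""
    return 1.0 / dipolar_frequency_mhz(r_nm)


tau2_us = 6.0  # from the methods text above
for r_nm in (3.0, 5.0, 7.0):
    period = dipolar_period_us(r_nm)
    # Number of full oscillation periods that would fit within tau2.
    print(r_nm, round(period, 3), round(tau2_us / period, 2))
```

Because the period grows with the cube of the distance, a 5 nm separation oscillates with a period of about 2.4 µs, while longer distances complete progressively fewer oscillations within the same trace.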
Continuous-wave electron paramagnetic resonance (EPR) spectra were measured at a concentration of 150 μM. A quartz round capillary tube of 0.60 mm inner diameter and 0.84 mm outer diameter was loaded with 3.5 μL of solution and sealed at one end with beeswax and at the other with Critoseal. The dispersive EPR spectrum was obtained with a fixed frequency (9.8 GHz) at 25 dB, while the magnetic field was swept with a modulation frequency of 140.0 kHz and a modulation amplitude of 0.70 G.
Methods for processing the data:
Time-domain DEER signals were background corrected using the LongDistances software (https://sites.google.com/site/altenbach/labview-programs/epr-programs/long-distances). Distance distributions were extracted from time-domain DEER signals using “model-free” Tikhonov regularization, as implemented by the LongDistances software.
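To make the idea of Tikhonov regularization concrete, the sketch below inverts a synthetic, noise-free, background-free DEER trace with a second-difference smoothness penalty. It is not the LongDistances implementation: the grids, the regularization parameter alpha, the midpoint integration of the powder-averaged kernel, and all function names are assumptions, and the non-negativity constraint normally imposed on P(r) is omitted to keep the example in the Python standard library.

```python
import math

DIPOLAR_CONSTANT = 52.04  # MHz nm^3 for two nitroxide spins


def kernel_element(t_us, r_nm, n_z=500):
    """Powder-averaged dipolar kernel K(t, r), midpoint rule over z = cos(theta)."""
    omega = 2.0 * math.pi * DIPOLAR_CONSTANT / r_nm ** 3  # rad per microsecond
    total = 0.0
    for k in range(n_z):
        z = (k + 0.5) / n_z
        total += math.cos((1.0 - 3.0 * z * z) * omega * t_us)
    return total / n_z


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def tikhonov(kernel, signal, alpha):
    """Minimize |K P - S|^2 + alpha^2 |L P|^2 (L = second difference), unconstrained."""
    n_t, n_r = len(kernel), len(kernel[0])
    a = [[sum(kernel[t][i] * kernel[t][j] for t in range(n_t)) for j in range(n_r)]
         for i in range(n_r)]
    b = [sum(kernel[t][i] * signal[t] for t in range(n_t)) for i in range(n_r)]
    for k in range(1, n_r - 1):  # add alpha^2 * L^T L one stencil row at a time
        idx, coef = (k - 1, k, k + 1), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                a[i][j] += alpha ** 2 * ci * cj
    return solve(a, b)


# Synthetic test case: Gaussian P(r) centered at 4.5 nm (not dataset values).
r_grid = [2.5 + 0.2 * i for i in range(21)]   # nm
t_grid = [0.075 * i for i in range(41)]       # microseconds
kernel = [[kernel_element(t, r) for r in r_grid] for t in t_grid]
true_p = [math.exp(-0.5 * ((r - 4.5) / 0.4) ** 2) for r in r_grid]
total_p = sum(true_p)
true_p = [p / total_p for p in true_p]
signal = [sum(k * p for k, p in zip(row, true_p)) for row in kernel]
recovered = tikhonov(kernel, signal, alpha=0.1)
peak_r = r_grid[max(range(len(recovered)), key=recovered.__getitem__)]
print(peak_r)
```

With real data, alpha is usually chosen by a criterion such as the L-curve, and a non-negative solver replaces the plain linear solve used here.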
Instrument- or software-specific information needed to interpret the data: N/A
Standards and calibration information, if appropriate: N/A
Environmental/experimental conditions: N/A
Describe any quality-assurance procedures performed on the data: N/A
People involved with sample collection, processing, analysis and/or submission: Audra DeStefano, Shawn Mengel, Sally Jiao
Methods
Please refer to the methods section of the published main text for details on dataset collection and processing.