Dryad

Evolution towards increasing complexity through functional diversification in a protocell model of the RNA world

Cite this dataset

Sengupta, Supratim; Roy, Suvam (2021). Evolution towards increasing complexity through functional diversification in a protocell model of the RNA world [Dataset]. Dryad. https://doi.org/10.5061/dryad.866t1g1qs

Abstract

The encapsulation of genetic material inside compartments together with the creation and sustenance of functionally diverse internal components are likely to have been key steps in the formation of 'live', replicating protocells in an RNA world. Several experiments have shown that RNA encapsulated inside lipid vesicles can lead to vesicular growth and division through physical processes alone. Replication of RNA inside such vesicles can produce a large number of RNA strands. Yet, the impact of such replication processes on the emergence of the first ribozymes inside such protocells and on the subsequent evolution of the protocell population remains an open question. In this paper, we present a model for the evolution of protocells with functionally diverse ribozymes. Distinct ribozymes can be created with small probabilities during the error-prone RNA replication process via the rolling circle mechanism. We identify the conditions that can synergistically enhance the number of different ribozymes inside a protocell and allow functionally diverse protocells containing multiple ribozymes to dominate the population. Our work demonstrates the existence of an effective pathway towards increasing complexity of protocells that might have eventually led to the origin of life in an RNA world.

Methods

The data were generated through computer simulations written in Python. The detailed algorithm is given in the electronic supplementary material (ESM) file associated with the paper, titled "Roy-Sengupta_figures_tables_ESM". All parameter values used to generate the figures in the main text and in the ESM are specified in the respective figure captions.

Usage notes

Python code and data files used to generate the results in the main text:

Main_code: Can be used to generate the data files for Fig-2, 3, 4, S6, S7, S8 and S9 by modifying the code appropriately. As provided, the code generates the data files for Fig-4.

data_Fig_2A : Despite its name, this is the data file for Fig.2C. Column 1 is time; columns 2, 6, 7 and 9 give the abundance of RC, R, C and null protocells, respectively.

data_Fig_2B : Data file for Fig.2B. Column 1 is time; columns 2, 6, 7 and 9 give the abundance of RC, R, C and null protocells, respectively.

data_Fig_2C : Despite its name, this is the data file for Fig.2A. Column 1 is time; columns 2, 6, 7 and 9 give the abundance of RC, R, C and null protocells, respectively.
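
The column layout above can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: it assumes that each data file is plain text with whitespace-separated numbers and no header line, which is not stated in these notes. The names PANEL_OF_FILE, FIG2_COLUMNS and read_fig2 are hypothetical and introduced only for illustration.

```python
from pathlib import Path

# File name -> figure panel, as stated in the usage notes
# (the 2A and 2C files correspond to the opposite panels).
PANEL_OF_FILE = {
    "data_Fig_2A": "Fig.2C",
    "data_Fig_2B": "Fig.2B",
    "data_Fig_2C": "Fig.2A",
}

# 1-based column numbers taken from the usage notes.
FIG2_COLUMNS = {"time": 1, "RC": 2, "R": 6, "C": 7, "null": 9}


def read_fig2(path):
    """Return {name: list of floats} for the Fig.2 columns of one data file.

    Assumption: rows are whitespace-separated numbers with no header line.
    """
    series = {name: [] for name in FIG2_COLUMNS}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, col in FIG2_COLUMNS.items():
            series[name].append(float(fields[col - 1]))  # 1-based -> 0-based
    return series
```

Calling read_fig2("data_Fig_2A") would then return the time series plotted in Fig.2C, provided the file matches the assumed layout.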

heatmap-data_Fig_2D : Data file for Fig.2D

data_Fig_3A : Data file for Fig.3A. Column 1 is time; columns 2 to 9 give the abundance of RC, CN, RN, RCN, R, C, N and null protocells, respectively.

data_Fig_3B : Data file for Fig.3B. Column 1 is time; columns 2 to 9 give the abundance of RC, CN, RN, RCN, R, C, N and null protocells, respectively.

data_Fig_3C : Data file for Fig.3C. Column 1 is time; columns 2 to 9 give the abundance of RC, CN, RN, RCN, R, C, N and null protocells, respectively.

data_Fig_3D : Data file for Fig.3D. Column 1 is time; columns 2 to 9 give the abundance of RC, CN, RN, RCN, R, C, N and null protocells, respectively.
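
Because the four Fig.3 files share one nine-column layout, the rows can be labelled by name and queried directly. This is an illustrative sketch only: it assumes whitespace-separated numeric rows without a header, and the helpers parse_fig3 and dominant_type are hypothetical names, not part of the deposited code.

```python
# Column order taken from the usage notes (column 1 first).
COLUMN_NAMES = ["time", "RC", "CN", "RN", "RCN", "R", "C", "N", "null"]


def parse_fig3(lines):
    """Turn an iterable of text lines into a list of {column name: value} rows.

    Assumption: each non-empty line holds nine whitespace-separated numbers.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            rows.append(dict(zip(COLUMN_NAMES, map(float, fields))))
    return rows


def dominant_type(row):
    """Name of the protocell type with the largest abundance in one row."""
    abundances = {name: value for name, value in row.items() if name != "time"}
    return max(abundances, key=abundances.get)
```

For example, opening data_Fig_3A, passing the file object to parse_fig3 and calling dominant_type on the last row would report which protocell type is most abundant at the final recorded time.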

data_Fig_4A : Data file for Fig.4A with k=30. Columns 1 and 4 give time and average volume, respectively.

data_Fig_4B : Data file for Fig.4B. Columns 1, 2 and 3 give time, abundance of RCNP protocells and abundance of null protocells, respectively.

data_Fig_4C_4A : Data file for Fig.4C and also for Fig.4A with k=20. Columns 1, 2, 3 and 4 give time, abundance of RCNP protocells, abundance of null protocells and average volume, respectively.

=========================================

Python code and data files used to generate the results in the supplementary material:

Python Codes

Code_for_data_Fig_S1.py : Code used to generate data for Fig.S1

Code_for_data_Fig_S2.py : Code used to generate data for Fig.S2

Code_Fig_S4.py : Code to generate Fig.S4

Code_Fig_S5.py : Code to generate Fig.S5

Data Files

data_Fig_S1.txt

Data file for Fig-S1. Columns 1 and 2 give temperature and probability of exponential growth, respectively.

data_Fig_S2A_t_27.txt

Data file for Fig-S2A for temp=27 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2A_t_37.txt

Data file for Fig-S2A for temp=37 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2A_t_45.txt

Data file for Fig-S2A for temp=45 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2A_t_50.txt

Data file for Fig-S2A for temp=50 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2A_t_55.txt

Data file for Fig-S2A for temp=55 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2B_t_27.txt

Data file for Fig-S2B for temp=27 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2B_t_37.txt

Data file for Fig-S2B for temp=37 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2B_t_45.txt

Data file for Fig-S2B for temp=45 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2B_t_50.txt

Data file for Fig-S2B for temp=50 C. Columns 1 and 2 give time and number of strands, respectively.

data_Fig_S2B_t_55.txt

Data file for Fig-S2B for temp=55 C. Columns 1 and 2 give time and number of strands, respectively.
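
The Fig-S2 files follow a regular naming pattern (panel A or B, then the temperature), so one panel can be loaded in a loop. The sketch below is an assumption-laden illustration rather than the authors' workflow: it assumes two whitespace-separated numeric columns per line and no header, and the function names s2_filename, read_two_columns and load_panel are hypothetical.

```python
import os

TEMPERATURES_C = [27, 37, 45, 50, 55]  # temperatures listed above, in degrees C


def s2_filename(panel, temperature):
    """Build a file name such as 'data_Fig_S2A_t_27.txt' for panel 'A' or 'B'."""
    return f"data_Fig_S2{panel}_t_{temperature}.txt"


def read_two_columns(path):
    """Return (times, strand_counts) from one Fig-S2 data file.

    Assumption: whitespace-separated numbers, two per line, no header.
    """
    times, strands = [], []
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:  # ignore blank or incomplete lines
                times.append(float(fields[0]))
                strands.append(float(fields[1]))
    return times, strands


def load_panel(panel, directory="."):
    """Map each temperature to its (times, strand_counts) pair for one panel."""
    return {
        t: read_two_columns(os.path.join(directory, s2_filename(panel, t)))
        for t in TEMPERATURES_C
    }
```

With the files in the working directory, load_panel("A") would return one time series per temperature for Fig-S2A, ready to be passed to a plotting library of choice.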

data_Fig_S6A.txt

Data file for Fig-S6A. Columns 1, 2 and 3 give time, abundance of RCN protocells and abundance of null protocells, respectively.

data_Fig_S6B.txt

Data file for Fig-S6B. Columns 1, 2 and 3 give time, abundance of RCN protocells and abundance of null protocells, respectively.

data_Fig_S6C.txt

Data file for Fig-S6C. Columns 1, 2 and 3 give time, abundance of RCN protocells and abundance of null protocells, respectively.

data_Fig_S7_S8_dividing_cells.txt

Data file for Fig-S7 and S8 for dividing cells. Columns 1 to 6 give the number of replicase, cyclase, nucleotide synthase, peptidyl transferase, all ribozymes and non-catalytic open-ended strands inside a dividing cell, respectively.

data_Fig_S7_S8_dying_cells.txt

Data file for Fig-S7 and S8 for dying cells. Columns 1 to 6 give the number of replicase, cyclase, nucleotide synthase, peptidyl transferase, all ribozymes and non-catalytic open-ended strands inside a dying cell, respectively.
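
One simple way to compare the two files is to average each of the six count columns. The sketch below is only an illustration under stated assumptions: it assumes one cell per line with six whitespace-separated numbers and no header, which these notes do not confirm, and column_means is a hypothetical helper rather than part of the deposited code.

```python
from statistics import mean

# Column order taken from the usage notes (column 1 first).
COUNT_COLUMNS = [
    "replicase",
    "cyclase",
    "nucleotide synthase",
    "peptidyl transferase",
    "all ribozymes",
    "non-catalytic open-ended strands",
]


def column_means(path):
    """Mean of each of the six count columns in one Fig-S7/S8 data file.

    Assumption: one cell per line, six whitespace-separated numbers, no header.
    """
    with open(path) as fh:
        rows = [[float(x) for x in line.split()] for line in fh if line.strip()]
    # zip(*rows) transposes the list of rows into a sequence of columns.
    return {name: mean(col) for name, col in zip(COUNT_COLUMNS, zip(*rows))}
```

Running column_means on data_Fig_S7_S8_dividing_cells.txt and on data_Fig_S7_S8_dying_cells.txt would give two dictionaries that can be compared entry by entry.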

data_Fig_S9A.txt

Data file for Fig-S9A. Columns 1, 2 and 3 give time, abundance of RCNP protocells and abundance of null protocells, respectively.

data_Fig_S9B.txt

Data file for Fig-S9B. Columns 1, 2 and 3 give time, abundance of RCNP protocells and abundance of null protocells, respectively.

rate0.txt

Template-directed primer extension rates after a match for different templating bases and incoming monomers

rate1.txt

Template-directed primer extension rates after a mismatch for different templating bases and incoming monomers