Functional characterization of luciferase in a brittle star indicates parallel evolution influenced by genomic availability of haloalkane dehalogenase
Data files
May 13, 2025 version files: 2.78 MB
Brittle_Star_Dryad_Data.zip (2.78 MB)
README.md (8.02 KB)
Abstract
Determining why convergent traits use distinct versus shared genetic components is crucial for understanding how evolutionary processes generate and sustain biodiversity. However, the factors dictating the genetic underpinnings of convergent traits remain incompletely understood. Here, we use heterologous protein expression, biochemical assays, and phylogenetic analyses to confirm the origin of a luciferase gene from haloalkane dehalogenases in the brittle star Amphiura filiformis. Through database searches and gene tree analyses, we also show a complex pattern of presence and absence of haloalkane dehalogenases across organismal genomes. These results first confirm parallel evolution across a vast phylogenetic distance, because octocorals like Renilla also use luciferase derived from haloalkane dehalogenases. This parallel evolution is surprising, even though previously hypothesized, because many organisms that also use coelenterazine as the bioluminescence substrate evolved completely distinct luciferases. The inability to detect haloalkane dehalogenases in the genomes of several bioluminescent groups suggests that the distribution of this gene family influences its recruitment as a luciferase. Together, our findings highlight how biochemical function and genomic availability help determine whether distinct or shared genetic components are used during the convergent evolution of traits like bioluminescence.
Directory structure and file list
Brittlestar_data_code.Rmd
Brittlestar_data_code.html
Rmarkdown_csv_files/
  dehalogenase_data/
    dehalogenase_activity_kinetics.csv
    dehalogenases_37308.csv
    temperature_37308.csv
  emission_data/
    313061_brittlestar_1_rawcal.csv
    313061_brittlestar_2_rawcal.csv
    313061_brittlestar_3_rawcal.csv
    renilla_1_rawcal.csv
    renilla_2_rawcal.csv
    renilla_3_rawcal.csv
  gene_expression_data/
    gene_expression_development.csv
  luciferase_data/
    further_luciferase_testing.csv
    Marika_crude_extracts_testing.csv
    Oakley_crude_extracts_testing.csv
Bioinformatic_analyses/
  Fig4_data/
    queries_blastp_seq_domains_remove_seq_aligned.fa
    queries_blastp_seq_domains_remove_seq_aligned.fa.log
    queries_blastp_seq_domains_remove_seq_aligned.fa.treefile
    queries_blastp_seq_domains_remove_seq.fa
Plasmid_sequences/
  Marek_pET21b_10707-1.gb
  Marek_pET21b_17859-1.gb
  Marek_pET21b_37282-1.gb
  Marek_pET21b_37332-1.gb
  Marek_pET21b_DhaA.gb
  oakley_pet21_37308-1.gb
  oakley_pet21_203026.gb
  oakley_pet21_224433.gb
  oakley_pet21_AfLuc.gb
  oakley_pet21_pyroluc.gb
  oakley_pet21_renilla_luc.gb
File descriptions
Brittle_Star_Dryad_Data/Brittlestar_data_code.Rmd: R Markdown file containing the code used for data analysis and figure generation. The file can be opened in R.
Brittle_Star_Dryad_Data/Brittlestar_data_code.html: HTML file containing the code and output of the R Markdown file.
Brittle_Star_Dryad_Data/Rmarkdown_csv_files/: Contains data files for the functional assays. All files can be opened using programs such as text editor software or Excel.
dehalogenase_data/: Contains data files for the dehalogenase functional assays
dehalogenase_activity_kinetics.csv: Dataset for the dehalogenase kinetics assay. "Sample" column refers to the protein tested, "Minute" column refers to the minute at which the sample was measured, and "RFU" column gives the fluorescence measured in Relative Fluorescence Units.
dehalogenases_37308.csv: Dataset for testing DafA activity towards different substrates. "Substrate" column refers to the substrate tested, "Signal" and "Signal_SD" refer to the average and standard deviation of the fluorescence measurement in Relative Fluorescence Units, and "Rate" and "Rate_SD" refer to the average and standard deviation of the calculated rate of reaction. Rate is in units of nmol s⁻¹ mg⁻¹.
temperature_37308.csv: Dataset for testing DafA activity at different temperatures with the substrate 1,2-dibromoethane. "Temperature" column refers to the temperature tested in degrees Celsius, "Signal" and "Signal_SD" refer to the average and standard deviation of the fluorescence measurement in Relative Fluorescence Units, and "Rate" and "Rate_SD" refer to the average and standard deviation of the calculated rate of reaction. Rate is in units of nmol s⁻¹ mg⁻¹.
emission_data/: Contains data files for the emission spectrum. For each file, "wavelength" column refers to the specific wavelength being measured, "emission_minus_background" refers to the luminescence emission measurement minus the background, and "calibrated_emission" refers to the calibrated luminescence emission measurement.
313061_brittlestar_1_rawcal.csv: Replicate 1 of AfLuc emission spectrum
313061_brittlestar_2_rawcal.csv: Replicate 2 of AfLuc emission spectrum
313061_brittlestar_3_rawcal.csv: Replicate 3 of AfLuc emission spectrum
renilla_1_rawcal.csv: Replicate 1 of RLuc emission spectrum
renilla_2_rawcal.csv: Replicate 2 of RLuc emission spectrum
renilla_3_rawcal.csv: Replicate 3 of RLuc emission spectrum
luciferase_data/: Contains data files for the luciferase functional assays
further_luciferase_testing.csv: Dataset for testing luciferase activity of all samples. "Concentration" column refers to the final concentration of luciferin used, in µM. "Cycle" and "Time" columns indicate the cycle and the time (in seconds) at which the sample was measured. The remaining column headers name the sample in which luminescence was measured (Buffer, Bovine Serum Albumin (BSA), Pyrosome Luciferase (PyroLuc), AF37308 protein (B37308), AfLuc protein (B313061), Renilla luciferase (Rluc)). Luminescence is in Relative Luminescence Units.
Marika_crude_extracts_testing.csv: Dataset for testing crude extracts for DafA, AF10707.1, AF17859.1, AF37282.1, and AF37332.1. "Sample" column contains the name of each sample being measured (the negative control is DhaA, a haloalkane dehalogenase from bacteria; the positive control is Rluc, Renilla luciferase; and the experimental samples are candidates AF10707, AF17859, AF37282, AF37308, and AF37332). The "Luminescence" column contains the measured luminescence from each sample. Luminescence is in Relative Luminescence Units.
Oakley_crude_extracts_testing.csv: Dataset for testing crude extracts for AfLuc, Uni203026, and Gen224433. "Well" column refers to the location of the sample in a 96-well sample plate, and "Sample" column contains the name of each sample being measured (the negative control is crude extract from bacteria transformed with an empty plasmid, and the experimental samples are candidates 224433, AfLuc (313061), and 203026). The "Luminescence" column contains the measured luminescence from each sample. Luminescence is in Relative Luminescence Units.
gene_expression_data/: Contains data files for HLD/LUC gene expression in A. filiformis
gene_expression_development.csv: Dataset for HLD/LUC gene expression in A. filiformis from Parey et al. 2024. "HPF" refers to Hours Post Fertilization. Numerical values are in log2(Transcripts Per Million + 1).
Brittle_Star_Dryad_Data/Bioinformatic_analyses/: Contains data files for phylogenetic analysis. Files can be opened with text editor software, and the .treefile can be visualized using FigTree.
Fig4_data/: Contains .fa files of unaligned and aligned hydrolase domains from dehalogenases, the Newick file of the tree, and the IQ-TREE log file
queries_blastp_seq_domains_remove_seq_aligned.fa: Multiple sequence alignment of hydrolase domains from dehalogenases
queries_blastp_seq_domains_remove_seq_aligned.fa.log: IQ-TREE log file
queries_blastp_seq_domains_remove_seq_aligned.fa.treefile: Newick file of the maximum likelihood tree
queries_blastp_seq_domains_remove_seq.fa: Sequences of hydrolase domains from dehalogenases
Brittle_Star_Dryad_Data/Plasmid_sequences/: Contains plasmid sequence .gb files. Files can be opened with plasmid editor tools, such as ApE.
Marek_pET21b_10707-1.gb: Plasmid sequence for AF10707.1 with 6His tag
Marek_pET21b_17859-1.gb: Plasmid sequence for AF17859.1 with 6His tag
Marek_pET21b_37282-1.gb: Plasmid sequence for AF37282.1 with 6His tag
Marek_pET21b_37332-1.gb: Plasmid sequence for AF37332.1 with 6His tag
Marek_pET21b_DhaA.gb: Plasmid sequence for DhaA with 6His tag
oakley_pet21_37308-1.gb: Plasmid sequence for DafA with 6His tag
oakley_pet21_203026.gb: Plasmid sequence for Uni203026 with 6His tag
oakley_pet21_224433.gb: Plasmid sequence for Gen224433 with 6His tag
oakley_pet21_AfLuc.gb: Plasmid sequence for AfLuc with 6His tag
oakley_pet21_pyroluc.gb: Plasmid sequence for PyroLuc with 6His tag
oakley_pet21_renilla_luc.gb: Plasmid sequence for RLuc with 6His tag
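The emission spectrum files described above share the same three columns, so the wavelength of maximum emission can be located with a few lines of Python. This is a minimal sketch and not the analysis in Brittlestar_data_code.Rmd. It assumes the files are comma-delimited with a header row spelled exactly as in the column descriptions above; the function name `peak_wavelength` and the demo values are hypothetical.

```python
import csv
import io


def peak_wavelength(handle, value_column="calibrated_emission"):
    """Return (wavelength, value) for the row with the largest value_column.

    `handle` is any open text file or file-like object holding CSV data
    with a header row (assumed layout, see the note above).
    """
    reader = csv.DictReader(handle)
    fields = reader.fieldnames or []
    for name in ("wavelength", value_column):
        if name not in fields:
            raise KeyError(f"missing column: {name}")

    best = None
    for row in reader:
        try:
            wavelength = float(row["wavelength"])
            value = float(row[value_column])
        except (TypeError, ValueError):
            continue  # skip blank or non-numeric rows
        if best is None or value > best[1]:
            best = (wavelength, value)

    if best is None:
        raise ValueError("no numeric rows found")
    return best


# Made-up values for illustration only; these are not measurements
# from the dataset.
demo = io.StringIO(
    "wavelength,emission_minus_background,calibrated_emission\n"
    "460,10,12\n"
    "480,30,33\n"
    "500,20,21\n"
)
print(peak_wavelength(demo))  # (480.0, 33.0)
```

To run it on one of the real files, open the file with `open(path, newline="")` and pass the handle to `peak_wavelength`. Passing `value_column="emission_minus_background"` gives the peak of the uncalibrated signal instead.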
NOTE: Gene sequences of the proteins expressed in this paper are available on GenBank under accessions PP777633 (AfLuc), PP777634 (DafA), PP777635 (AF10707.1), PP777636 (AF17859.1), PP777637 (AF37282.1), PP777638 (AF37332.1), PP777639 (Gen224433), PP777640 (Uni203026), and PP777641 (PyroLuc).
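Before reusing the aligned FASTA file in Fig4_data/ for a new tree, it can be worth checking that every sequence has the same length, as a multiple sequence alignment should. The sketch below uses only the Python standard library and assumes ordinary FASTA formatting (header lines starting with `>`, sequences possibly wrapped over several lines). The helper names `read_fasta` and `alignment_length` and the demo sequences are hypothetical.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[header].append(line)
    # Join wrapped sequence lines into one string per record.
    return {name: "".join(parts) for name, parts in records.items()}


def alignment_length(records):
    """Return the shared sequence length, or raise if lengths differ."""
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


# Made-up sequences for illustration only; these are not from the dataset.
demo = io.StringIO(">seq_a\nMK-LV\nAA\n>seq_b\nMKTLVAA\n")
records = read_fasta(demo)
print(len(records), alignment_length(records))  # 2 7
```

The unaligned file (queries_blastp_seq_domains_remove_seq.fa) would normally fail the length check, which is expected: only the `_aligned.fa` file is padded with gap characters.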