Excessive parallelism in protein evolution of Lake Baikal amphipod species flock
Data files
Jan 02, 2020 version files 19.79 MB
- 300_groups_of_4_species_with_Eulimnogammarus_verrucosus_P_test_raw_data.csv (355.99 KB)
- 300_groups_of_4_species_with_Eulimnogammarus_verrucosus_P_test_results.csv (83.88 KB)
- 900_groups_of_4_species_P_test_raw_data.csv (1.11 MB)
- 900_groups_of_4_species_P_test_results.csv (274.58 KB)
- Cichlidae_sample_list.txt (1.61 KB)
- Gammaridae_cds_alignments.tar.gz (17.94 MB)
- Gammaridae_sample_list.txt (1.38 KB)
- Gammaridae_tree.newick (3.75 KB)
- Miyata_distance_freqs_for_parall_and_nonparall_sites.csv (3.05 KB)
- README.txt (2.34 KB)
- Sanger_sequences.zip (4.05 KB)
- Vertebrata_sample_list.txt (10.21 KB)
Apr 14, 2020 version files 20.40 MB
- 300_cichlids_with_ecotypes_groups_of_4_species_P_test_raw_data.csv (399.32 KB)
- 300_cichlids_with_ecotypes_groups_of_4_species_P_test_results.csv (108.98 KB)
- 300_gammarids_convergent_vs_divergent_substitutions.csv (11.83 KB)
- 300_gammarids_without_filtrations_P_test_mean.csv (24.33 KB)
- 300_groups_of_4_species_with_Eulimnogammarus_verrucosus_P_test_raw_data.csv (355.99 KB)
- 300_groups_of_4_species_with_Eulimnogammarus_verrucosus_P_test_results.csv (83.88 KB)
- 900_groups_of_4_species_P_test_raw_data_new.csv (1.11 MB)
- 900_groups_of_4_species_P_test_results_new.csv (273.86 KB)
- Cichlidae_sample_list.txt (1.61 KB)
- deepwater_and_shallow_water_gammarids_P_test_raw_data.csv (50.18 KB)
- deepwater_and_shallow_water_gammarids_P_test_results.csv (11.82 KB)
- Gammaridae_cds_alignments.tar.gz (17.94 MB)
- Gammaridae_sample_list.txt (1.38 KB)
- Gammaridae_tree.newick (3.75 KB)
- Miyata_distance_freqs_for_parall_and_nonparall_sites.csv (3.05 KB)
- README.txt (3.98 KB)
- Sanger_sequences.zip (4.05 KB)
- Vertebrata_sample_list.txt (10.21 KB)
Abstract
Transcriptomic analysis:
We used the transcriptomic sequences of closely related gammarid species from Lake Baikal (Naumenko et al. 2017). Of the 67 species analyzed in that work, we picked the 47 species for which the sequenced sample was based on exactly one individual. Orthologous groups of genes were calculated with OrthoMCL 2.0.9 with the inflation parameter set to 1.5 (Li 2003). If a particular species carried multiple paralogous sequences of a gene, this species was excluded from the analysis of this gene. Codon-aware alignments for orthogroups were obtained with TranslatorX (Abascal et al. 2010) using the Muscle method (Edgar 2004). Poorly aligned sequences were detected and removed from the alignments using the following rule:
1) A column in an alignment was considered "good" if it carried the same nucleotide in at least 50% of species;
2) Sequences for which fewer than 50% of positions were "good" were removed from the alignment.
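The two-step rule above can be sketched in Python. This is only an illustration of the rule as worded, not the algorithm of any particular trimming tool. The function names, the dictionary input format, and the handling of gaps are assumptions: here a column's share is computed over all species in the alignment (gaps included in the denominator), and a sequence's share of "good" positions is computed over its own non-gap positions.

```python
from collections import Counter

NUCLEOTIDES = set("ACGT")


def good_columns(seqs, min_share=0.5):
    """Flag each column as "good" if its most common nucleotide
    occurs in at least `min_share` of all sequences (assumption:
    gaps count toward the denominator)."""
    n_species = len(seqs)
    flags = []
    for column in zip(*seqs):
        counts = Counter(base for base in column if base in NUCLEOTIDES)
        most_common = max(counts.values(), default=0)
        flags.append(most_common >= min_share * n_species)
    return flags


def filter_alignment(alignment, min_share=0.5, min_good=0.5):
    """Drop sequences with fewer than `min_good` of their non-gap
    positions in "good" columns (assumption: gaps in the sequence
    itself are ignored when computing this share)."""
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    flags = good_columns(seqs, min_share)
    kept = {}
    for name, seq in zip(names, seqs):
        covered = [is_good for base, is_good in zip(seq, flags)
                   if base in NUCLEOTIDES]
        if covered and sum(covered) / len(covered) >= min_good:
            kept[name] = alignment[name]
    return kept


# Hypothetical toy alignment: sp4 mostly occupies columns
# where every other species has a gap.
toy = {
    "sp1": "ACGTAC--",
    "sp2": "ACGTAC--",
    "sp3": "ACGAAC--",
    "sp4": "--G---TT",
}
kept = filter_alignment(toy)
```

In this toy example, only one of the three nucleotides of sp4 lies in a "good" column, so sp4 is dropped and the other three sequences are kept.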
This exclusion process was performed using TrimAl 1.4 (Capella-Gutierrez et al. 2009). After filtering, we obtained 4366 orthologous groups of genes. Alignments for all genes were concatenated, and a phylogenetic tree was reconstructed using RAxML 8.1.20 (Stamatakis 2014) with the GTR+Gamma model, 20 starting maximum parsimony trees, and 100 bootstrap pseudoreplicates. Because mutations in the third positions of codons are often synonymous, the third positions accumulate substitutions faster than the first two. Therefore, we used partitioning, with separate substitution models for the first two codon positions and for the third codon position. The obtained tree was similar to that obtained previously.
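As a rough illustration of the codon-position partitioning described above, the following Python sketch splits an in-frame coding sequence into its first-two and third codon positions and writes the corresponding two-partition definition in the RAxML partition-file syntax. The partition names `pos12` and `pos3` and the function names are hypothetical; the actual partition file used for the analysis is not given here.

```python
def split_codon_positions(seq):
    """Return (first and second codon positions, third codon positions)
    for an in-frame coding sequence."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    first_two = "".join(base for i, base in enumerate(seq) if i % 3 != 2)
    third = seq[2::3]
    return first_two, third


def codon_partitions(alignment_length):
    """Build a two-partition definition: one partition for codon
    positions 1 and 2, one for codon position 3.
    The names pos12 and pos3 are illustrative."""
    if alignment_length <= 0 or alignment_length % 3 != 0:
        raise ValueError("alignment length must be a positive multiple of 3")
    return (
        f"DNA, pos12 = 1-{alignment_length}\\3, 2-{alignment_length}\\3\n"
        f"DNA, pos3 = 3-{alignment_length}\\3\n"
    )


first_two, third = split_codon_positions("ATGCCA")
partition_text = codon_partitions(9)
```

The resulting text could be saved to a file and passed to RAxML with its `-q` option, so that each partition gets its own substitution model parameters.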
Sanger sequencing:
Purified PCR products were bidirectionally sequenced on an ABI 3500 Genetic Analyzer (Applied Biosystems) using the BigDye Terminator v 3.1 Cycle Sequencing Kit (Applied Biosystems) and the same primers as for PCR.
The README.txt file describes the provided data.