Background: Gene conversion of duplicated genes can slow the divergence of paralogous copies over time but can also result in other interesting evolutionary patterns. Islands of genetic divergence that persist in the face of gene conversion can point to gene regions undergoing selection for new functions. Novel combinations of genetic variation that differ greatly from the original sequence can result from the transfer of genetic variation between paralogous genes by rare gene conversion events. Genetically divergent populations of the copepod Tigriopus californicus provide an excellent model for examining the patterns of divergence among paralogs across multiple independent evolutionary lineages.
Results: In this study the evolution of a set of paralogous genes encoding putative aspartate transaminase proteins (called GOT1 here) is examined in populations of the copepod T. californicus. One pair of duplicated genes, GOT1p1 and GOT1p2, has regions of high divergence between the copies in the face of apparent ongoing gene conversion. The GOT1p2 gene also has unique haplotypes in two populations that appear to have resulted from a transfer of genetic variation via inter-paralog gene conversion. A second pair of duplicated genes, GOT1Sr and GOT1Sd, also shows evidence of gene conversion, but this gene conversion does not appear to have maintained each as a functional copy in all populations.
Conclusions: The patterns of conservation and sequence divergence across this set of paralogous genes among populations of T. californicus suggest that some interesting evolutionary patterns are occurring at these loci. The results for the GOT1p1/GOT1p2 paralogs illustrate how gene conversion can contribute to the creation of a mosaic pattern of regions of high divergence and low divergence. When coupled with rare gene conversion events of divergent regions, this pattern can result in the formation of novel proteins differing substantially from either original protein. The evolutionary patterns across these paralogs show how gene conversion can both constrain and facilitate diversification of genetic sequences.
GOT1_6a_align
Alignment of sequences from the GOT1_6a gene from Tigriopus californicus populations. The population from which each sequence was obtained is indicated by the SD, AB, SCN, or LJS label. The GOT1_6A_mRNA_SD sequence was obtained from cDNA from the SD population and spans a larger region than the remaining sequences. Other sequences were obtained by direct sequencing of PCR products from a single copepod. The two alleles of a single individual are indicated with a and b designations. The phase of multiple mutations in a single individual was not determined experimentally.
GOT1p1_p2
Alignment of GOT1p1 and GOT1p2 paralogs from populations of the copepod Tigriopus californicus. All sequences are the result of direct sequencing of PCR products from genomic DNA isolated from single copepods. More details can be found in the paper; in brief, each paralog was amplified with primers specific to that paralog. The paralog is indicated by the p1 or p2 in the sequence name, and the population from which the sequence was obtained is indicated by the AB, SD, SCN, or LJS label. The two alleles from a single individual are indicated with an A and B. In some cases two copies of each paralog were obtained from the same individual.
GOT1Sd_r_alignable_regions
This is a NEXUS file that provides a sequence alignment for the regions of GOT1Sd and GOT1Sr that are alignable (and a region from 470 to 814 that does not align well in the middle). The 5-prime end of the alignment starts at position 1037 of the GOT1Sr alignment. The middle section that is not alignable has been removed from this alignment. For the GOT1Sd gene, 282 bp has been removed after the first 539 bp of sequence. For the GOT1Sr gene, a large fragment of intron sequence has been removed after 550 bp (between 2271 bp for the SD population and 3411 bp for the SCN population).
GOT1SD_R_alignable_regions.txt
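The trimming described for this file shifts every position downstream of the removed block. A minimal sketch of mapping a position in a trimmed sequence back to the untrimmed gene sequence is given below. The function name `original_position` and its parameters are hypothetical, and the sketch assumes that the stated lengths count ungapped bases in 1-based coordinates, which the description does not state explicitly.

```python
def original_position(trimmed_pos, kept_before, removed_len):
    """Map a 1-based position in a trimmed sequence to the untrimmed sequence.

    kept_before: number of bases retained before the removed block.
    removed_len: number of bases removed after those retained bases.
    Assumes ungapped, 1-based coordinates (an assumption, not stated in the source).
    """
    if trimmed_pos < 1:
        raise ValueError("positions are 1-based")
    if trimmed_pos <= kept_before:
        # Upstream of the removed block: coordinates are unchanged.
        return trimmed_pos
    # Downstream of the removed block: shift by the removed length.
    return trimmed_pos + removed_len


# GOT1Sd: 282 bp removed after the first 539 bp (values from the file description).
print(original_position(539, 539, 282))  # 539
print(original_position(540, 539, 282))  # 822
```

For GOT1Sr the same function would apply with `kept_before=550`, but the removed length differs by population (2271 bp for SD up to 3411 bp for SCN), so it has to be supplied per sequence.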
GOT1Sr_AB_SD_LJ
This file is an alignment of GOT1Sr from the SD, AB, and LJS populations. The SCN population is included as a separate file because its large intron does not align well with the intron from these populations. Each sequence indicates the population with the AB, SD, or LJS label. Sequences labeled cDNA were sequenced from cDNA, while sequences labeled clone were sequenced from cloned PCR products from a single individual. The remaining sequences were from direct sequencing of PCR products from a single copepod. The two alleles are given the same label, but the second allele has an 'a' appended. The phase of the mutations within an individual was not determined experimentally.
GOT1Sr_SCN
This file is an alignment of GOT1Sr from the SCN population. The SCN population is included as a separate file because its large intron does not align well with the intron from the SD, AB, and LJS populations. The sequence labeled cDNA was sequenced from cDNA, while the sequence labeled clone was sequenced from cloned PCR products from a single individual. The remaining sequences were from direct sequencing of PCR products from a single copepod. The two alleles are given the same label, but the second allele has an 'a' appended. The phase of the mutations within an individual was not determined experimentally.
GOT1Sd_align
Alignment of GOT1Sd sequences from populations of the copepod Tigriopus californicus. The got1s_SDcomplete_m sequence is a cDNA sequence from the GOT1Sr gene from the SD population that has been included as a reference. The population origin of the other sequences is indicated by the SD, SCN, AB, or LJS label. Sequences are from direct sequencing of PCR products obtained from a single individual. Two alleles from the same individual are indicated with the a or b labels (with the remainder of the label identical).