Data from: Intra-individual polymorphism in chloroplasts from NGS data: where does it come from and how to handle it?
Data files
Sep 14, 2015 version files 7.03 GB
A571_Run4-R1-TAG-10.bam (89.92 MB)
A571_Run7-R1-TAG-9.bam (32.38 MB)
A571_RunHi-R1-TAG-9.bam (145.06 MB)
CR659_Run10-R1-TAG-70.bam (23.35 MB)
Dioscorea_Bcf_A571.vcf (4.83 MB)
Dioscorea_Bcf.vcf (4.74 MB)
Dioscorea_BcfB_A571.vcf (12.88 MB)
Dioscorea_BcfB.vcf (12.58 MB)
Dioscorea_BVar15_A571.vcf (7.51 MB)
Dioscorea_BVar15.vcf (7.27 MB)
Dioscorea_BVar50_A571.vcf (5.89 MB)
Dioscorea_BVar50.vcf (5.71 MB)
Dioscorea_Hap_A571.vcf (9.21 MB)
Dioscorea_Hap.vcf (9.03 MB)
Dioscorea_Uni_A571.vcf (15.39 MB)
Dioscorea_Uni.vcf (14.97 MB)
Dioscorea_Uni1_A571.vcf (4.07 MB)
Dioscorea_Uni1.vcf (3.95 MB)
Dioscorea_Var15_A571.vcf (1.52 MB)
Dioscorea_Var15.vcf (1.47 MB)
Dioscorea_Var50_A571.vcf (1.25 MB)
Dioscorea_Var50.vcf (1.21 MB)
filtcutRUN4-TAG-12_R1_paired.fastq (342.96 MB)
filtcutRUN4-TAG-12_R2_paired.fastq (354.57 MB)
filtcutRUN4-TAG-24_R1_paired.fastq (57.51 MB)
filtcutRUN4-TAG-24_R2_paired.fastq (59.43 MB)
filtcutRUN4-TAG-25_R1_paired.fastq (167.60 MB)
filtcutRUN4-TAG-25_R2_paired.fastq (173.20 MB)
filtcutRUN4-TAG-26_R1_paired.fastq (36.50 MB)
filtcutRUN4-TAG-26_R2_paired.fastq (37.72 MB)
filtcutRUN4-TAG-27_R1_paired.fastq (69.56 MB)
filtcutRUN4-TAG-27_R2_paired.fastq (71.87 MB)
filtcutRUN4-TAG-28_R1_paired.fastq (55.75 MB)
filtcutRUN4-TAG-28_R2_paired.fastq (57.61 MB)
filtcutRUN4-TAG-29_R1_paired.fastq (22.79 MB)
filtcutRUN4-TAG-29_R2_paired.fastq (23.55 MB)
filtcutRUN4-TAG-30_R1_paired.fastq (160.41 MB)
filtcutRUN4-TAG-30_R2_paired.fastq (165.78 MB)
filtcutRUN4-TAG-31_R1_paired.fastq (40.82 MB)
filtcutRUN4-TAG-31_R2_paired.fastq (42.18 MB)
P458_Run4-R1-TAG-25.bam (65.27 MB)
Pa-Mayo9-R11T57_R1R2.fastq.bam (99.18 MB)
Pa-Ndjo11-R11T19_R1R2.fastq.bam (67.28 MB)
Pa-Ndjo3-R11T18_R1R2.fastq.bam (38.72 MB)
Pa-Ndjo9-R11T16_R1R2.fastq.bam (47.53 MB)
Pb-Aloum8-R11T73_R1R2.fastq.bam (74.95 MB)
Pb-Campo1-R11T81_R1R2.fastq.bam (86.90 MB)
Pb-Kola5-R11T54_R1R2.fastq.bam (42.23 MB)
Pb-Oyem-R11T47_R1R2.fastq.bam (43.52 MB)
Pb-Podo5-1-R2T53_R1R2.fastq.bam (535.97 KB)
Pb-Podo5-R11T79_R1R2.fastq.bam (17.13 MB)
PE08106-E1_cp_RUN5-TAG35.bam (10.17 MB)
RUN2_TAG11-20.bam (9.34 MB)
RUN2_TAG32.bam (8.03 MB)
RUN3-TAG-55_R1_paired.fastq (56.01 MB)
RUN3-TAG-55_R2_paired.fastq (57.88 MB)
RUN3-TAG-55.bam (7.01 MB)
RUN3-TAG-56_R1_paired.fastq (97.82 MB)
RUN3-TAG-56_R2_paired.fastq (101.10 MB)
RUN3-TAG-56.bam (6.87 MB)
RUN3-TAG-57_R1_paired.fastq (79.98 MB)
RUN3-TAG-57_R2_paired.fastq (82.66 MB)
RUN3-TAG-57.bam (6.93 MB)
RUN3-TAG-59_R1_paired.fastq (73.39 MB)
RUN3-TAG-59_R2_paired.fastq (75.85 MB)
RUN3-TAG-59.bam (7.07 MB)
RUN3-TAG-65_R1_paired.fastq (33.11 MB)
RUN3-TAG-65_R2_paired.fastq (34.21 MB)
RUN3-TAG-65.bam (7.08 MB)
RUN3-TAG-66_R1_paired.fastq (53.86 MB)
RUN3-TAG-66_R2_paired.fastq (55.63 MB)
RUN3-TAG-66.bam (7.04 MB)
RUN3-TAG-67_R1_paired.fastq (40.78 MB)
RUN3-TAG-67_R2_paired.fastq (42.14 MB)
RUN3-TAG-67.bam (7.02 MB)
RUN3-TAG-70_R1_paired.fastq (68.48 MB)
RUN3-TAG-70_R2_paired.fastq (70.77 MB)
RUN3-TAG-70.bam (6.99 MB)
RUN4_TAG-12.bam (7.07 MB)
RUN4_TAG-27.bam (7.36 MB)
RUN4_TAG24.bam (7.35 MB)
RUN4_TAG25.bam (7.24 MB)
RUN4_TAG26.bam (7.53 MB)
RUN4_TAG28.bam (7.30 MB)
RUN4_TAG29.bam (7.46 MB)
RUN4_TAG30.bam (7.23 MB)
RUN5-TAG35_R1_paired.fastq.gz (1.42 GB)
RUN5-TAG35_R2_paired.fastq.gz (1.36 GB)
SNP_mil_HET50-MINFREQ-15.VCF (37.62 KB)
SNP_PODO_HET50-MINFREQ-15.VCF (378.37 KB)
SNP_rice_HET50-MINFREQ-0.VCF (109.25 KB)
SNP_rice_HET50-MINFREQ-10.VCF (63.10 KB)
SNP_rice_HET50-MINFREQ-15.VCF (49.97 KB)
SNP_rice_HET50-MINFREQ-5.VCF (103.08 KB)
SNP_yam_HET50-MINFREQ-15.VCF (337.71 KB)
TOG6208_cp_RUN1_TAG1.bam (11.95 MB)
TOG6208_cp_RUN1_TAG6.bam (13.39 MB)
TOG6208_cp_RUN1-AG2-4_10-12.bam (13.30 MB)
TOG6208_mt_Run1-TAG-1.bam (1.59 MB)
TOG6208_mt_Run1-TAG-6.bam (10.72 MB)
TOG6208_mt_Run1-TAG-LR.bam (33.41 MB)
TOG6208_nr_Run1-TAG-1.bam (8.80 MB)
TOG6208_nr_Run1-TAG-6.bam (57.22 MB)
TOG6208_nr_Run1-TAG-LR.bam (421.75 KB)
Abstract
Next-generation sequencing (NGS) gives access to large quantities of genomic data. In plants, several studies have used whole chloroplast genome sequences to infer phylogeography or phylogeny. Even though the chloroplast is a haploid organelle, NGS plastome data reveal a non-negligible number of intra-individual polymorphic SNPs. Such observations could have several causes, such as sequencing errors, heteroplasmy, or the transfer of chloroplast sequences into the nuclear and mitochondrial genomes. The occurrence of allelic diversity has important practical consequences for the identification of diversity and the analysis of chloroplast data and, beyond that, raises significant evolutionary questions. In this study, we show that the observed intra-individual polymorphism of chloroplast sequence data is probably the result of plastid DNA transferred into the mitochondrial and/or nuclear genomes. We further assess the error rates of nine bioinformatics pipelines for SNP and genotype calling, using SNPs identified by Sanger sequencing as the reference. Specific pipelines are adequate to deal with this issue, optimizing both specificity and sensitivity. Our results will allow proper use of whole-chloroplast NGS sequences and better handling of NGS chloroplast sequence diversity.
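
As an illustration of the kind of comparison the abstract describes, the following minimal sketch computes sensitivity and specificity for one pipeline's SNP calls against a Sanger-validated reference set. It is not the authors' evaluation code: the function names, the representation of SNPs as sets of positions, and the toy values are all assumptions made for this example.

```python
# Hypothetical sketch: score one pipeline's SNP calls against Sanger-validated SNPs.
# Positions are plain integers here; real data would come from the VCF files.

def confusion_counts(called, truth, assessed):
    """Count TP, FP, FN, TN over the positions that Sanger sequencing covered."""
    assessed = set(assessed)
    called = set(called) & assessed   # ignore calls outside the validated region
    truth = set(truth) & assessed
    tp = len(called & truth)          # called and confirmed by Sanger
    fp = len(called - truth)          # called but not confirmed
    fn = len(truth - called)          # confirmed but missed by the pipeline
    tn = len(assessed) - tp - fp - fn # neither called nor confirmed
    return tp, fp, fn, tn

def sensitivity_specificity(called, truth, assessed):
    """Return (sensitivity, specificity); NaN when a denominator is zero."""
    tp, fp, fn, tn = confusion_counts(called, truth, assessed)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy example (made-up positions, not taken from the dataset).
assessed = range(1, 11)   # ten positions checked by Sanger sequencing
truth = {2, 5, 8}         # SNPs confirmed by Sanger
called = {2, 5, 9}        # SNPs reported by a hypothetical pipeline
sens, spec = sensitivity_specificity(called, truth, assessed)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Running the same function over the call set of each pipeline would give one sensitivity/specificity pair per pipeline, which is one simple way to compare them on a common validated region.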