The following programs are used to analyze our data.

#a program that processed our original variant call file (.vcf) to remove low quality SNPs and those with >2 bases segregating
#the output file retains read depths for REF and ALT nucleotides in a simple format
1. program1_bees_new.py
   input: 'new_IM.vcf'
   output: 'new1_IM.txt'

   DATA FORMAT of 'new1_IM.txt':
   Each row in this dataset corresponds to a single nucleotide polymorphism (SNP)
   The columns in this dataset correspond to:
   'CHROM' = Mimulus guttatus chromosome
   'POS' = Position of SNP on a chromosome
   'REF' = Identity of reference nucleotide
   'ALT' = Identity of alternative nucleotide
   'a1_pool' = The number of REF bases followed by the number of ALT bases in the A1 DNA pool (comma delimited)
   'a2_pool' = The number of REF bases followed by the number of ALT bases in the A2 DNA pool (comma delimited)
   'b1_pool' = The number of REF bases followed by the number of ALT bases in the B1 DNA pool (comma delimited)
   'b2_pool' = The number of REF bases followed by the number of ALT bases in the B2 DNA pool (comma delimited)

#a program that removes SNPs based on minor allele frequency and read depth
2. program2_bees_new.py
   input: 'new1_IM.txt'
   output: 'new2_IM.txt'

   DATA FORMAT of 'new2_IM.txt':
   Each row in this dataset corresponds to a single nucleotide polymorphism (SNP)
   The columns in this dataset correspond to:
   'CHROM' = Mimulus guttatus chromosome
   'POS' = Position of SNP on a chromosome
   'REF' = Identity of reference nucleotide
   'ALT' = Identity of alternative nucleotide
   'a1_pool' = Total read depth followed by the frequency of the REF allele in the A1 DNA pool (comma delimited)
   'a2_pool' = Total read depth followed by the frequency of the REF allele in the A2 DNA pool (comma delimited)
   'b1_pool' = Total read depth followed by the frequency of the REF allele in the B1 DNA pool (comma delimited)
   'b2_pool' = Total read depth followed by the frequency of the REF allele in the B2 DNA pool (comma delimited)

#a program that calculates the null divergence between pairs of populations
3. program3_bees_new.py
   input: 'new2_IM.txt'
   output: prints pairwise null divergence to screen

#comparing evolutionary models at each SNP to get AICs
#to speed calculation, this handles 1,000 SNPs in a window of the genome
4. model_fits.py
   this file receives three arguments when run:
   i. an integer (i) defining the window containing 1,000 SNPs in the genome
   ii. a datafile containing the processed SNPs ('new2_IM.txt')
   iii. a datafile containing the null variance for each population, which is used to calculate likelihoods ('nullvars_IM.txt')
   output: 'i.aic.emp.txt' and 'i.ML.emp.txt'
   *all files ending with 'aic.emp.txt' are combined using 'cat *.aic.emp.txt > i.aic.emp.csv'

#calculating P-values in likelihood ratio tests to identify SNP outliers
5. calculate_Pvals.py
   input: 'i.aic.emp.csv'
   output: 'LRTs.txt'

#neutral simulator: this founds populations using the SNP data and then simulates evolution without fitness variation
6. neutral.simulator.py
   input: 'new2_IM.txt'

#selection simulators: these found populations using SNP data and then simulate evolution with selection
7. outcross.1locus.py
   this program simulates neutral evolution when there is strong selection at 1 locus and outcrossing
   input: 'new2_IM.txt'

8. outcross.2loci.py
   this program simulates neutral evolution when there is strong selection at 2 loci and outcrossing
   input: 'new2_IM.txt'

9. selfing.1locus.py
   this program simulates neutral evolution when there is strong selection at 1 locus and full selfing
   input: 'new2_IM.txt'

10. selfing.2loci.py
    this program simulates neutral evolution when there is strong selection at 2 loci and full selfing
    input: 'new2_IM.txt'
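
#illustrative sketch of how the pool columns change between 'new1_IM.txt' and 'new2_IM.txt'
The example below shows how one SNP row with "REF count,ALT count" pool fields (the 'new1_IM.txt' format) could be turned into "total read depth,REF frequency" fields (the 'new2_IM.txt' format). It is not the code in program2_bees_new.py. The function name convert_row, the thresholds min_depth and min_maf, the rule of computing minor allele frequency from counts summed over the four pools, and the four-decimal output are all assumptions made for illustration, since the actual cutoffs and filtering rule are not stated here.

```python
# Illustrative sketch only; NOT the code in program2_bees_new.py.
# Assumptions (hypothetical): function name, threshold values, MAF computed
# from counts summed over pools, and frequencies written with 4 decimals.

POOLS = ("a1_pool", "a2_pool", "b1_pool", "b2_pool")


def convert_row(row, min_depth=10, min_maf=0.05):
    """Convert a new1-style SNP row to a new2-style row.

    row: dict with keys CHROM, POS, REF, ALT and the four pool fields,
         each pool field being a "REF_count,ALT_count" string.
    Returns a new dict with "depth,REF_frequency" pool fields, or None
    if the SNP fails the (hypothetical) depth or MAF filter.
    """
    converted = {key: row[key] for key in ("CHROM", "POS", "REF", "ALT")}
    ref_sum = 0
    depth_sum = 0
    for pool in POOLS:
        ref_count, alt_count = (int(x) for x in row[pool].split(","))
        depth = ref_count + alt_count
        # Drop the SNP if any pool has no reads or too few reads.
        if depth == 0 or depth < min_depth:
            return None
        ref_sum += ref_count
        depth_sum += depth
        converted[pool] = f"{depth},{ref_count / depth:.4f}"
    # Minor allele frequency from counts summed over all pools (assumption).
    pooled_ref = ref_sum / depth_sum
    if min(pooled_ref, 1.0 - pooled_ref) < min_maf:
        return None
    return converted


# Made-up example row, not taken from the real data.
example = {
    "CHROM": "chr1", "POS": "1234", "REF": "A", "ALT": "G",
    "a1_pool": "12,8", "a2_pool": "10,10",
    "b1_pool": "5,15", "b2_pool": "20,0",
}
print(convert_row(example))
```

With the made-up row above, the A1 field "12,8" becomes "20,0.6000": 20 reads in total, 60% of them carrying the REF allele.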