General Information
===================

1. R scripts and datasets from article:
   "Impact of male trait exaggeration on sex-biased gene expression and genome architecture in a water strider"
   BMC Biology: https://doi.org/10.1101/2020.01.10.901322

2. Contact information
   Abderrahman Khila : abderrahman.khila@ens-lyon.fr

3. Raw data:
   BioProject ID: PRJNA610161 and PRJNA642372


File/Directory structure
========================

.
├── datasets
│   ├── blast2go_go_table_TopGO.txt
│   ├── consensus.tree
│   ├── DHE_genes_L2vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_L2vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_L3vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_L3vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_L3vsL2_Big&Small_F_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_L3vsL2_Big&Small_M_DESeq2_line&replicate_correction.txt
│   ├── DHE_genes_MvsF_Big&Small_L1_DESeq2_line_correction.txt
│   ├── DHE_genes_MvsF_Big&Small_L2_DESeq2_line_correction.txt
│   ├── DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt
│   ├── gene_count_matrix.csv
│   ├── Gene_position.csv
│   ├── L3-spe_male-biased.csv
│   ├── Measures_leg-lengths_lines.csv
│   ├── Microvelia_dSdN_LCA.csv
│   ├── PRANK+GBLOCKS.tar.gz
│   ├── transcript_id_FPKM.csv
│   └── transcript_id_TPM_revised_ms.csv
├── README
├── R_Markdown_script_dnds.nb.html
├── R_Markdown_script_dnds.Rmd
├── R_Markdown_script_Fisher_test.nb.html
├── R_Markdown_script_Fisher_test.Rmd
├── Script_boxplot_logFC_all_legs_revised_ms.R
├── Script_DESeq2_revised_genomic_ms.R
├── Script_dosage_compensation_revised_ms.R
├── Script_GO-enrichment_revised_genomic_ms.R
├── Script_heatmap_sex-biased_L3-specific_braker_revised_ms.R
├── Script_interaction_plot_sex-biased_expression_all_legs_revised_ms.R
├── Script_leg_length_between_lines.R
└── Script_PCA_revised_genomic_ms.R


Datasets description
====================

blast2go_go_table_TopGO.txt : GO annotations based on Blast2GO, used for GO enrichment analysis with the TopGO R package.

consensus.tree : Phylogenetic tree of the genus Microvelia in Newick format, used as guide tree for dN/dS calculation.

DHE_genes_L2vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L2 vs L1 in Females
DHE_genes_L2vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L2 vs L1 in Males
DHE_genes_L3vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L3 vs L1 in Females
DHE_genes_L3vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L3 vs L1 in Males
DHE_genes_L3vsL2_Big&Small_F_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L3 vs L2 in Females
DHE_genes_L3vsL2_Big&Small_M_DESeq2_line&replicate_correction.txt : DESeq2 analysis results table of L3 vs L2 in Males

DHE_genes_MvsF_Big&Small_L1_DESeq2_line_correction.txt : DESeq2 analysis results table of Males vs Females in L1
DHE_genes_MvsF_Big&Small_L2_DESeq2_line_correction.txt : DESeq2 analysis results table of Males vs Females in L2
DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt : DESeq2 analysis results table of Males vs Females in L3

gene_count_matrix.csv : CSV table with raw counts per gene in male and female legs for each line.
    Example : L3.M.B.R1 : Leg_3 Males Big_line Replicate_1

Gene_position.csv : CSV table with gene coordinates.
    Example : Gene_name;Scaffold_name;Start_position;Stop_position

L3-spe_male-biased.csv : List of L3-biased and male-biased genes from Figure 2

Measures_leg-lengths_lines.csv : CSV table with leg lengths (in micrometers) of male and female legs.
    Example : L1_M_B : Leg_1 Males Big_line

Microvelia_dSdN_LCA.csv : CSV tabulated table with dN and dS values for RBH genes between Microvelia longipes and other Microvelia species. One calculation for each gene in the PRANK+GBLOCKS folder. It includes one column per calculation and species (e.g. dS.Mic_ame and dN.Mic_ame). It also includes a final column indicating which species had a copy of the gene.
    Abbreviations:
    Mic_ame: Microvelia americana
    Mic_aya: Microvelia ayacuchana
    Mic_Cal: Microvelia paludicula
    Mic_Cay: Microvelia sp.
    Mic_lon: Microvelia longipes
    Mic_pul: Microvelia pulchella

PRANK+GBLOCKS.tar.gz : Gene alignments used to calculate dN/dS as in:
    Guéguen L, Duret L (2018). Unbiased estimate of synonymous and non-synonymous substitution rates with non-stationary base composition. Molecular Biology and Evolution, vol. 35, pp. 734-742.

transcript_id_FPKM.csv : CSV table with FPKM counts per gene in male and female legs for each line.
    Example : L3_M_B_R1 : Leg_3 Males Big_line Replicate_1

transcript_id_TPM_revised_ms.csv : CSV table with TPM counts per gene in male and female legs for each line.
    Example : L3_M_B_R1 : Leg_3 Males Big_line Replicate_1


Files description
=================

R_Markdown_script_dnds.nb.html : Output html file from R_Markdown_script_dnds.Rmd
----------------
R_Markdown_script_dnds.Rmd : Script to analyse dN/dS results. Includes 3D and 2D plotting of leg-biased / sex-biased genes (Figure 3).
Files used:
    consensus.tree
    Microvelia_dSdN_LCA.csv
    DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L2_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L1_DESeq2_line_correction.txt
    DHE_genes_L3vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
    DHE_genes_L3vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
    DHE_genes_L2vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
    DHE_genes_L2vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
----------------
R_Markdown_script_Fisher_test.nb.html : Output html file from R_Markdown_script_Fisher_test.Rmd
----------------
R_Markdown_script_Fisher_test.Rmd : Script to calculate Fisher's exact test values for Figure 3.
Files used:
    DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L2_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L1_DESeq2_line_correction.txt
    DHE_genes_L3vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
    DHE_genes_L3vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
    DHE_genes_L2vsL1_Big&Small_F_DESeq2_line&replicate_correction.txt
    DHE_genes_L2vsL1_Big&Small_M_DESeq2_line&replicate_correction.txt
----------------
Script_boxplot_logFC_all_legs_revised_ms.R : Plotting of DESeq2 results
Files used:
    DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L2_DESeq2_line_correction.txt
    DHE_genes_MvsF_Big&Small_L1_DESeq2_line_correction.txt
----------------
Script_DESeq2_revised_genomic_ms.R : DESeq2 analysis
Files used:
    transcript_id_FPKM.csv
----------------
Script_dosage_compensation_revised_ms.R : Lines (Big/Small) compensation correction.
Files used:
    transcript_id_FPKM.csv
    Gene_position.csv
----------------
Script_GO-enrichment_revised_genomic_ms.R : GO enrichment analysis using the TopGO R package and GO annotations based on Blast2GO.
Files used:
    blast2go_go_table_TopGO.txt
    DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt (or L2/L1)
----------------
Script_heatmap_sex-biased_L3-specific_braker_revised_ms.R : Script to generate the heatmaps in Figure 2
Files used:
    L3-spe_male-biased.csv
    transcript_id_TPM_revised_ms.csv
----------------
Script_interaction_plot_sex-biased_expression_all_legs_revised_ms.R
Files used:
    DHE_genes_MvsF_Big&Small_L3_DESeq2_line_correction.txt
    transcript_id_TPM_revised_ms.csv
----------------
Script_leg_length_between_lines.R : PCA analysis of leg length differences between samples
Files used:
    Measures_leg-lengths_lines.csv
----------------
Script_PCA_revised_genomic_ms.R : PCA analysis of leg samples based on read counts
Files used:
    gene_count_matrix.csv
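
Note on sample labels
---------------------
The sample labels given as examples under "Datasets description" (L3.M.B.R1 in gene_count_matrix.csv, L3_M_B_R1 in the FPKM/TPM tables) encode leg, sex, line and replicate. The R sketch below shows one possible way to decode them into a table. It is not part of the repository: the function name parse_sample_name is hypothetical, and the codes "F" (Females) and "S" (Small line) are assumed by analogy with the documented "M" and "B".

```r
# Hypothetical helper (not part of the repository): decode sample labels
# such as "L3.M.B.R1" or "L3_M_B_R1" into leg, sex, line and replicate.
parse_sample_name <- function(sample_names) {
  # Split on either "." or "_" so that both naming styles are handled.
  parts <- strsplit(sample_names, "[._]")
  # Every label is expected to have exactly four fields.
  stopifnot(all(lengths(parts) == 4))
  fields <- do.call(rbind, parts)
  data.frame(
    sample    = sample_names,
    leg       = fields[, 1],
    sex       = fields[, 2],
    line      = fields[, 3],
    replicate = fields[, 4],
    stringsAsFactors = FALSE
  )
}

# Example labels; "F" and "S" are assumed codes (see note above).
samples <- parse_sample_name(c("L3.M.B.R1", "L1_F_S_R2"))
print(samples)
```

A table of this shape could serve as sample metadata when grouping columns by leg, sex or line, provided the real column names follow the same four-field pattern.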