Gene expression differences in muscle and adipose tissue help explain variation in meat tenderness across USDA carcass grades in beef and fat classes in sheep
Data files
May 28, 2026 version files 76.66 MB
- Combined_DEG_all_beef_cleaned.tsv (4.37 MB)
- Combined_DEG_all_sheep_cleaned.tsv (5.04 MB)
- fpkm_genename-2.csv (10.54 MB)
- fpkm_genename.csv (8.06 MB)
- fpkm_group-2.csv (9.76 MB)
- fpkm_group.csv (8.91 MB)
- fpkm_sample-2.csv (7.50 MB)
- fpkm_sample.csv (6.70 MB)
- readcount_genename-2.csv (6.08 MB)
- readcount_genename.csv (4.02 MB)
- readcount-2.csv (3.01 MB)
- readcount.csv (2.66 MB)
- README.md (15.29 KB)
Abstract
This dataset comprises multi-modal genomic and phenotypic data from a comparative study of 30 livestock animals (15 Angus-cross beef steers and 15 Columbia-cross lamb wethers) raised to produce divergent carcass quality grades and fatness levels. The data were collected to investigate how gene expression in muscle and adipose tissues correlates with meat tenderness and carcass quality in beef versus lamb. Key contents include high-quality RNA sequencing (RNA-seq) data from longissimus muscle and subcutaneous adipose tissue (two tissues per animal; 60 samples total), alongside detailed carcass measurements for each individual. Recorded phenotypic traits encompass Warner–Bratzler shear force (WBSF) values as an objective measure of meat tenderness, hot carcass weight, backfat thickness, ribeye area, marbling score (beef), and flank fat streaking (lamb). All RNA samples had RIN ≥ 8.0, ensuring high-quality input material. The RNA-seq libraries were prepared with Illumina TruSeq chemistry and sequenced on a NovaSeq 6000 (150 bp paired-end), followed by quality checking and trimming (FastQC/Trimmomatic), alignment with STAR to the cattle (Bos taurus ARS-UCD1.2) or sheep (Ovis aries Oar_v4.0) reference genome, and gene-level quantification (featureCounts). Normalized expression values (FPKM) and raw counts are provided for each sample.
Overview
This dataset contains RNA sequencing (RNA-seq) data and differential gene expression (DEG) results from sheep and beef cattle adipose and muscle tissues. The study investigates transcriptomic differences associated with variation in fat deposition and quality traits across biological groups.
The dataset includes:
- Processed differential expression results
- Full gene expression matrices (raw counts and normalized values)
Data Availability
Raw RNA-seq data have been deposited in the NCBI Sequence Read Archive (SRA):
- Sheep RNA-seq data: BioProject accession PRJNA1460905
- Beef RNA-seq data: BioProject accession PRJNA1462762
Associated BioSample and Run accessions are available within each BioProject.
Processed data supporting the findings of this study are available in this Dryad repository.
File Description
1. Differential Gene Expression Results
- Combined_DEG_all_sheep_cleaned.tsv
  - Combined differential expression results for sheep
  - Includes all genes identified as differentially expressed across comparisons
  - Contains:
    - gene identifiers and annotations
    - log2 fold change
    - p-values and adjusted p-values
    - sample-level expression values
    - comparison labels
- Combined_DEG_all_beef_cleaned.tsv
  - Combined differential expression results for beef cattle
  - Same structure and content as the sheep DEG file
2. Gene Expression Matrices
Sheep
readcount_genename.csv – Raw read counts with gene annotations
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- OA* columns (e.g., OA9618, OA9643, OA9673, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual adipose tissue sample from sheep.
- Unit: raw read counts (integer values)
- OM* columns (e.g., OM9618, OM9628, OM9673, etc.):
- Sample-level raw read counts for individual muscle tissue samples from sheep.
- Unit: raw read counts (integer values)
- ROA*, ROM* columns (e.g., ROA9656, ROM9630):
- Sample-level raw read counts (integer values).
- These represent additional or replicate samples within adipose (ROA) and muscle (ROM) tissue groups.
- gene_name: Gene symbol or common gene name annotation
- gene_chr: Chromosome on which the gene is located
- gene_start: Genomic start position of the gene (base pairs)
- gene_end: Genomic end position of the gene (base pairs)
- gene_strand: DNA strand orientation (+ or −)
- gene_length: Length of the gene in base pairs
- gene_biotype: Gene classification (e.g., protein-coding, lncRNA, pseudogene)
- gene_description: Functional annotation or descriptive summary of the gene
- tf_family: Transcription factor family classification (if applicable; blank if not a transcription factor)
Notes:
These raw count data were used as input for differential gene expression analyses described in the DEG result files.
readcount.csv – Raw read counts (no annotation)
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- OA* columns (e.g., OA9630, OA9618, OA9643, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual adipose tissue sample from sheep.
- Unit: raw read counts (integer values)
- OM* columns (e.g., OM9621, OM9618, OM9624, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual muscle tissue sample from sheep.
- Unit: raw read counts (integer values)
- ROA*, ROM* columns (e.g., ROA9656, ROM9630):
- Sample-level raw read counts (integer values).
- These represent additional or replicate samples within adipose (ROA) and muscle (ROM) tissue groups.
Notes:
This file contains raw count data only (no gene annotation fields).
Annotated versions of these data, including gene metadata (chromosome location, gene description, etc.), are provided in the file readcount_genename.csv
These raw count data were used as input for downstream normalization and differential gene expression analyses.
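For readers who want to see how raw counts relate to the FPKM files, the following is a minimal sketch of the textbook FPKM definition, using the `gene_length` values from readcount_genename.csv. It is an illustration only: the README does not state which tool or gene-length definition produced the deposited FPKM values, so recomputed numbers may not match them exactly. The function name and toy values are hypothetical.

```python
def fpkm(counts, lengths):
    """Textbook FPKM for one sample column.

    counts  -- dict of gene_id -> raw read count for a single sample
    lengths -- dict of gene_id -> gene length in base pairs (gene_length)
    """
    total = sum(counts.values())  # total counted reads in this sample
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # count * 1e9 / (length_bp * total) == count / (length_kb * total_millions)
    return {
        gene: count * 1e9 / (lengths[gene] * total)
        for gene, count in counts.items()
    }


# Toy values for illustration; these are not taken from the dataset.
toy_counts = {"geneA": 500, "geneB": 1500}
toy_lengths = {"geneA": 1000, "geneB": 3000}
toy_fpkm = fpkm(toy_counts, toy_lengths)
```

In the toy example, geneB has three times the reads of geneA but is also three times as long, so both genes receive the same FPKM.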
fpkm_sample.csv – Sample-level normalized expression (FPKM)
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- OA* columns (e.g., OA9618, OA9643, OA9673, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual adipose tissue sample from sheep.
- OM* columns (e.g., OM9618, OM9628, OM9673, etc.):
- Sample-level normalized gene expression values (unit: FPKM).
- Each column represents an individual muscle tissue sample from sheep.
- ROA*, ROM* columns (e.g., ROA9656, ROM9630):
- Sample-level normalized gene expression values (unit: FPKM).
- These represent additional or replicate samples within the adipose (ROA) and muscle (ROM) tissue groups.
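The sample column names encode species, tissue, and replicate status, so they can be decoded programmatically. The sketch below assumes every sample column follows the pattern described in this README (optional `R`, then `O` or `B`, then `A` or `M`, then a numeric animal ID); the function and key names are hypothetical.

```python
import re

# Optional R prefix, species letter, tissue letter, numeric animal ID.
SAMPLE_ID = re.compile(r"^(R?)([OB])([AM])(\d+)$")
SPECIES = {"O": "sheep", "B": "beef"}
TISSUE = {"A": "adipose", "M": "muscle"}


def parse_sample_id(column):
    """Decode a sample column name; return None for non-sample columns."""
    match = SAMPLE_ID.match(column)
    if match is None:
        return None
    r_prefix, species, tissue, animal = match.groups()
    return {
        "species": SPECIES[species],
        "tissue": TISSUE[tissue],
        "animal": animal,
        # The README describes R-prefixed columns as replicate or additional samples.
        "replicate_or_additional": r_prefix == "R",
    }


example = parse_sample_id("ROA9656")
```

Columns such as `gene_id` or `gene_name` do not match the pattern and return `None`, which makes the function usable as a filter over a file header.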
fpkm_group.csv – Group-level summarized FPKM values
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
Group-level columns:
- Two_Adp: Mean normalized expression (unit: FPKM) for adipose tissue samples in group "Two"
- Thre_Adp: Mean normalized expression (FPKM) for adipose tissue samples in group "Three"
- One_Adp: Mean normalized expression (FPKM) for adipose tissue samples in group "One"
- Two_Mus: Mean normalized expression (FPKM) for muscle tissue samples in group "Two"
- Thre_Mus: Mean normalized expression (FPKM) for muscle tissue samples in group "Three"
- One_Mus: Mean normalized expression (FPKM) for muscle tissue samples in group "One"
Individual sample columns (e.g., OM9621, OA9630, ROA9656, ROM9630, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual biological sample.
- Naming convention:
- O = species designation for sheep (Ovis aries); group membership (One, Two, Three) is defined in the Experimental Design Summary
- A = adipose tissue samples
- M = muscle tissue samples
- R prefix (e.g., ROA, ROM) indicates replicate or additional samples
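The group-level columns are described as means of the sample-level FPKM values, which can be sketched as follows. This README does not list which samples belong to which group, so the sample-to-group assignment and the sample IDs in the example are hypothetical placeholders.

```python
from statistics import mean


def group_means(row, groups):
    """Mean FPKM per group for one gene.

    row    -- dict of sample column -> FPKM value for a single gene
    groups -- dict of group label -> list of sample columns in that group
    """
    return {
        label: mean(row[sample] for sample in samples)
        for label, samples in groups.items()
    }


# Hypothetical sample IDs and group assignment, for illustration only.
toy_groups = {"One_Adp": ["OA0001", "OA0002"], "Two_Adp": ["OA0003", "OA0004"]}
toy_row = {"OA0001": 2.0, "OA0002": 4.0, "OA0003": 10.0, "OA0004": 20.0}
toy_means = group_means(toy_row, toy_groups)
```

Comparing recomputed means against the deposited group columns is one way to confirm a sample-to-group assignment once it is known.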
fpkm_genename.csv - FPKM with gene annotations
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- OA*, OM*, ROA*, ROM* columns (e.g., OA9618, OM9628, ROA9656, ROM9630, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual biological sample.
- Sample naming conventions:
- O = species designation for sheep (Ovis aries)
- A = adipose tissue samples
- M = muscle tissue samples
- R prefix (e.g., ROA, ROM) indicates replicate or additional samples within the same tissue/group structure
- gene_name: Gene symbol or common gene name annotation
- gene_chr: Chromosome on which the gene is located
- gene_start: Genomic start position of the gene (base pairs)
- gene_end: Genomic end position of the gene (base pairs)
- gene_strand: DNA strand orientation (+ or −)
- gene_length: Length of the gene in base pairs
- gene_biotype: Gene classification (e.g., protein-coding, lncRNA, pseudogene)
- gene_description: Functional annotation or descriptive summary of the gene
- tf_family: Transcription factor family classification (if applicable; blank if not a transcription factor)
Beef
readcount_genename-2.csv - Raw read counts with gene annotations
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- BM* columns (e.g., BM9108, BM9159, BM9034, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual muscle tissue sample from beef cattle.
- Unit: raw read counts (integer values)
- BA* columns (e.g., BA9041, BA9034, BA9032, etc.):
- Sample-level raw read counts for individual adipose tissue samples from beef cattle.
- Unit: raw read counts (integer values)
- gene_name: Gene symbol or common gene name annotation
- gene_chr: Chromosome on which the gene is located
- gene_start: Genomic start position of the gene (base pairs)
- gene_end: Genomic end position of the gene (base pairs)
- gene_strand: DNA strand orientation (+ or −)
- gene_length: Length of the gene in base pairs
- gene_biotype: Gene classification (e.g., protein-coding, lncRNA, pseudogene)
- gene_description: Functional annotation or descriptive summary of the gene
- tf_family: Transcription factor family classification (if applicable; blank if not a transcription factor)
Notes:
- These raw count data were used as input for differential gene expression analyses described in the DEG result files.
readcount-2.csv - raw gene expression count matrix without gene annotations
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- BM* columns (e.g., BM9108, BM9159, BM9034, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual muscle tissue sample from beef cattle.
- Unit: raw read counts (integer values)
- BA* columns (e.g., BA9041, BA9034, BA9032, etc.):
- Sample-level raw read counts representing the number of sequencing reads mapped to each gene.
- Each column corresponds to an individual adipose tissue sample from beef cattle.
- Unit: raw read counts (integer values)
Notes:
- This file contains raw count data only (no gene annotation fields).
- Annotated versions of these data, including gene metadata (chromosome location, gene description, etc.), are provided in the file readcount_genename-2.csv
- These raw count data were used as input for downstream normalization and differential gene expression analyses.
fpkm_sample-2.csv - sample-level gene expression matrix
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- BM* columns (e.g., BM9108, BM9159, BM9034, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual muscle tissue sample from beef cattle.
- BA* columns (e.g., BA9091, BA9210, BA9147, BA9034, etc.):
- Sample-level normalized gene expression values (unit: FPKM).
- Each column represents an individual adipose tissue sample from beef cattle.
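Some downstream tools expect TPM rather than FPKM. A minimal sketch of the standard rescaling is given here, assuming the input is the complete FPKM column for one sample (all genes); the function name and toy values are hypothetical.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_values.values())
    if total == 0:
        return {gene: 0.0 for gene in fpkm_values}
    return {gene: value / total * 1e6 for gene, value in fpkm_values.items()}


# Toy values for illustration; these are not taken from the dataset.
toy_tpm = fpkm_to_tpm({"geneA": 10.0, "geneB": 30.0})
```

Because the rescaling depends on the column total, it should be applied to the full gene set of a sample and not to a filtered subset.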
fpkm_group-2.csv - gene expression with group-level summaries and individual samples
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
Group-level columns:
- Stan_Mus: Mean normalized expression (unit: FPKM) for muscle tissue samples in the "Standard" carcass grade group
- Stan_Adp: Mean normalized expression (FPKM) for adipose tissue samples in the "Standard" group
- Sel_Mus: Mean normalized expression (FPKM) for muscle tissue samples in the "Select" group
- Sel_Adp: Mean normalized expression (FPKM) for adipose tissue samples in the "Select" group
- Cho_Mus: Mean normalized expression (FPKM) for muscle tissue samples in the "Choice" group
- Cho_Adp: Mean normalized expression (FPKM) for adipose tissue samples in the "Choice" group
Individual sample columns (e.g., BM9108, BA9041, BA9034, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual biological sample.
- Sample naming conventions:
- BM = muscle tissue samples from beef cattle
- BA = adipose tissue samples from beef cattle
- gene_name: Gene symbol or common gene name annotation
- gene_chr: Chromosome on which the gene is located
- gene_start: Genomic start position of the gene (base pairs)
- gene_end: Genomic end position of the gene (base pairs)
- gene_strand: DNA strand orientation (+ or −)
- gene_length: Length of the gene in base pairs
- gene_biotype: Gene classification (e.g., protein-coding, lncRNA, pseudogene)
- gene_description: Functional annotation or descriptive summary of the gene
- tf_family: Transcription factor family classification (if applicable; blank if not a transcription factor)
fpkm_genename-2.csv - gene expression matrix with annotations
These files mirror the sheep dataset and provide equivalent expression data for beef samples.
Column Definitions:
- gene_id: Unique gene identifier (Ensembl gene ID where applicable)
- BM* / BA* columns (e.g., BM9108, BA9156, etc.):
- Sample-level normalized gene expression values (unit: FPKM – Fragments Per Kilobase of transcript per Million mapped reads).
- Each column represents an individual biological sample.
- Sample IDs beginning with:
- BM = muscle tissue samples from beef cattle
- BA = adipose tissue samples from beef cattle
- gene_name: Gene symbol or common gene name annotation
- gene_chr: Chromosome on which the gene is located
- gene_start: Genomic start position of the gene (base pairs)
- gene_end: Genomic end position of the gene (base pairs)
- gene_strand: DNA strand orientation (+ or −)
- gene_length: Length of the gene in base pairs
- gene_biotype: Gene classification (e.g., protein-coding, lncRNA, pseudogene)
- gene_description: Functional annotation or descriptive summary of the gene
- tf_family: Transcription factor family classification (if applicable; blank if not a transcription factor)
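When loading an annotated matrix, it is convenient to separate the numeric sample columns from the annotation columns listed above. The following sketch uses only the Python standard library and reads from an in-memory string so that it runs without the data files; the toy header, sample IDs, and gene row are hypothetical.

```python
import csv
import io

ANNOTATION_COLUMNS = {
    "gene_name", "gene_chr", "gene_start", "gene_end", "gene_strand",
    "gene_length", "gene_biotype", "gene_description", "tf_family",
}


def read_matrix(handle):
    """Split an annotated expression CSV into numeric values and annotations."""
    reader = csv.DictReader(handle)
    value_cols = [
        col for col in reader.fieldnames
        if col != "gene_id" and col not in ANNOTATION_COLUMNS
    ]
    expression, annotation = {}, {}
    for row in reader:
        gene = row["gene_id"]
        # Empty cells are kept as None instead of being coerced to zero.
        expression[gene] = {
            col: float(row[col]) if row[col] != "" else None for col in value_cols
        }
        annotation[gene] = {col: row[col] for col in ANNOTATION_COLUMNS if col in row}
    return value_cols, expression, annotation


# Toy file content; real use would pass open("fpkm_genename-2.csv", newline="").
toy_csv = io.StringIO(
    "gene_id,BM0001,BA0001,gene_name,gene_biotype,tf_family\n"
    "gene1,1.5,0.0,NAME1,protein_coding,\n"
)
cols, expr, annot = read_matrix(toy_csv)
```

Applied to fpkm_group-2.csv, the same function would also return the group-level columns (for example `Stan_Mus`) among the numeric columns, since they are not annotation fields.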
Experimental Design Summary
Sheep
- Tissues: Muscle (Mus) and Adipose (Adp)
- Groups: One, Two, Three (fat deposition classes)
- Comparisons:
- Two vs One
- Three vs Two
- Three vs One
Beef
- Tissues: Muscle and Adipose
- Groups: Standard (Stan), Select (Sel), Choice (Cho)
- Comparisons:
- Select vs Standard
- Choice vs Standard
- Choice vs Select
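For a quick descriptive look at these comparisons, a log2 ratio of group-mean FPKM values can be computed from the group-level columns. This is only a rough sketch: the fold changes in the DEG files come from the statistical analysis and will differ. The pseudocount of 1.0 and the reading of "A vs B" as A in the numerator are assumptions, and the toy values are hypothetical.

```python
import math

# (numerator, denominator) prefixes for the three beef comparisons.
BEEF_COMPARISONS = [("Sel", "Stan"), ("Cho", "Stan"), ("Cho", "Sel")]


def log2_fold_change(numerator, denominator, pseudocount=1.0):
    """Naive log2 ratio; the pseudocount avoids division by zero."""
    return math.log2((numerator + pseudocount) / (denominator + pseudocount))


def tissue_fold_changes(means, tissue):
    """means: dict of group column (e.g. 'Cho_Mus') -> mean FPKM for one gene."""
    return {
        f"{a}_vs_{b}": log2_fold_change(means[f"{a}_{tissue}"], means[f"{b}_{tissue}"])
        for a, b in BEEF_COMPARISONS
    }


# Toy group means for one gene; these are not taken from the dataset.
toy_lfc = tissue_fold_changes({"Stan_Mus": 1.0, "Sel_Mus": 3.0, "Cho_Mus": 7.0}, "Mus")
```

The sheep comparisons can be handled the same way by substituting the `One`, `Two`, and `Thre` column prefixes.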
Data Processing
- RNA-seq reads generated on an Illumina NovaSeq 6000 (150 bp paired-end)
- Reads quality-checked with FastQC, trimmed with Trimmomatic, and aligned with STAR to the Bos taurus ARS-UCD1.2 (beef) or Ovis aries Oar_v4.0 (sheep) reference genome
- Gene-level counts obtained using featureCounts
- Normalization performed using FPKM
- Differential expression analysis conducted using DESeq2
- Significance thresholds: adjusted p-value (FDR) < 0.05 and |log2 fold change| ≥ 1.0
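The threshold can be applied to rows of the combined DEG tables as sketched here. The column names `padj` and `log2FoldChange` are assumptions (this README does not list the exact headers of the cleaned TSV files) and should be adjusted to match the files. The optional fold-change cutoff of 1.0 follows the methods description for this dataset; the toy rows are hypothetical.

```python
def is_significant(row, padj_col="padj", lfc_col="log2FoldChange",
                   alpha=0.05, min_abs_lfc=1.0):
    """Return True if a DEG row passes the FDR and fold-change cutoffs.

    row -- dict of column name -> string value, e.g. from csv.DictReader
           opened with delimiter="\t"
    """
    padj, lfc = row.get(padj_col), row.get(lfc_col)
    # Missing values mean the gene was absent from that comparison.
    if padj in (None, "", "NA") or lfc in (None, "", "NA"):
        return False
    return float(padj) < alpha and abs(float(lfc)) >= min_abs_lfc


# Toy rows for illustration; these are not taken from the dataset.
toy_rows = [
    {"gene_id": "gene1", "padj": "0.001", "log2FoldChange": "-2.3"},
    {"gene_id": "gene2", "padj": "0.20", "log2FoldChange": "3.0"},
    {"gene_id": "gene3", "padj": "", "log2FoldChange": "1.5"},
]
significant = [row["gene_id"] for row in toy_rows if is_significant(row)]
```

Setting `min_abs_lfc=0.0` reduces the filter to the FDR cutoff alone.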
Notes
- Missing values indicate genes not present in a given comparison dataset
- Column names were modified to ensure compatibility with Dryad validation
- Sheep and beef datasets are structured consistently
Contact
For questions regarding this dataset, please contact:
Jennifer M. Thomson - jennifer.thomson@montana.edu
This study investigated the molecular basis of meat tenderness in beef and lamb by integrating phenotypic carcass measurements with transcriptomic profiling. Fifteen Angus-cross beef steers and fifteen Columbia-cross lamb wethers were raised under standardized conditions to produce carcasses spanning USDA quality grades (Standard, Select, Choice) and fatness classes, respectively. Longissimus muscle and subcutaneous adipose tissues were collected postmortem from each animal (n = 60 total samples). RNA was extracted from each tissue sample using standard protocols, with RNA integrity confirmed (RIN ≥ 8.0). RNA-seq libraries were prepared using the Illumina TruSeq Stranded mRNA Library Prep Kit and sequenced on an Illumina NovaSeq 6000 platform (150 bp paired-end reads). Raw reads were quality-checked (FastQC), trimmed (Trimmomatic), and aligned to the Bos taurus ARS-UCD1.2 or Ovis aries Oar_v4.0 reference genomes using STAR. Gene-level expression was quantified with featureCounts and normalized as transcripts per million (TPM). Differential gene expression analysis was conducted using DESeq2, with significance thresholds of FDR-adjusted p < 0.05 and |log₂ fold-change| ≥ 1.0. Functional enrichment and pathway analyses were performed using DAVID, KEGG, and Ingenuity Pathway Analysis (IPA), and protein–protein interaction networks were constructed using the STRING database.
