Investigating the implications of the MC4R gene variants on health disparities related to caloric intake and body composition in African American women
Data files
Dec 22, 2025 version files (122.78 KB total)
- Raw_Data_1__Subjects_001_050_Rev_Dryad.csv (81.89 KB)
- Raw_Data_1__Subjects_001_050_Rev_Dryad.xlsx (31.96 KB)
- README.md (8.93 KB)
Abstract
This study aimed to assess the frequency distribution of Melanocortin 4 Receptor (MC4R) gene variants and their relationship to caloric intake and body composition in African American women, to understand how these factors influence eating habits and body composition. Fifty adults aged 18-70 from the Southeastern United States participated in this cross-sectional study. MC4R variants were identified using targeted sequencing on an Illumina platform, with genomic DNA (gDNA) isolated from buccal cells. Caloric intake was estimated using Nutritionist-Pro software based on responses to the modified Diet History Questionnaire II (DHQ II), and body composition was measured with the BOD POD instrument.
Dataset DOI: 10.5061/dryad.n02v6wxbb
Description of the data and file structure
Files and variables
File: Raw_Data_1__Subjects_001_050_Rev_Dryad.xlsx
Description: This dataset contains de-identified targeted next-generation sequencing data from human participants, covering variants of the melanocortin 4 receptor (MC4R) gene. Fifty adults aged 18-70 from the Southeastern United States participated in this cross-sectional study. MC4R variants were identified using targeted sequencing on an Illumina platform, with genomic DNA (gDNA) isolated from buccal cells.
File: Raw_Data_1__Subjects_001_050_Rev_Dryad.csv
Description: This is the same data in .csv format.
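Before analysis, it can help to confirm that the .csv header matches the documented variables. The following is a minimal sketch using only the Python standard library; the column subset, the file path in the commented usage, and the `utf-8-sig` encoding are assumptions, and the demo row is made up for illustration.

```python
import csv
import io

# A subset of the column names documented in this README's variable list.
EXPECTED_COLUMNS = ["Subject", "Chr", "Start", "End", "Ref", "Alt", "GT", "AD", "DP", "GQ"]

def missing_columns(handle, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.reader(handle)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]

# Usage with the real file (path assumed relative to the working directory):
# with open("Raw_Data_1__Subjects_001_050_Rev_Dryad.csv",
#           newline="", encoding="utf-8-sig") as fh:
#     print(missing_columns(fh))

# Self-contained demo with a made-up row that lacks the GQ column.
demo = io.StringIO(
    'Subject,Chr,Start,End,Ref,Alt,GT,AD,DP\n'
    '001,chr18,1,1,A,G,0/1,"10,10",20\n'
)
demo_missing = missing_columns(demo)
```

An empty list from `missing_columns` means every checked column is present; any names returned point to a header mismatch worth inspecting by hand.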
Data Coding Definitions
- Null (Variant Not Detected): Indicates that the genetic variant of interest was actively assessed and not detected in the subject’s genomic data using the specified analytical methods. This reflects a true negative finding, not missing information.
- Not Applicable: Indicates that the subject does not carry the relevant gene or genomic region in which the variant would occur; therefore, assessment of the variant is biologically or analytically inapplicable for this individual.
- Not Available (Data Unavailable): Indicates that information regarding the genetic variant could not be retrieved from the referenced databases or data sources at the time of analysis. This reflects missing or inaccessible data, not biological absence.
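Because the three codes carry different meanings, they should not be collapsed into a single "missing" value when the file is read. The sketch below shows one way to keep them apart. It assumes the cells contain the literal strings "Null", "Not Applicable", and "Not Available" (the exact spellings should be checked against the file), and the sample rows are invented for illustration.

```python
import csv
import io

# Assumed spellings of the three codes, lowercased for matching.
CODE_MEANINGS = {
    "null": "not_detected",              # assessed and absent (true negative)
    "not applicable": "not_applicable",  # gene or region not carried by the subject
    "not available": "unavailable",      # annotation could not be retrieved
}

def classify_cell(value):
    """Return the coding category for one cell, or 'value' for real data."""
    return CODE_MEANINGS.get(value.strip().lower(), "value")

def count_codes(rows, column):
    """Count how often each category occurs in one column."""
    counts = {}
    for row in rows:
        category = classify_cell(row[column])
        counts[category] = counts.get(category, 0) + 1
    return counts

# Made-up sample rows, not taken from the dataset.
SAMPLE = "Subject,CLNSIG\n001,Null\n002,Not Available\n003,Benign\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
sample_counts = count_codes(rows, "CLNSIG")
```

Reading the cells as plain strings, as `csv.DictReader` does, avoids the automatic conversion of null-like text to missing values that some spreadsheet and data-frame tools apply.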
Variables
- Subject: Three-digit participant code
- Priority: predicted MC4R variant of clinical/biological significance
- Quality: base quality scores and variant caller confidence metrics
- Location: location of the variant within the MC4R gene
- Mut_type: mutation type (single nucleotide polymorphism)
- Chr: Chromosome in which the variant is located
- Start: start position of the mutation
- End: end position of the mutation
- Ref: base in the reference genome
- Alt: alternate (mutated) base
- Vcf_mut: mutation written in Variant Call Format (VCF) allele notation
- GT: genotype
- AD: Allelic Depth
- DP: Read Depth
- GQ: Genotype quality value
- FRE: mutation frequency
- Func.refGene: functional classification of the variant
- Gene.refGene: gene symbol
- GeneDetail.refGene: gene-level annotation where the mutation is located
- ExonicFunc.refGene: functional type of the mutation at the protein-coding level (for example, synonymous or nonsynonymous)
- AAChange.refGene: amino acid changes resulting from the variant
- Func.HGVS: variant description following HGVS nomenclature
- AAChange.HGVS: amino acid changes following HGVS nomenclature
- cytoBand: Cytogenetic band location of the variant on the chromosome
- InterVar: Clinical interpretation of the variant using the InterVar tool
- ACMG: ACMG/AMP evidence codes assigned to the variant
- wgRna: Annotation indicating if the variant overlaps a known RNA element
- genomicSuperDups: indicates whether the variant falls within a segmental duplication
- rmsk: identifies repetitive DNA elements overlapping the variant
- gwasCatalog: Shows whether the variant was reported in prior GWAS studies
- avsnp150: dbSNP identifier (build 150) assigned to the variant
- 1000g2015aug_all: Allele frequency of the variant in the total population from the 1000 Genomes Project (August 2015 release)
- 1000g2015aug_eas: Allele frequency of the variant in the East Asian population from the 1000 Genomes Project (August 2015 release)
- esp6500siv2_all: Allele frequency of the variant in the NHLBI Exome Sequencing Project (ESP6500, SI-V2)
- ExAC_ALL: Alternate allele frequency of the variant in the Exome Aggregation Consortium (ExAC) database across all populations
- ExAC_EAS: Alternate allele frequency of the variant in the East Asian population within the ExAC database
- gnomAD_exome_AF_raw: Raw alternate allele frequency of the variant in the Genome Aggregation Database (gnomAD) exome dataset across all populations
- gnomAD_exome_AF_eas: Alternate allele frequency of the variant in the East Asian population within the gnomAD exome dataset
- gnomAD_genome_AF_raw: Raw alternate allele frequency of the variant in the gnomAD whole-genome dataset across all populations
- gnomAD_genome_AF_eas: Alternate allele frequency of the variant in the East Asian population within the gnomAD whole-genome dataset
- ICGC_Id: Identifier assigned to the variant in the ICGC database
- ICGC_Occurrence: Reported occurrence of the variant across cancer studies within the ICGC database
- iGeneTechDB: Annotation indicating whether the variant is recorded in the iGeneTech database
- CLNALLELEID: ClinVar allele identifier corresponding to the variant
- CLNDN: ClinVar disease name(s) associated with the variant
- CLNDISDB: Disease database names and identifiers for the ClinVar disease name(s)
- CLNREVSTAT: ClinVar review status for the variant
- CLNSIG: ClinVar clinical significance of the variant
- cosmic95: Presence of the variant in the COSMIC database
- HGMD: presence of the variant in the Human Gene Mutation Database
- dbscSNV_ADA_SCORE: score predicting the influence of the variant on splicing, computed with the AdaBoost algorithm
- dbscSNV_RF_SCORE: score predicting the influence of the variant on splicing, computed with the Random Forest algorithm
- SIFT_score: SIFT score predicting the effect of the mutation on protein function
- SIFT_pred: SIFT categorical prediction of whether the mutation affects protein function
- SIFT4G_score: SIFT score computed with the SIFT4G implementation
- SIFT4G_pred: SIFT prediction computed with the SIFT4G implementation
- Polyphen2_HDIV_score: PolyPhen-2 score for the impact of the mutation on the protein sequence, based on the HumanDiv (HDIV) database
- Polyphen2_HDIV_pred: PolyPhen-2 prediction result based on the HumanDiv (HDIV) database
- Polyphen2_HVAR_score: PolyPhen-2 score for the impact of the mutation on the protein sequence, based on the HumanVar (HVAR) database
- Polyphen2_HVAR_pred: PolyPhen-2 prediction result based on the HumanVar (HVAR) database
- LRT_score: The prediction score of LRT software
- LRT_pred: The prediction result of LRT software
- MutationTaster_score: The prediction score of MutationTaster software
- MutationTaster_pred: The prediction result of MutationTaster software
- MutationAssessor_score: The prediction score of MutationAssessor software
- MutationAssessor_pred: The prediction result of MutationAssessor software
- FATHMM_score: The prediction score of FATHMM software
- FATHMM_pred: The prediction result of FATHMM software
- PROVEAN_score: The prediction score of PROVEAN
- PROVEAN_pred: The prediction result of PROVEAN
- VEST4_score: The score of the Variant Effect Scoring Tool (VEST4)
- MetaSVM_score: MetaSVM score, which combines the prediction scores of tools such as SIFT, PolyPhen, and MutationAssessor to predict the harmfulness of mutations
- MetaSVM_pred: the prediction result of MetaSVM
- ClinPred_score: ClinPred score predicting the harmfulness of non-synonymous mutations
- ClinPred_pred: the prediction result of ClinPred
- CADD_raw: raw CADD score for the harmfulness of SNVs and indels
- CADD_phred: PHRED-scaled CADD score
- fathmm-MKL_coding_score: Functional prediction score for coding variants from the FATHMM-MKL algorithm
- fathmm-MKL_coding_pred: Functional prediction result for coding variants from the FATHMM-MKL algorithm
- GERP++_NR: Neutral rate from GERP++; used in measuring evolutionary constraint
- GERP++_RS: Rejected substitution score from GERP++; measures evolutionary constraint
- OMIM_Inheritance: Inheritance annotation from the Online Mendelian Inheritance in Man (OMIM) database
- OMIM_Disease: Disease name from OMIM associated with the variant
- HGMD_Disease: Disease name(s) from HGMD associated with the variant
- GO: Gene Ontology terms associated with the gene containing the variant
- KEGG_pathway: KEGG pathway annotation for the gene containing the variant
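The GT, AD, and DP columns above are often the first fields inspected when checking a call. As a minimal sketch, assuming the file stores them in standard VCF style (GT as allele indices separated by "/" or "|", AD as comma-separated read counts with the reference allele first), they could be parsed as follows. The function names are hypothetical, and whether the resulting fraction matches the FRE column is not confirmed by this README.

```python
def parse_allelic_depth(ad):
    """Split an AD string such as '12,8' into integer read counts per allele."""
    return [int(part) for part in ad.split(",")]

def alt_allele_fraction(ad):
    """Fraction of reads supporting non-reference alleles, or None if no reads."""
    depths = parse_allelic_depth(ad)
    total = sum(depths)
    if total == 0:
        return None
    return sum(depths[1:]) / total  # the first entry is the reference allele

def zygosity(gt):
    """Classify a VCF-style genotype such as '0/1' or '1|1'."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "missing"
    if len(set(alleles)) > 1:
        return "heterozygous"
    return "homozygous_reference" if alleles[0] == "0" else "homozygous_alternate"
```

For example, `alt_allele_fraction("12,8")` divides the 8 alternate-supporting reads by the 20 total reads, and `zygosity("0/1")` reports a heterozygous call. Cells holding one of the coding definitions rather than a value should be filtered out before these helpers are applied.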
Code/software
Microsoft Excel
Access information
Other publicly accessible locations of the data:
- n/a
Data was derived from the following sources:
- n/a
Human subjects data
Participants signed an informed consent form, and each received a unique code. All direct and indirect identifiers were removed or generalized to prevent re-identification.
