
Data for: Connecting voice profiling to genomics


Singh, Rita (2022), Data for: Connecting voice profiling to genomics, Dryad, Dataset,


This dataset contains a graph-based path-finding algorithm to detect chainlink genes for voice chains that connect the genes in chromosomal microdeletion regions to the FOXP2 gene. Instructions are within multiple README files included with the package. The code is set up to be used on Linux platforms.
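As a rough illustration of what graph-based path-finding toward FOXP2 can look like, the following Python sketch runs a breadth-first search over a small gene-interaction graph. It is not the code in this package (which is written in awk) and is not claimed to be the author's method. The edge list, the placeholder gene names (GENE_A through GENE_F), and the function names build_graph and shortest_chain are hypothetical, introduced only for this example. Under these assumptions, the genes lying strictly between the source gene and FOXP2 on the returned path play the role of candidate chainlink genes.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_chain(graph, source, target="FOXP2"):
    """Return one shortest path from source to target, or None if unreachable.

    Plain breadth-first search; neighbours are visited in sorted order so
    the result is deterministic when several shortest paths exist.
    """
    if source not in graph or target not in graph:
        return None
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            chain = []
            while node is not None:
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical toy edges; placeholder names, not real interaction data.
TOY_EDGES = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "FOXP2"),
    ("GENE_A", "GENE_C"),
    ("GENE_C", "GENE_D"),
    ("GENE_D", "FOXP2"),
    ("GENE_E", "GENE_F"),  # component with no route to FOXP2
]

graph = build_graph(TOY_EDGES)
chain = shortest_chain(graph, "GENE_A")
print(chain)        # ['GENE_A', 'GENE_B', 'FOXP2']
print(chain[1:-1])  # intermediate genes on the path: ['GENE_B']
```

In a real run, the edges would have to be derived from the pathway and gene databases described below, following the package's README files.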


This code package accompanies the manuscript titled: Connecting human voice profiling to genomics: A predictive algorithm for linking speech phenotypes to genetic microdeletion syndromes

Data and code summary

Two publicly available datasets were used to obtain the results reported in Table 1 of the manuscript. They are listed below.

1. The ConsensusPathDB (CPDB) biological pathways database

2. The HGNC human genome database

These databases are not provided in this package because of licensing restrictions on redistribution of the data. However, a comprehensive README file included in this package gives instructions on how to download the databases and how to process them with the code provided. Some sample intermediate files derived from these databases are included for the user's guidance and reference.

Following the instructions in the README file will produce the results reported in this paper, in full.

The code provided was written by the author of the corresponding manuscript and is meant for public use, free of any restrictions whatsoever from the author's side.

In addition, the code files provided in this package are commented in detail (in-line with the code) for the user's convenience.

Usage notes

The programs in this package are written in awk. The scripts that call them are expected to be run on a Linux/Unix platform within a tc-shell environment. A comprehensive README file is included within the code package. It contains instructions for running the code to reproduce the results in the referenced paper, along with screenshots of commands and sample outputs for the user's reference.


Army Research Office, Award: W911NF-20-D-0002

U.S. Army Futures Command