Long-distance dependencies in birdsong syntax


Searcy, William; Soha, Jill; Peters, Susan; Nowicki, Stephen (2022), Long-distance dependencies in birdsong syntax, Dryad, Dataset,


Songbird syntax is generally thought to be simple, lacking in particular long-distance dependencies in which one element affects choice of another occurring considerably later in the sequence. Here we test for long-distance dependencies in the sequences of songs produced by song sparrows (Melospiza melodia). Song sparrows sing with eventual variety, repeating each song type in a consecutive series termed a “bout.” We show that in switching between song types, song sparrows follow a “cycling rule,” cycling through their repertoires in close to the minimum possible number of bouts. Song sparrows do not cycle in a set order but rather vary the order of song types from cycle to cycle. Cycling in a variable order strongly implies long-distance dependencies: choice of the next type must depend on the song types sung over the past cycle, in the range of 9-10 bouts. Song sparrows also follow a “bout length rule,” whereby the number of repetitions of a song type in a bout is positively associated with the length of the interval until that type recurs. This rule requires even longer-distance dependencies that cross one another; such dependencies are characteristic of more complex levels of syntax than previously attributed to non-human animals.
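
As an illustration of the terms above, the following minimal Python sketch collapses a sequence of song types into bouts and counts how many bouts each pass through the repertoire takes. The function names, the single-letter song-type labels, and the working definition of a cycle (a cycle ends at the bout that completes the repertoire, and the next cycle starts with the following bout) are assumptions made for this example; they are not taken from the authors' analysis.

```python
from itertools import groupby


def to_bouts(songs):
    """Collapse a song-type sequence into (song_type, repetitions) bouts."""
    return [(song_type, sum(1 for _ in run)) for song_type, run in groupby(songs)]


def cycle_lengths(bouts, repertoire):
    """Count the bouts needed to sing every type in the repertoire, cycle by cycle.

    Assumed definition: a cycle ends at the bout that completes the
    repertoire; a trailing, incomplete cycle is dropped.
    """
    target = set(repertoire)
    lengths, seen, count = [], set(), 0
    for song_type, _ in bouts:
        seen.add(song_type)
        count += 1
        if target <= seen:
            lengths.append(count)
            seen, count = set(), 0
    return lengths


# Hypothetical sequence for a bird with a three-type repertoire.
songs = ["A", "A", "A", "B", "B", "C", "A", "C", "C", "B"]
bouts = to_bouts(songs)
print(bouts)                                   # [('A', 3), ('B', 2), ('C', 1), ('A', 1), ('C', 2), ('B', 1)]
print(cycle_lengths(bouts, {"A", "B", "C"}))   # [3, 3]
```

In this toy sequence both cycles take three bouts, the minimum possible for three song types, while the order of types differs between cycles (A, B, C and then A, C, B).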


The main data set was collected in southwestern Crawford County, Pennsylvania, U.S.A. (41.6°N, 80.4°W) during May and June of 2019. Adult male song sparrows were recorded on their territories. Each subject was recorded on two mornings (6:00-11:00 AM), with a mean of 9.5 days (range 5-14) between recording sessions. Recordings were made using digital recorders (Marantz PMD 660 or 670) and cardioid microphones (Shure SM58) in parabolic reflectors (Sony PBR-330) at a sampling rate of 44.1 or 48 kHz.

The data are from 21 males that we judged to be sufficiently recorded. We attempted to obtain 300 songs in each recording session, as previous work with song sparrows has shown that a sample of this size virtually always captures the complete repertoire of a song sparrow, especially if the recordings are continuous. All our recordings were continuous, so that we could document the sequence in which song types were sung. For the 21 males included in the 2019 data set, we obtained a mean of 345 songs (range 295-500) for the first recording session and 316 songs (range 288-383) for the second. We retained one male whose song count for the first session (295) fell slightly below our initial criterion of 300, and two males whose song counts for the second session (both 288) fell slightly below that criterion. In all cases, all the song types recorded in the first session were recorded in the second session and vice versa, so all recording sessions were adequate to capture full repertoires and one or more full cycles.

To obtain more in-depth data on short-term repertoire usage, we recorded five additional males for all daylight hours in a 24-hour period using an autonomous recording unit, or ARU (Song Meter SM4, Wildlife Acoustics). These recordings were made in southwestern Crawford County between May 14 and 23, 2021, each from the middle of one morning to the middle of the next. The five males were first recorded in person as above to allow identification of each subject’s repertoire of song types (see electronic supplementary material).

We assigned recorded songs to song types using spectrograms made with Audacity software using a 256-point FFT and a Hanning window. We classified two songs as the same song type if they shared the same introductory phrase and half or more of all phrases [29]. Spectrograms of one or more renditions of each song type were printed (using Raven Pro software) to aid in classification. In previous work in this study population, observers blindly classifying songs to song types agreed on the correct classification in 97.7% of cases.
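
The classification rule can be written out as a small Python sketch. Classification in this study was done by observers working from spectrograms, so the code only illustrates the logic of the rule. It assumes each song is represented as a list of phrase labels with the introductory phrase first; the labels and the function name are hypothetical. "Half or more of all phrases" is interpreted here, as an assumption, as the shared distinct phrases making up at least half of the phrases in the song with more distinct phrases.

```python
def same_song_type(song_a, song_b):
    """Return True if two songs meet the (assumed) same-type criterion.

    Each song is a list of phrase labels; the first label is the
    introductory phrase.
    """
    if not song_a or not song_b:
        return False
    # Criterion 1: the introductory phrases must match.
    if song_a[0] != song_b[0]:
        return False
    # Criterion 2 (assumed reading): at least half of the distinct phrases
    # are shared, measured against the song with more distinct phrases.
    phrases_a, phrases_b = set(song_a), set(song_b)
    shared = phrases_a & phrases_b
    return len(shared) >= 0.5 * max(len(phrases_a), len(phrases_b))


# Hypothetical phrase labels.
print(same_song_type(["i1", "t1", "n1", "t2"], ["i1", "t1", "n2", "t2"]))  # True
print(same_song_type(["i1", "t1", "n1", "t2"], ["i2", "t1", "n1", "t2"]))  # False
```

The second pair shares three of four phrases but differs in the introductory phrase, so it fails the first criterion.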

Usage Notes

Missing values occur in Sheet 7 "Jaccard similarities" and are marked n/a. These values are missing because the male did not complete a second cycle, so a similarity value could not be calculated between cycles 1 and 2.
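
For readers working with this sheet, a minimal Python sketch of the Jaccard index, |A ∩ B| / |A ∪ B|, and of handling the n/a marker follows. The function names are hypothetical, and this note does not specify which items are compared between cycles, so the song-type transitions in the example are purely illustrative.

```python
def jaccard(items_a, items_b):
    """Jaccard index of two collections, or None if it cannot be computed."""
    if items_a is None or items_b is None:
        return None  # e.g. no second cycle was completed
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return None
    return len(a & b) / len(union)


def parse_cell(cell):
    """Convert a sheet cell to a float, mapping the 'n/a' marker to None."""
    text = str(cell).strip()
    return None if text.lower() == "n/a" else float(text)


# Illustrative items only: transitions between song types in two cycles.
cycle_1 = {("A", "B"), ("B", "C"), ("C", "D")}
cycle_2 = {("A", "B"), ("B", "D"), ("D", "C")}
print(jaccard(cycle_1, cycle_2))  # 0.2
print(jaccard(cycle_1, None))     # None

# Averaging hypothetical cell values while skipping missing ones.
values = [parse_cell(c) for c in ["0.25", "n/a", "0.5"]]
present = [v for v in values if v is not None]
print(sum(present) / len(present))  # 0.375
```

Mapping n/a to None rather than to 0 keeps a missing second cycle from being read as zero similarity.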

In Sheet 8 "2021 ARU song sequences," "null" is used to indicate a gap in a song sequence due to no song having been recorded for more than 15 minutes.
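
When analyzing these sequences, it may help to split each one at the "null" markers so that no bout or transition is counted across a recording gap. A minimal Python sketch, assuming the sequence has already been read into a list of strings (the function name and song-type labels are hypothetical):

```python
def split_on_gaps(sequence, gap_marker="null"):
    """Split a song sequence into continuous segments at each gap marker."""
    segments, current = [], []
    for item in sequence:
        if item == gap_marker:
            # Close the current segment; consecutive markers add nothing.
            if current:
                segments.append(current)
                current = []
        else:
            current.append(item)
    if current:
        segments.append(current)
    return segments


# Hypothetical song-type labels with two gaps.
sequence = ["A", "A", "B", "null", "B", "C", "null", "null", "A"]
print(split_on_gaps(sequence))  # [['A', 'A', 'B'], ['B', 'C'], ['A']]
```

Here the two "B" songs on either side of the first gap fall into separate segments rather than being merged into one apparent bout.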