High pitch sounds small for domestic dogs dataset


Korzeniowska, Anna; Simner, Julia; Root-Gutteridge, Holly; Reby, David (2022), High pitch sounds small for domestic dogs dataset, Dryad, Dataset,


Humans possess intuitive associations linking certain non-redundant features of stimuli – e.g., high-pitched sounds with small object-size (or similarly, low-pitched sounds with large object-size). This phenomenon, known as crossmodal correspondence, has been identified in humans across multiple senses. There is some evidence that non-human animals also form crossmodal correspondences, but the known examples are mostly limited to associations between the pitch of vocalisations and the size of callers. To investigate whether domestic dogs, like humans, show an abstract pitch-size association, we first trained dogs to approach and touch an object after hearing a sound emanating from it. Subsequently, we repeated the task but presented dogs with two objects differing in size, only one of which was playing a sound. The sound was either high- or low-pitched, thereby creating trials that were either congruent (high pitch from the small object; low pitch from the large object) or incongruent (the reverse). We found that dogs reacted faster on congruent than on incongruent trials. Moreover, their accuracy was at chance on incongruent trials, but significantly above chance on congruent trials. Our results suggest that non-human animals show abstract pitch-size correspondences, indicating that these correspondences may not be uniquely human but rather a sensory processing feature shared by other species.



Fifty domestic dogs of unrestricted breeds aged between 6 and 132 months (M=56.1, SD=36.6) were recruited through social media advertisements and word of mouth. We removed dogs that failed to complete training (n=8), or that failed to produce any useable testing trials (n=12; e.g., they were too distracted to perform the task). A further 111 trials from the remaining dogs were also removed (e.g., due to background noise or owner interference; see Electronic Supplementary Material (ESM) for full exclusion criteria). This left 133 trials from 30 dogs.


The training stimulus consisted of a plastic-coated 3-dimensional frustum-shaped object, painted blue to increase visual salience for the dogs. The frustum had a base diameter of 29cm and a volume of approximately 9000cm³. A JBL Flip 4 Bluetooth speaker was inserted via a small hole in the side of the object, and the remaining cavity was filled with insulating rockwool (Knauf Earthwool) to reduce any resonances from the speaker. The sound stimulus for training was a 2-second-long 430Hz (9.33 erb) pure tone created in PRAAT. The sound was played at 86dB (approximately 65dB measured at the dog’s position), and playback was controlled from a Motorola Razor phone connected to the speaker via Bluetooth. The dogs’ behaviour was recorded using 3 cameras: a GoPro Hero 7 camera positioned on the right side of the dog; a GoPro Max camera positioned in front of the dog; and a Sony Handycam camera positioned to the right (or left, depending on the position of the experimenter) and behind the dog. Additionally, a Sony Handycam camera was positioned in front of the dog, to the right or left of a partition screen that hid the experimenter. This camera was connected to an Acer LED 21.5” Monitor (KA220HQ) on which the experimenter could watch the behaviour of the dog while out of sight.

The testing stimuli were identical to the training stimuli in all ways other than the following. There were now two objects per trial, one big and one small (approximately 23000cm³ and 2300cm³ respectively). To ensure that our findings would hold across multiple shapes, we created three sets of objects: a pair of cylinders, a pair of cones, and a pair of cuboids – ensuring each dog saw one set only (e.g., a big and a small cuboid). There were now also two sound stimuli, one low pitch (150Hz, 4.26 erb) and one high pitch (900Hz, 14.4 erb). All other aspects of the stimuli and recordings were as in the training phase. The remaining materials were a partition screen, button-clicker, reward-tube, and rewards. The experimenter trained the dogs to touch the object using a button clicker and commercially available dog treats (see below). Treats were delivered via the plastic reward-tube attached to the partition screen, whose position and orientation are described below.
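The Hz-to-erb values quoted for the three tones can be checked with a short sketch. It assumes the conversion is the one Praat is understood to use for its hertz-to-ERB function, 11.17 ln((f + 312) / (f + 14680)) + 43; the source does not state the formula, and the function name `hertz_to_erb` is illustrative only. Under that assumption the computed values agree with the reported ones to within about 0.1 erb.

```python
import math


def hertz_to_erb(hz: float) -> float:
    """Convert a frequency in Hz to ERB-rate.

    Assumption: this is the formula used by Praat's hertz-to-ERB
    conversion; the dataset description does not state it.
    """
    return 11.17 * math.log((hz + 312.0) / (hz + 14680.0)) + 43.0


# Frequencies taken from the text: training tone, low test tone, high test tone.
for label, hz in [("training", 430.0), ("low", 150.0), ("high", 900.0)]:
    print(f"{label}: {hz:.0f} Hz = {hertz_to_erb(hz):.2f} erb")
```

On this scale the low and high test tones sit roughly 5 erb below and 5 erb above the training tone, so the two test pitches are about equally far from the trained one.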


Dogs were brought into the testing room at the University of Sussex by their owners and given approximately 5 minutes to acclimatise, while owners signed the consent form and completed a questionnaire about their dogs (stating their breed, age, sex, neuter status, and weight). Then the owners were asked to sit in a chair facing a partition screen, which had a reward-tube attached to it. Each dog was put into a “sit” position directly in front of the owner to prevent the owner from accidentally cueing the dog, and owners were asked not to interact with their dogs during the experiment. Between the dog and the screen was the training object.

When training began, the experimenter used “shaping” with the aid of a clicker to train the dogs to target the training object with their nose or paw. This technique involves waiting for the desired behaviour and quickly marking and rewarding it when it occurs. This method was chosen to avoid additional prompts or social cues that would later have to be faded, which could prolong the training process. The primary reinforcer (food treat) was delivered through a reward-tube attached to the screen positioned behind the object (facing the dog). The screen also allowed the experimenter to hide behind it during testing to avoid influencing the dogs’ behaviour. After a correct response was established (i.e., the dog consistently targeting the object and touching it with nose or paw), a sound playing from the speaker within the object was introduced. If the dog moved too soon (before the sound was played), the experimenter prevented the dog from touching the object by blocking it with her hand or lifting the object up. After each repetition of the behaviour, the dog was returned to the “sit” position in front of the owner and another trial was initiated. As the training progressed, the owner and dog were slowly moved backwards to increase the distance the dog had to travel to touch the object, and the experimenter gradually retreated behind the room-dividing screen. Training continued until the owner’s chair was in position by the wall opposite the screen (350cm away from the screen) and the dog was able to sit and wait in an appropriate position for the next sound to be played by the experimenter (now hidden out of view behind the screen). The appropriate position for the dog was to be seated in front of the owner but behind an orange line drawn on the floor. A final 6 trials (object to the left of the dog x2, object to the right of the dog x2, object in the middle x2) had to be completed with 100% accuracy before training ended and the experiment moved on to the testing phase. If the dog did not reach 100% accuracy, another block of 6 training trials was completed.


Testing took place immediately after successful training was complete, and the dog and owner remained in the position they had reached at the end of training. All elements of testing were identical to training except now two objects were used, again placed in front of the screen, one to the left and one to the right, equidistant from the reward delivery tube, 90cm in front of the screen, 124 cm away from each other, and 150cm away from the dog. The shape of the objects was randomly assigned to each dog (i.e., a pair of cylinders or cones or cuboids). Only one of the objects played a sound on each trial. Dogs were presented with 8 trials in total, crossing the three variables shown in Table 1. This crossing effectively means every trial presented two objects (big and small) one of which was playing a sound (high or low pitched), with objects switching on the left or right of the space (e.g., Trial 1 big-on-left; Trial 2 big-on-right). The presentation of two objects and two sounds was necessary because the pitch-size correspondence appears to be relative in humans, rather than absolute. In other words, we hypothesised that dogs must see two objects to know that one is ‘big’ and must hear two sounds to know that one is ‘high’. As with training, each trial began with the experimenter positioning the stimuli (i.e., moving the small/big objects left/right, according to the trial type) and setting the dog up in a sitting position in front of (and facing away from) its owner. The experimenter then retreated behind the screen, from where she could manipulate the sound media. She then activated the sound, which was the signal for the dog to approach and touch the sound-making object. When the dog touched the object, the experimenter marked the behaviour with a click from the clicker and deposited a small treat via the tube as a reward for the dog. The dog was then repositioned by the experimenter in preparation for the next trial. 
If the dog did not respond at all within 40s from the start of the sound stimulus, the sound was repeated. If the dog still did not respond, the trial was recorded as “no response”, and the next trial began. If the dog moved out of position but made no choice (e.g., went to investigate an area of the testing room where there was no object), this was also recorded as “no response”. If the dog showed signs of stress such as excessive panting, pacing, comfort-seeking from the owner, excessive whimpering, or avoidance of the experimenter, the experiment was terminated.


Of each dog’s eight trials, four were congruent (small object playing high pitch sound; or big object playing low pitch sound; once with small object on the left and once with small object on the right). The remaining four trials were incongruent (i.e., small/low on left and right, and big/high on left and right). The combination of trials was pseudo-randomly assigned to each dog, to avoid any ordering effect.
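The eight-trial design can be sketched as a full crossing of three two-level variables. The variable names below (`sounding_size`, `pitch`, `small_side`) are inferred from the text rather than taken from Table 1, and the plain seeded shuffle is only an assumption standing in for the pseudo-random ordering, whose actual constraints are not described here.

```python
import itertools
import random

SIZES = ("small", "big")      # size of the object that plays the sound
PITCHES = ("high", "low")     # pitch of the sound
SIDES = ("left", "right")     # side on which the small object is placed


def is_congruent(sounding_size: str, pitch: str) -> bool:
    """Congruent = small object with high pitch, or big object with low pitch."""
    return (sounding_size, pitch) in {("small", "high"), ("big", "low")}


def build_trials() -> list[dict]:
    """Cross the three variables to obtain the 2 x 2 x 2 = 8 trial types."""
    return [
        {
            "sounding_size": size,
            "pitch": pitch,
            "small_side": side,
            "congruent": is_congruent(size, pitch),
        }
        for size, pitch, side in itertools.product(SIZES, PITCHES, SIDES)
    ]


def shuffled_trials(seed: int) -> list[dict]:
    """Return the 8 trials in a reproducible shuffled order (illustrative only)."""
    rng = random.Random(seed)
    trials = build_trials()
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    for trial in shuffled_trials(seed=1):
        print(trial)
```

Crossing the variables this way yields four congruent and four incongruent trials, with each size and pitch combination appearing once with the small object on the left and once with it on the right.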

Video coding

Video recordings of dog behaviour were edited in iMovie (Apple Inc.) to replace the sound stimulus for each trial with a ringing tone, allowing the behaviour of the dogs to be coded blind. Subsequently, ATK analysed the recordings using SportsCode Gamebreaker version 10, and HRG analysed 25% of trials.