Data from: Learning the sound inventory of a complex vocal skill via an intrinsic reward
Abstract
Reinforcement learning (RL) is thought to underlie the acquisition of vocal skills like birdsong and speech, where sounding like one’s “tutor” is rewarding. But what RL strategy generates the rich sound inventories for song or speech? We find that the standard actor-critic model of birdsong learning fails to explain juvenile zebra finches’ efficient learning of multiple syllables. But when we replace a single actor with multiple independent actors that jointly maximize a common intrinsic reward, then birds’ empirical learning trajectories are accurately reproduced. Importantly, the influence of each actor (syllable) on the magnitude of global reward is competitively determined by its acoustic similarity to target syllables. This leads to each actor matching the target it is closest to, and occasionally, to the competitive exclusion of an actor from the learning process (i.e., the learned song). We propose that a competitive-cooperative multi-actor (MARL) algorithm is key for the efficient learning of the action inventory of a complex skill.
README: Learning the sound inventory of a complex vocal skill via an intrinsic reward
https://doi.org/10.5061/dryad.3r2280gpp
This dataset contains data from juvenile zebra finch pitch-shift learning experiments, together with MATLAB code for the analyses, modelling, and model inference.
Description of the data and file structure
All data files are in MATLAB's .mat format and include:
Song data files containing data for birds trained on matching:
- one new target that is shifted up or down by 2 semitones: .\data\semitones2\targets1
- two new targets that are shifted up and down by 2 semitones: .\data\semitones2\targets2
- one new target that is shifted up or down by 4 semitones: .\data\semitones4\targets1
Each of these three folders includes multiple subfolders, one for each bird (8, 8, and 6, respectively), each containing one file, data.mat. For each bird (data.BirdID), this file stores the pitch values and associated timestamps of the syllables (data.Manip{i}.P and data.Manip{i}.T) and calls (data.Calls{i}.P and data.Calls{i}.T; in some files, P and T are named Pfull and Tfull, respectively). The file also stores additional quantities used in different analyses. These include:
- The number of experimental days (data.Manip{i}.NumDays)
- For each syllable, the index of the timestamp at the beginning of each experimental day (data.Manip{i}.DayStart)
- The source and target pitch values of the tutor syllables in Hertz (data.Manip{i}.source and data.Manip{i}.target) and semitones (data.Manip{i}.sourceST and data.Manip{i}.targetST)
One of these bird subfolders (.\data\semitones4\targets1\R2812) also contains the complete pitch values, amplitudes, and timestamps of the bird's call that are necessary to generate the stack plot in Figure 5D. Each of these three quantities is stored in a single .mat file with a descriptive file name.
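As a minimal sketch of how the fields listed above might be read, the following builds a small synthetic stand-in for one data.mat (all values are made up), loads it, and computes a mean pitch per experimental day. It assumes that DayStart holds one index into P and T per day, and that the Pfull and Tfull naming may need a fallback; both are assumptions about the layout, not confirmed details of the files.

```matlab
% Synthetic stand-in for one bird's data.mat (hypothetical values)
data = struct();
data.BirdID = 'demo';
data.Manip = {struct('P', [600 610 620 650], 'T', [0.1 0.2 1.1 1.2], ...
                     'NumDays', 2, 'DayStart', [1 3])};
tmpFile = [tempname '.mat'];
save(tmpFile, 'data');

% Load the file and pick the first manipulated syllable
S = load(tmpFile);
m = S.data.Manip{1};

% Fall back to Pfull/Tfull if P/T are absent (assumed naming variant)
if isfield(m, 'P')
    P = m.P;      T = m.T;
else
    P = m.Pfull;  T = m.Tfull;
end

% Assumption: DayStart(d) is the index of the first rendition on day d
edges = [m.DayStart(:).' numel(P)+1];
dailyMean = zeros(1, m.NumDays);
for d = 1:m.NumDays
    dailyMean(d) = mean(P(edges(d):edges(d+1)-1));
end
disp(dailyMean)

delete(tmpFile);   % clean up the temporary file
```

With a real file, the synthetic part would be replaced by a call to load on the bird's data.mat.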
Model fitting outputs (maximum likelihood estimation of RL model parameters from 2-semitone 1-target data) are contained in the files:
- .\data\paramsML_alpha_beta_fixed.mat : RL model parameter fit and associated log-likelihood with parameters alpha and beta fixed (also stored for each syllable individually in data.Manip{i}.GNDpMultSigBcpSTab and data.Manip{i}.GNDAvgLLMultSigBcpSTab).
- .\data\paramsML_beta_fixed.mat : RL model parameter fit and associated log-likelihood with parameters alpha estimated and beta fixed (also stored for each syllable individually in data.Manip{i}.GNDpMultSigBcpSTbeta and data.Manip{i}.GNDAvgLLMultSigBcpSTbeta).
- .\data\paramsML_full.mat : RL model parameter fit and associated log-likelihood with parameters alpha and beta estimated (also stored for each syllable individually in data.Manip{i}.GNDpMultSigBcpST and data.Manip{i}.GNDAvgLLMultSigBcpST).
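The three fits above are nested, so their log-likelihoods can be compared with a penalty for the number of free parameters. The sketch below computes the Bayesian information criterion, BIC = k ln(n) - 2 ln(L), for three hypothetical models. The log-likelihoods, parameter counts k, and sample size n are invented for illustration and are not taken from the dataset. If a stored value is an average per-sample log-likelihood, as the AvgLL field names may suggest, it would first need to be multiplied by n.

```matlab
% Hypothetical total log-likelihoods for three nested models
logL = [-1250.4 -1231.7 -1229.9];
k    = [2 3 4];      % hypothetical numbers of free parameters
n    = 1000;         % hypothetical number of data points

% BIC: lower values indicate a better trade-off of fit and complexity
bic = k * log(n) - 2 * logL;
[~, best] = min(bic);

fprintf('BIC values: %s\n', mat2str(bic, 6));
fprintf('Lowest BIC: model %d\n', best);
```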
.\data also contains the subfolder .\sim which includes different model simulation files (see Code/Software below). These files are given descriptive names explaining which model is simulated and the songbird dataset the simulation aims to replicate. The relevant quantity in these files is simulated mean pitch trajectories, stored in the variable MUsim or mu.
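Because the simulated trajectories are stored under either MUsim or mu, a loader can check for both names. The sketch below uses a synthetic stand-in file; the array shape and values are hypothetical.

```matlab
% Synthetic stand-in for a simulation file (hypothetical shape and values)
MUsim = cumsum(0.01 * ones(5, 3));
tmpFile = [tempname '.mat'];
save(tmpFile, 'MUsim');

% Load and pick whichever variable name the file uses
S = load(tmpFile);
if isfield(S, 'MUsim')
    traj = S.MUsim;
elseif isfield(S, 'mu')
    traj = S.mu;
else
    error('No simulated trajectory variable (MUsim or mu) found.');
end
disp(size(traj))

delete(tmpFile);   % clean up the temporary file
```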
Code/Software
All code (directory .\code) is written using MATLAB (.m files). This includes:
- Calls_Analysis_2st1tr_Fig2H_S3F.m : Call analysis for 2-semitone, 1-target data
- Calls_Analysis_4st1tr_Fig5FG.m : Call analysis for 4-semitone, 1-target data
- Characterize_Daily_Pitch_Distributions.m
- Compare_Models_BIC_2st1tr_Fig2D_FigS3A_FigS3D.m : BIC comparison between different reward shapes - 2-semitone 1-target model fits
- Compare_Models_RMSE_2st1tr_Fig2F_FigS3B_FigS3C_FigS3E.m : RMSE comparison between different models - 2-semitone 1-target data
- Compare_Models_RMSE_2st2tr_Fig4Bbottom_Fig4D_FigS5D_FigS5E.m : RMSE comparison between different models - 2-semitone 2-target data
- Compare_Models_RMSE_4st1tr_Fig5CI_FigS6D.m : RMSE comparison between different models - 4-semitone 1-target data
- Compare_Reward_functions_Fig2G_FigS6A.m : Comparing reward functions
- DA_Neuron_Tuning_Fig6.m : Model predictions of dopaminergic neuron tuning in VTA
- Distribution_Song_Call_Intervals_2st2tr_FigS4C.m : Distributions of call-song intervals for 2-semitone 2-target birds
- Distribution_Song_Call_Intervals_4st1tr_FigS5E.m : Distributions of call-song intervals for 4-semitone 1-target birds
- Estimate_Parameters_ML_alpha_beta_fixed.m : Estimating RL model parameters for 2-semitone 1-target data - alpha and beta fixed
- Estimate_Parameters_ML_beta_fixed.m : Estimating RL model parameters for 2-semitone 1-target data - alpha estimated and beta fixed
- Estimate_Parameters_ML_full.m : Estimating RL model parameters for 2-semitone 1-target data - full model (alpha and beta estimated)
- Evidence_hierarchy_st2tr2_FigS5B.m : Evidence for priority of syllables over calls
- FindCP4RL_songbird.m : Identifying an initial guess of the time index at which the performance starts switching towards the new target
- LL_alpha_beta_fixed.m : MLE for the alpha-beta-fixed model
- LL_beta_fixed.m : MLE for the beta-fixed model
- LL_full.m : MLE for the full model
- LL_RLearning.m : MLE for the R-Learning model (maximizing reward directly)
- MARL_maxmax_EndOfBout.m : MARL sum-max - hierarchical (syllables before calls) - R computed at the end of a bout or motif (defined by tau)
- MARL_maxmax_PM.m : MARL sum-max - hierarchical (syllables before calls) - R based on current instance P and the mean mu of past instances
- MARL_maxmax_PP.m : MARL sum-max - hierarchical (syllables before calls) - R based on current instance P and the memory of past instances, also P
- MARL_MOT.m : MARL max-over-targets models
- MARL_summax_PM.m : MARL sum-max - non-hierarchical (syllables and calls equal) - R based on current instance P and the mean mu of past instances
- MARL_summax_PP.m : MARL sum-max - non-hierarchical (syllables and calls equal) - R based on current instance P and the memory of past instances, also P
- multiple_comparison_correction.m
- Pitch_Trajectory_and_Simulation_Density_2st1tr_Fig2BC_FigS2.m : Pitch trajectories for 2-semitone 1-target data, model fitting, and simulations
- Pitch_Trajectory_and_Simulation_Density_2st2tr_Fig4BCEF_FigS5AC.m : Pitch trajectories for 2-semitone 2-target data and simulations
- Pitch_Trajectory_and_Simulation_Density_4st1tr_Fig5DH_FigS6BC.m : Pitch trajectories for 4-semitone 1-target data and simulations
- Pitch_Variance_Analysis_FigS3G.m
- RL_songbird.m : RL model for learning a single syllable
- RL_songbird_Fiete.m : RL model for learning a single syllable based on Fiete et al. 2007
- RL_songbird_RLearning.m : RL model for learning a single syllable by maximizing reward directly
- Simulate_2st1tr_alpha_beta_fixed.m : Simulate RL model for 2-semitone 1-target data - alpha and beta fixed
- Simulate_2st1tr_beta_fixed.m : Simulate RL model for 2-semitone 1-target data - alpha estimated and beta fixed
- Simulate_2st1tr_full.m : Simulate RL model for 2-semitone 1-target data - full model (alpha and beta estimated)
- Simulate_2st1tr_RLearning.m : Simulate RL model that maximises reward directly for 2-semitone 1-target data - alpha estimated and beta fixed
- Simulate_2st2tr_maxmax_EndOfBout.m : Simulating the hierarchical summax model, R delivered at the end of motif or bout - 2-semitone 2-target data
- Simulate_2st2tr_maxmax_PM.m : Simulating the hierarchical summax model, Rij depends on motor means - 2-semitone 2-target data
- Simulate_2st2tr_maxmax_PP.m : Simulating the hierarchical summax model, Rij depends on instance memories - 2-semitone 2-target data
- Simulate_2st2tr_MOT.m : Simulating the max-over-targets MARL model, two of the 2-semitone 2-target birds
- Simulate_2st2tr_RL.m : Simulating the syllable of 2-semitone 2-target data
- Simulate_2st2tr_SOD.m : Simulating the syllable of 2-semitone 2-target data with sequence-order-dependent models
- Simulate_2st2tr_summax_PM.m : Simulating the non-hierarchical summax model, Rij depends on motor means - 2-semitone 2-target data
- Simulate_2st2tr_summax_PP.m : Simulating the non-hierarchical summax model, Rij depends on instance memories - 2-semitone 2-target data
- Simulate_4st1tr_maxmax_PM.m : Simulating the hierarchical summax model, Rij depends on motor means - 4-semitone 1-target data
- Simulate_4st1tr_maxmax_PP.m : Simulating the hierarchical summax model, Rij depends on instance memories - 4-semitone 1-target data
- Simulate_4st1tr_RL.m : Simulating the syllable of 4-semitone 1-target data
- Simulate_4st1tr_summax_PM.m : Simulating the non-hierarchical summax model, Rij depends on motor means - 4-semitone 1-target data
- Simulate_4st1tr_summax_PP.m : Simulating the non-hierarchical summax model, Rij depends on instance memories - 4-semitone 1-target data
- Summary_Data_2st1tr_Fig2B_right.m : Summary data showing learning outcomes in all 2-semitone 1-target birds
Usage Note:
All data are formatted so that the analyses can be rerun and the figures in Toutounji et al. 2024 reconstructed with limited user input. Data files are stored in .mat format, and the code was generated in MATLAB 2019 or later. Each MATLAB script has a descriptive name and includes a heading describing the analysis and/or figures it aims to reproduce. For instance, Compare_Models_RMSE_2st2tr_Fig4Bbottom_Fig4D_FigS5D_FigS5E.m reruns the analysis comparing simulation root-mean-square errors (RMSE) of different models replicating 2-semitone, 2-target (2st2tr) data, and reconstructs Figs. 4B (bottom), 4D, S5D, and S5E in Toutounji et al. 2024. To rerun the scripts, the directories .\data and .\code should be placed in the same parent folder. Reproducing some of the visualisations (scripts 9, 28, 29, and 30 in the list above, i.e. DA_Neuron_Tuning_Fig6.m and the three Pitch_Trajectory_and_Simulation_Density_*.m scripts) requires calling the function shadedErrorBar.m (Copyright (c) 2014, Rob Campbell. All rights reserved), available here. The function subfolder should be placed in the directory .\code.
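The setup and the RMSE measure mentioned above can be sketched as follows. The subfolder name shadedErrorBar and the use of the current folder as parent directory are assumptions, and the two trajectories are invented values rather than data from the study.

```matlab
% Assumed layout: <parent>\code\shadedErrorBar (hypothetical subfolder name)
codeDir = fullfile(pwd, 'code');
sebDir  = fullfile(codeDir, 'shadedErrorBar');
if isfolder(sebDir)
    addpath(sebDir);   % make shadedErrorBar.m callable from the scripts
end

% Root-mean-square error between an observed and a simulated
% mean pitch trajectory (hypothetical values, in semitones)
observed  = [0 0.5 1.0 1.5 2.0];
simulated = [0 0.4 1.1 1.4 2.2];
rmse = sqrt(mean((simulated - observed).^2));
fprintf('RMSE = %.4f semitones\n', rmse);
```

A lower RMSE indicates that a simulated trajectory follows the observed one more closely.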
Methods
Part of the experimental data presented here was previously published (Lipkind et al., 2017). Source and target song models were synthetically composed of natural syllables. Harmonic syllables in the source songs were pitch-shifted by 2 or 4 semitones in the target songs using GOLDWAVE v. 5.68. Song feature calculation and clustering of syllables and calls were performed using Sound Analysis Pro. All other analyses (modes, model fitting, statistical analysis, visualization) were performed in MATLAB (MathWorks Inc.).
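For orientation, a shift of s semitones multiplies a frequency by 2^(s/12), so the shift between two pitches is 12 log2(target/source). The sketch below applies this to a hypothetical source pitch of 600 Hz; the value is illustrative and not taken from the tutor songs.

```matlab
sourceHz = 600;                 % hypothetical source pitch in Hz
shiftST  = [-4 -2 2 4];         % shifts in semitones

% Frequency after shifting by s semitones: f * 2^(s/12)
targetHz = sourceHz * 2.^(shiftST / 12);

% Recover the shift in semitones from the two frequencies
recoveredST = 12 * log2(targetHz / sourceHz);

fprintf('%+d st -> %.1f Hz\n', [shiftST; targetHz]);
```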