Macaques preferentially attend to intermediately surprising information

Cite this dataset

Wu, Shengyi et al. (2022). Macaques preferentially attend to intermediately surprising information [Dataset]. Dryad. https://doi.org/10.6078/D15Q7Q

Abstract

Normative learning theories dictate that we should preferentially attend to informative sources, but only up to the point that our limited learning systems can process their content. Humans, including infants, show this predicted strategic deployment of attention. Here we demonstrate that rhesus monkeys, much like humans, attend to events of moderate surprisingness over both more and less surprising events. They do this in the absence of any specific goal or contingent reward, indicating that the behavioral pattern is spontaneous. We suggest this U-shaped attentional preference represents an evolutionarily preserved strategy for guiding intelligent organisms toward material that is maximally useful for learning.

Methods

How the data were collected:

In this project, we collected gaze data from 5 macaques as they watched sequential visual displays designed to elicit probabilistic expectations. Gaze data were collected using the Eyelink Toolbox and sampled at 1000 Hz by an infrared eye-monitoring camera system.

Dataset:

  • "csv-combined.csv" is an aggregated dataset that combines all original datasets, with one pop-up event per row across all trials. Each column is described below:
    • subj: subject_ID = {"B":104, "C":102,"H":101,"J":103,"K":203}
    • trialtime: start time of the current trial, in seconds
    • trial: current trial number, in order (each trial featured one of 80 possible visual-event sequences)
    • seq: current sequence number (one of 80 sequences)
    • seq_item: current item number in a seq (in order)
    • active_item: pop-up item (active box)
    • pre_active: prior pop-up item (active box) {-1: "the first active object in the sequence/ no active object before the currently active object in the sequence"}
    • next_active: next pop-up item (active box) {-1: "the last active object in the sequence/ no active object after the currently active object in the sequence"}
    • firstappear: {0: "not first", 1: "first appear in the seq"}
    • looks_blank: csv: total time spent looking at blank space during the current event (ms); csv_timestamp: {1: "looking at blank space at this timestamp", 0: "not looking at blank space at this timestamp"}
    • looks_offscreen: csv: total time spent looking offscreen during the current event (ms); csv_timestamp: {1: "looking offscreen at this timestamp", 0: "not looking offscreen at this timestamp"}
    • time till target: time elapsed before the subject first looked at the target object (ms) {-1: "never looked at the target"}
    • looks target: csv: time spent looking at the target object (ms); csv_timestamp: whether the subject is looking at the target at the current timestamp (1 or 0)
    • look1,2,3: time spent looking at each object (ms)
    • location 123X, 123Y: location of each box (the locations of the three boxes for a given sequence were chosen randomly but remained fixed throughout the sequence)
    • item123id: pop-up item ID (remained static throughout a sequence)
    • event time: total duration of the whole event (pop-up and go back) (ms)
    • eyeposX,Y: eye position at current timestamp
  • "csv-surprisal-prob.csv" is an output file from Monkilock_Data_Processing.ipynb. Surprisal values for each event were calculated and added to the data in "csv-combined.csv". Each additional column is described below:
    • rt: time till target {-1: "never looked at the target"}. In data analysis, we included only events with rt > 0.
    • already_there: {NA: "never looked at the target object"}. In data analysis, we included only events that are not the first event in a sequence, are not repeats of the previous event, and for which already_there is not NA.
    • looks_away: {TRUE: "the subject was looking away from the currently active object at this time point", FALSE: "the subject was not looking away from the currently active object at this time point"}
    • prob: the probability of occurrence of the object
    • surprisal: unigram surprisal value
    • bisurprisal: transitional surprisal value
    • std_surprisal: standardized unigram surprisal value
    • std_bisurprisal: standardized transitional surprisal value
    • binned_surprisal_means: the means of unigram surprisal values binned into three groups of evenly spaced intervals according to surprisal value.
    • binned_bisurprisal_means: the means of transitional surprisal values binned into three groups of evenly spaced intervals according to surprisal value.
  • "csv-surprisal-prob_updated.csv" is a ready-for-analysis dataset generated by Analysis_Code_final.Rmd after standardizing controlled variables, converting categorical variables to the appropriate data types, etc.
  • "AllSeq.csv" includes event information for all 80 sequences.
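
The surprisal-related columns described above can be illustrated with a minimal sketch. It is not taken from Monkilock_Data_Processing.ipynb: the logarithm base (2), the use of the population standard deviation for standardization, and the exact binning rule are assumptions, and the probabilities are hypothetical values rather than entries from the dataset.

```python
import math
from statistics import mean, pstdev


def surprisal(prob):
    """Surprisal of an event with probability `prob` (base 2 is an assumption)."""
    return -math.log2(prob)


def standardize(values):
    """Z-score the values (population standard deviation is an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def binned_means(values, n_bins=3):
    """Replace each value with the mean of its equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so every value is its own bin mean.
        return list(values)

    def bin_index(v):
        # The maximum value is placed in the last bin instead of a new one.
        return min(int((v - lo) / width), n_bins - 1)

    groups = {}
    for v in values:
        groups.setdefault(bin_index(v), []).append(v)
    return [mean(groups[bin_index(v)]) for v in values]


# Hypothetical probabilities, not values from the dataset.
probs = [0.5, 0.25, 0.125, 0.5, 0.0625]
s = [surprisal(p) for p in probs]   # analogous to "surprisal"
z = standardize(s)                  # analogous to "std_surprisal"
b = binned_means(s)                 # analogous to "binned_surprisal_means"
print(s, z, b)
```

The same three steps would apply to transitional probabilities to obtain values analogous to "bisurprisal", "std_bisurprisal", and "binned_bisurprisal_means".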

Empty Values in Datasets:

  • There are no missing values in the original dataset "csv-combined.csv". Missing values (marked as NA in datasets) occur in columns "prev_active", "next_active", "already_there", "bisurprisal", "std_bisurprisal", "sq_std_bisurprisal" in "csv-surprisal-prob.csv" and "csv-surprisal-prob_updated.csv".
  • NAs in columns "prev_active" and "next_active" indicate that the currently active object is the first or the last active object in the sequence, i.e., no active object precedes or follows it. When we analyzed the variable "already_there", we excluded rows whose "prev_active" value is NA.
  • NAs in column "already_there" mean that the subject never looked at the target object during the current event. When we analyzed the variable "already_there", we excluded rows whose "already_there" value is NA.
  • Missing values occur in columns "bisurprisal", "std_bisurprisal", "sq_std_bisurprisal" for the first event in a sequence, where the transitional probability cannot be computed because no event precedes it in the sequence. When we fitted models for transitional statistics, we excluded rows whose "bisurprisal", "std_bisurprisal", and "sq_std_bisurprisal" values are NA.
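
The exclusion rules described above can be sketched with the Python standard library. The sample rows below are invented for illustration, the TRUE/FALSE coding of "already_there" is an assumption, and the function name `keep_for_transitional_models` is hypothetical. The actual filtering was done in the authors' notebook and R script and may differ in detail.

```python
import csv
import io

# Hypothetical excerpt; column names follow the descriptions above,
# values are invented and do not come from the dataset.
SAMPLE = """subj,seq_item,active_item,prev_active,rt,already_there,bisurprisal
101,1,2,NA,350,FALSE,NA
101,2,3,2,-1,NA,1.58
101,3,3,3,120,FALSE,0.42
101,4,1,3,410,TRUE,2.32
"""


def is_na(value):
    """Treat empty strings and the literal NA as missing."""
    return value in ("", "NA")


def keep_for_transitional_models(row):
    """Apply the inclusion rules stated in the dataset description."""
    if is_na(row["prev_active"]):                 # first event in the sequence
        return False
    if row["active_item"] == row["prev_active"]:  # repeat of the previous event
        return False
    if is_na(row["already_there"]):               # never looked at the target
        return False
    if float(row["rt"]) <= 0:                     # rt must be > 0
        return False
    if is_na(row["bisurprisal"]):                 # no transitional surprisal
        return False
    return True


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = [r for r in rows if keep_for_transitional_models(r)]
print(f"{len(kept)} of {len(rows)} rows kept")
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with an open handle to "csv-surprisal-prob.csv" and check the column names against the file header first.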

Code:

  • In "Monkilock_Data_Processing.ipynb", we processed raw fixation data of 5 macaques and explored the relationship between their fixation patterns and the "surprisal" of events in each sequence. In this notebook, we computed the following variables, which are needed for further analysis, modeling, and visualization (see above for details): active_item, pre_active, next_active, firstappear, looks_blank, looks_offscreen, time till target, looks target, look1,2,3, prob, surprisal, bisurprisal, std_surprisal, std_bisurprisal, binned_surprisal_means, binned_bisurprisal_means.
  • "Analysis_Code_final.Rmd" is the main script, in which we further processed the data, built models, and created visualizations. We evaluated the statistical significance of variables using mixed-effects linear and logistic regressions with random intercepts. The raw regression models include standardized linear and quadratic surprisal terms as predictors. The controlled regression models include covariates such as whether an object is a repeat, the distance between the current and previous pop-up objects, and trial number. A generalized additive model (GAM) was used to visualize the relationship between the surprisal estimate from the computational model and the behavioral data.
  • "helper-lib.R" includes helper functions used in Analysis_Code_final.Rmd.
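
To illustrate what "standardized linear and quadratic surprisal terms as predictors" means, the following sketch fits y = b0 + b1*z + b2*z^2 to a standardized predictor z. It deliberately swaps in plain ordinary least squares for the mixed-effects models used in Analysis_Code_final.Rmd: it has no random intercepts and no covariates, the surprisal values are hypothetical, and the outcome is synthetic. It is not the authors' analysis.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score the values (population standard deviation is an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(x, y):
    """Ordinary least squares for y = b0 + b1*x + b2*x^2 (normal equations)."""
    design = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical surprisal values, not taken from the dataset.
surprisal = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
z = standardize(surprisal)

# Synthetic outcome that is exactly quadratic in z, with its minimum at z = 0.
y = [0.2 + 0.3 * zi * zi for zi in z]

b0, b1, b2 = fit_quadratic(z, y)
print(b0, b1, b2)
```

In a fit like this, a positive quadratic coefficient b2 corresponds to a U-shaped relationship between the standardized predictor and the outcome, while a negative b2 corresponds to an inverted U.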

Usage notes

Please see the README_file.txt for descriptions of the datasets and code.

Funding

Jacobs Foundation

John Templeton Foundation, Award: 61475

National Science Foundation, Award: 2000759