Computational mechanisms underlying latent value updating of unchosen actions
Data files
Aug 31, 2023 version files 3.70 MB
df_raw.csv
df.rdata
README.md
Abstract
Current studies suggest that individuals estimate the value of their choices based on observed feedback. Here, we ask whether individuals also update the value of their unchosen actions, even when the associated feedback remains unknown. One hundred and seventy-eight individuals completed a multi-armed bandit task, making choices to gain rewards. We found robust evidence suggesting latent value updating of unchosen actions based on the chosen action's outcome. Computational modeling results suggested that this effect is mainly explained by a value updating mechanism whereby individuals integrate the outcome history for choosing an option with that of rejecting the alternative. Properties of the deliberation (i.e., duration/difficulty) did not moderate the latent value updating of unchosen actions, suggesting that memory traces generated during deliberation might take a smaller role in this specific phenomenon than previously thought. We discuss the mechanisms facilitating credit assignment to unchosen actions and their implications for human decision-making.
README
This README file was generated on 2023-08-31 by Ido Ben-Artzi.
GENERAL INFORMATION
 Title of Dataset: Computational mechanisms underlying latent value updating of unchosen actions
 Author Information:
 A. Principal Investigator Contact Information
    Name: Nitzan Shahar
    Institution: Tel Aviv University
    Email: nitzansh@tauex.tau.ac.il
 B. Corresponding Author Contact Information
    Name: Ido Ben-Artzi
    Institution: Tel Aviv University
    Email: idobenartzi@mail.tau.ac.il
 Date of data collection: 2020
 Location of data collection: Online experiment, conducted using the Prolific platform
 Information about funding sources that supported the collection of the data: Israel Science Foundation, Grant #2536/20 given to Nitzan Shahar.
DATA & FILE OVERVIEW
 File List:
 A. df_raw.csv
 B. preprocessing.R
 C. df.rdata
 D. regression.zip
 E. null_model.R
 F. null_model.stan
 G. null_model_loo.stan
 H. double_updating_model.R
 I. double_updating_model.stan
 J. double_updating_model_loo.stan
 K. double_updating_one_prediction_error_model.R
 L. double_updating_one_prediction_error_model.stan
 M. double_updating_one_prediction_error_model_loo.stan
 N. select_reject_model.R
 O. select_reject_model.stan
 P. select_reject_model_loo.stan
 Relationship between files:
 File A is the raw dataset. File B is the R script used to preprocess the raw dataset. File C is the output of preprocessing the raw dataset. File D includes all hierarchical Bayesian regressions conducted using the "brms" package in R. Files E to P comprise three files for each of the four models we compared: an R file used to simulate data, a Stan file used for parameter estimation, and a loo (leave-one-out) Stan file used for model comparison.
DATA-SPECIFIC INFORMATION
Task description
Human participants completed an online multi-armed bandit reinforcement learning task in which they were asked to choose cards to gain monetary rewards. The task included four cards, and in each trial, the computer randomly selected and offered two of them for participants to choose from. Each card led to a reward according to an expected value that drifted across trials (generated using a random walk with noise of N(0, .03)). The task included two conditions (win vs. loss) manipulated across four interleaved blocks (whether the first block was win or loss was counterbalanced between participants). In a 'win' block, the only possible outcomes were winning 1 or 0 play dollars, and in a 'loss' block, the only possible outcomes were losing 0 or 1 play dollars. Each block used a different set of cards.
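The task structure described above can be sketched in a few lines of Python. This is an illustrative simulation only, not the authors' task code: the function names are hypothetical, and the README does not state the starting probabilities, whether .03 is a standard deviation or a variance, how the walk is kept within [0, 1], or how loss outcomes are numerically coded. The sketch assumes uniform starting values, a standard deviation of 0.03, clipping at the bounds, and a loss coded as -1.

```python
import random

def simulate_drifting_probabilities(n_trials=50, n_cards=4, step_sd=0.03, seed=0):
    """Return one list of card probabilities per trial (random walk)."""
    rng = random.Random(seed)
    # Assumption: each card starts at a uniformly random probability.
    probs = [rng.random() for _ in range(n_cards)]
    history = []
    for _ in range(n_trials):
        history.append(list(probs))
        # Gaussian step; clipping to [0, 1] is an assumption of this sketch.
        probs = [min(1.0, max(0.0, p + rng.gauss(0.0, step_sd))) for p in probs]
    return history

def offer_two_cards(rng, n_cards=4):
    """Randomly offer two distinct cards, returned as (left, right)."""
    left, right = rng.sample(range(n_cards), 2)
    return left, right

def draw_outcome(rng, p_positive, condition):
    """Draw an outcome for the chosen card.

    condition "pos" (win block): 1 or 0.
    condition "neg" (loss block): 0 or -1 (the -1 coding is an assumption).
    """
    positive = rng.random() < p_positive
    if condition == "pos":
        return 1 if positive else 0
    return 0 if positive else -1
```

Calling `simulate_drifting_probabilities()` once per block, then `offer_two_cards` and `draw_outcome` once per trial, reproduces the general shape of a 50-trial block.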
Participants were told to do their best to earn as much money as possible. Participants completed four blocks of 50 trials each, and at the end of the experiment were paid a fixed amount (£2.5) plus a performance-based bonus (of £1 or £1.5). Further information and the trial sequence are described in Figure 1 and the SI.
Data Treatment
The first trial of each block and trials with implausibly quick RTs (<200ms) or exceptionally slow RTs (>4000ms) were omitted (1.79% of all trials). Participants with more than 10% excluded trials (21 participants) or a no-response rate higher than 5% (4 participants) were excluded altogether, for a total of 25 participants (12.3% of subjects; age mean = 22.8, range 18 to 36; 22 males, 3 females). To conduct our main behavioral analysis, we selected a subset of trials in which the previously unoffered card was reoffered, and the previously offered card was not. This resulted in an average of 63.6 trials per participant (SD=6.7), with the number of trials ranging from 46 to 81 across subjects.
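The exclusion rules above can be expressed as a short filter. The sketch below is a hedged illustration in Python, not a port of preprocessing.R, which remains the authoritative implementation. The function names are hypothetical. It assumes each trial is a dict with the raw columns `trl` and `rt`, that a no-response trial has `rt` set to `None`, and that first-of-block trials count toward the 10% threshold; the README does not confirm any of these details.

```python
def is_excluded_trial(trial, min_rt=200, max_rt=4000):
    """True for the first trial of a block or an out-of-range RT."""
    if trial["trl"] == 0:  # trials are numbered 0 to 49 within a block
        return True
    rt = trial["rt"]
    if rt is None:  # no response: tracked separately (assumption)
        return False
    return rt < min_rt or rt > max_rt

def keep_participant(trials, max_excluded=0.10, max_no_response=0.05):
    """Apply the participant-level exclusion thresholds."""
    n = len(trials)
    excluded = sum(is_excluded_trial(t) for t in trials)
    no_response = sum(t["rt"] is None for t in trials)
    return excluded / n <= max_excluded and no_response / n <= max_no_response

def clean_trials(trials):
    """Keep only trials with a response that pass the trial-level rules."""
    return [t for t in trials
            if t["rt"] is not None and not is_excluded_trial(t)]
```

With four blocks of 50 trials, a participant who always responds within range loses only the four first-of-block trials (2%), so they are retained.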
Description of the data and file structure: df_raw.csv
 Number of variables: 14
 Number of rows: 40601
 Variable List:
 * blk - Block number (ranging from 1 to 4)
 * trl - Trial number (ranging from 0 to 49)
 * rw - The reward outcome of the current trial (-1/0 for the loss condition or 0/1 for the win condition)
 * prob1 - The current underlying probability of the chosen card to give the positive outcome (i.e., 0 in loss blocks or 1 in win blocks)
 * prob2 - The current underlying probability of the unchosen card to give the positive outcome (i.e., 0 in loss blocks or 1 in win blocks)
 * rt - Response time in milliseconds
 * key - The keyboard response key pressed by the participant. The number 75 refers to pressing the letter 'k', thus choosing the right card, and 83 refers to pressing the letter 's', thus choosing the left card.
 * frcA - The identity of the fractal card offered on the left side (cards had fractal images drawn on them) out of the possible 4 cards in the block (range 0-3)
 * frcB - The identity of the fractal card offered on the right side (cards had fractal images drawn on them) out of the possible 4 cards in the block (range 0-3)
 * ch - The identity of the chosen card (range 0-3)
 * cond - Whether the current block is a "win" or "loss" block (pos = win, neg = loss)
 * prolific_id - Anonymous identifier of the participant
 * subj - General identifier of the participant
 * bonus - The amount of extra bonus given to the participant
 Missing data codes: When participants did not respond adequately, "ch" is equal to -1 and the row is filled with NAs.
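The relationship between `key`, `frcA`, `frcB`, and `ch` described in the variable list suggests a simple consistency check on the raw file. The Python sketch below is illustrative only; the helper name is hypothetical and is not part of the authors' scripts. It maps the key code to a screen side and returns the card offered on that side, which should match `ch` on trials with a valid response.

```python
# Key codes taken from the variable list: 75 = 'k' (right), 83 = 's' (left).
KEY_TO_SIDE = {75: "right", 83: "left"}

def chosen_card_from_key(key, frcA, frcB):
    """Return the card implied by the key press, or None if no valid key.

    frcA is the card offered on the left, frcB the card on the right.
    """
    side = KEY_TO_SIDE.get(key)
    if side == "left":
        return frcA
    if side == "right":
        return frcB
    return None  # missing or unrecognized key (e.g., a no-response row)
```

For example, pressing 'k' (75) when cards 2 and 3 are offered on the left and right selects card 3.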
Description of the data and file structure: df.rdata
This file can be loaded in R as an .rdata file. It includes 22 further variables, for a total of 36 variables. It excludes the trials and participants described in the Data Treatment section (see the preprocessing script for the specific implementation).
 Added variables:
 * acc - Accuracy estimate calculated based on whether the participant chose the card with the higher probability of giving a reward (1) or not (0)
 * trial.total - A running counter for the trials of each participant
 * delta_exp_value - The difference in expected values (i.e., probabilities to give reward) of the two cards, specifically chosen minus unchosen
 * offer1 - The same as frcA
 * offer2 - The same as frcB
 * choice - The same as ch
 * unchosen - The identity of the card which was not chosen (ranges from 0 to 3)
 * reward - The same as rw
 * subject - The same as subj
 * delta_exp_value_oneback - The delta_exp_value of the previous trial (note that the previous trial may not appear in this data frame, as it may have been filtered out)
 * reoffer_ch - Whether the chosen card from the previous trial is reoffered on the current trial
 * reoffer_unch - Whether the unchosen card from the previous trial is reoffered on the current trial
 * stay_frc_ch - Whether the card (fractal) that was chosen on the previous trial was chosen again on the current trial
 * stay_frc_unch - Whether the card that was unchosen on the previous trial was chosen on the current trial
 * reward_oneback - The outcome of the previous trial (note that the previous trial may not appear in this dataset)
 * acc_oneback - Same as acc, but for the previous trial
 * prob1_oneback - Same as prob1 (chosen card EV), but for the previous trial
 * prob2_oneback - Same as prob2 (unchosen card EV), but for the previous trial
 * rt_oneback - Same as rt, but for the previous trial
 * condition - Same as cond
 * delta_exp_value_oneback_abs - Same as "delta_exp_value_oneback", but as an absolute value
 * trial_scaled - Scaling of the trial number
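Several of the added variables follow directly from the raw columns and the previous trial. The Python sketch below shows one way they could be derived for a single participant and block; it is an illustration under stated assumptions, not the logic of preprocessing.R. The function name is hypothetical, trials are assumed to be in presentation order with valid responses, one-back values are computed before any filtering, and ties in probability are scored as inaccurate.

```python
def add_oneback_variables(trials):
    """Derive unchosen, acc, delta_exp_value and one-back flags.

    trials: list of dicts for one participant and one block, in trial
    order, with keys offer1, offer2, choice, prob1, prob2.
    """
    out = []
    prev = None
    for t in trials:
        row = dict(t)
        # The unchosen card is whichever offered card was not selected.
        row["unchosen"] = t["offer2"] if t["choice"] == t["offer1"] else t["offer1"]
        # prob1 is the chosen card's probability, prob2 the unchosen card's.
        row["delta_exp_value"] = t["prob1"] - t["prob2"]
        row["acc"] = 1 if t["prob1"] > t["prob2"] else 0  # ties -> 0 (assumption)
        if prev is not None:
            offered = (t["offer1"], t["offer2"])
            row["reoffer_ch"] = prev["choice"] in offered
            row["reoffer_unch"] = prev["unchosen"] in offered
            row["stay_frc_ch"] = t["choice"] == prev["choice"]
            row["stay_frc_unch"] = t["choice"] == prev["unchosen"]
            row["delta_exp_value_oneback"] = prev["delta_exp_value"]
        out.append(row)
        prev = row
    return out
```

The first trial in the sequence has no previous trial, so its one-back keys are simply absent in this sketch.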
Sharing/Access information
A replication of the effect described in the current dataset can be found here:
https://osf.io/xyrhe/
Code/Software
The preprocessing.R file was used to generate the df.rdata file from the df_raw.csv file.
The regression.zip file contains R files in which Bayesian regression analyses were conducted using the "brms" R package.
For each model, an .R file is provided for data simulation. Two further Stan files per model were used for parameter estimation and for model comparison (leave-one-out), respectively.