Discounting future reward in an uncertain world: behavioural data
Data files
May 11, 2023 version files (353.94 KB)
- README.md
- STUDYDATA.mat
Abstract
Humans discount delayed relative to more immediate reward. A plausible explanation is that impatience arises partly from uncertainty, or risk, implicit in delayed reward. Existing theories of discounting-as-risk focus on a probability that delayed reward will not materialize. By contrast, we examine how uncertainty in the magnitude of delayed reward contributes to delay discounting. We propose a model wherein reward is discounted proportional to the rate of random change in its magnitude across time, termed volatility. We find evidence to support this model across three experiments (total N=158). Firstly, using a task where participants chose when to sell products, whose price dynamics they previously learned, we show discounting increases in line with price volatility. Secondly, we show that this effect pertains over naturalistic delays of up to four months. Using functional magnetic resonance imaging, we observe a volatility-dependent decrease in functional hippocampal-prefrontal coupling during intertemporal choice. Thirdly, we replicate these effects in a larger online sample, finding that volatility discounting within each task correlates with baseline discounting outside of the task. We conclude that delay discounting partly reflects time-dependent uncertainty about reward magnitude, i.e. volatility. Our model captures how discounting adapts to volatility, thereby partly accounting for individual differences in impatience. Our imaging findings suggest a putative mechanism whereby uncertainty reduces prospective simulation of future outcomes.
Methods
Experiment 1
In Experiment 1 participants were briefed to imagine that they owned a farming business, selling produce to the highest bidder in a marketplace. Participants learned how the prices of three different products (wheat, chicken and beans) evolved week-by-week, where a week corresponded to a trial of the experiment (Figure 2). The three products had different levels of volatility in price evolution. Participants subsequently made intertemporal choices about when to sell each product, either immediately for a guaranteed price or in the marketplace following a delay.
Methods
Participant Recruitment and Sample Size
This experiment was designed as a pilot, and therefore focused on testing for larger, within-participant effects. Participants were recruited from the UCL Institute of Cognitive Neuroscience subject database. Twenty participants (mean age 27.4 years, s.d. 6.9 years; 9 female) completed the experiment.
Baseline Discounting
Prior to the main task we elicited discount functions for riskless quantities of money. Participants were required to indicate the smallest immediate monetary reward, termed their indifference amount, that they would be willing to accept instead of a larger stated quantity of money (£8, £9, £11 or £12) to be received at a specified delay (1, 2, 4, 26 or 52 weeks). Each delay was presented twice for each larger reward amount, creating 40 choices in total. One choice was selected to be paid for real, at the stated delay, in post-dated Amazon vouchers. To achieve this in an incentive-compatible manner, for the selected choice, we randomly selected an immediate reward from a uniform distribution between £0 and the magnitude of the larger reward (e.g., £12); if this amount was below or equal to the participant’s stated indifference point, they received the delayed reward, if above the indifference point they received the randomly-drawn immediate reward. Participants were fully briefed on this procedure. Three participants who answered £0 in response to all baseline questions were excluded from this analysis.
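The random-draw payout rule described above can be sketched in a few lines. This is only an illustration of the stated rule, not the authors' code; the function name `bdm_payout` and the example indifference amount of £7.50 are hypothetical.

```python
import random

def bdm_payout(indifference, delayed_amount, rng):
    """Resolve one selected baseline choice using the random-draw rule."""
    # Draw an immediate offer uniformly between 0 and the larger delayed reward.
    offer = rng.uniform(0.0, delayed_amount)
    if offer <= indifference:
        # Offer is at or below the stated indifference amount:
        # the participant receives the delayed reward.
        return ("delayed", delayed_amount)
    # Offer is above the indifference amount:
    # the participant receives the randomly drawn immediate amount.
    return ("immediate", offer)

rng = random.Random(0)  # seeded only to make the illustration repeatable
kind, amount = bdm_payout(indifference=7.5, delayed_amount=12.0, rng=rng)
print(kind, round(amount, 2))
```

Because the payout depends on the stated amount only through this comparison, overstating or understating the true indifference amount can only lead to a less preferred outcome, which is what makes the rule incentive-compatible.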
Learning Price Dynamics
During the task, participants observed and predicted the price of each product, displayed on a linear scale ranging from £0 to £25, as it evolved over the course of 240 trials. Each trial of the experiment was described as a ‘week’. After passively observing prices over several ‘weeks’ (trials), participants were asked to predict upcoming prices one week ahead; the task therefore involved both observational and instrumental learning. Participants were instructed about two sources of variability in prices: Gaussian emission noise, applying equally to all products, which we described as ‘variability in bidding’, and changes in the underlying ‘market price’. For one of the three products (‘No Volatility’) the market price was held constant; the market price of the other two products (‘Low Volatility’ and ‘High Volatility’) underwent random changes across time, with the same Gaussian emission noise. We used two predefined sequences of outcomes for each product; participants were then allocated at random to one of the two sequences. We estimated learning rates for the three products separately by fitting a Rescorla-Wagner learning model (Rescorla & Rescorla, 1967) to participants’ price predictions from the first block of 70 prediction trials.
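As a rough illustration of how a learning rate might be recovered from price predictions, the sketch below fits a single Rescorla-Wagner learning rate by grid search. The least-squares criterion, the grid, the initial value `v0` and the toy price sequence are all assumptions made for this example; the source does not state the fitting procedure.

```python
def rw_predictions(prices, alpha, v0):
    """One-step-ahead Rescorla-Wagner predictions for a price sequence."""
    v, preds = v0, []
    for p in prices:
        preds.append(v)        # prediction made before observing this price
        v += alpha * (p - v)   # delta-rule update towards the observed price
    return preds

def fit_learning_rate(prices, reported, v0, grid_steps=101):
    """Grid-search alpha in [0, 1] that minimises squared prediction mismatch."""
    best_alpha, best_sse = 0.0, float("inf")
    for i in range(grid_steps):
        alpha = i / (grid_steps - 1)
        preds = rw_predictions(prices, alpha, v0)
        sse = sum((r - q) ** 2 for r, q in zip(reported, preds))
        if sse < best_sse:
            best_alpha, best_sse = alpha, sse
    return best_alpha

# Toy data: predictions generated by a simulated learner with alpha = 0.3.
prices = [10.0, 12.0, 9.0, 14.0, 13.0, 11.0, 15.0, 12.0]
reported = rw_predictions(prices, alpha=0.3, v0=10.0)
print(fit_learning_rate(prices, reported, v0=10.0))
```

In this framing a higher fitted learning rate for the more volatile products would indicate that participants weighted recent prices more heavily when prices changed more.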
Intertemporal Choice Procedure
At three points during each block, participants were asked to predict the market price further into the future, at delays of 1, 4, 7, 12 or 18 weeks. Participants subsequently chose when to sell the product, either immediately for a fixed price (x), or on the market after a stated delay (1, 4, 7, 12 or 18 weeks). Specifically, they were asked to indicate the smallest fixed price that would just tempt them away from selling on the market. Participants were informed that the future price would evolve according to the same process they had previously observed, and was also subject to the same Gaussian emission noise. By contrast, the immediate price was fixed, with no objective risk.
Participants were informed that, after the experiment, we would select one of their choices to be paid out for real. To realise this in an incentive-compatible manner, for the selected choice, we randomly selected an immediate fixed price from a uniform distribution between £0 and £25; if this amount was below the participant’s stated indifference point, they received the simulated future market price for the product as a bonus payment. If the selected price was above the participant’s indifference point they received the randomly-drawn fixed price. All bonus payments were made on the same day, at the end of the experiment.
Trial Structure of Learning Phase
For a ‘No Volatility’ product the market price was held constant. The market price of the other two products (‘Low Volatility’ and ‘High Volatility’) underwent random changes across time. Price trajectories for these two products were simulated by implementing a time-dependent probability that the market price would change to a new value, selected from a uniform distribution between specified bounds. For a ‘Low Volatility’ product, changes in the market price were small, while for a ‘High Volatility’ product, changes were more extreme.
Within each block, participants performed three phases of observation and prediction: the first consisted of 70 observation trials followed by 70 prediction trials, while the subsequent two phases each consisted of 45 observation trials and 5 prediction trials. After each phase the price evolution was paused whilst participants made a set of intertemporal choices. Learning rates were fitted based on the first 70 prediction trials; subsequent prediction phases were included to ensure that participants attended to prices before making intertemporal choices.
Experiment 2
Experiment 2 tested whether the effects observed in Experiment 1 replicated in a larger sample, and also probed neural correlates of volatility discounting. Here, to test whether effects of volatility extend to timescales used in conventional discounting tasks, we superimposed the timescale of the task onto longer delays. Specifically, one actual intertemporal choice was selected to be paid out at the stated delay, on the order of weeks. To further test the veridicality of the model, we measured risk aversion outside the main task, and elicited participants’ subjective estimates of future uncertainty within-task.
Methods
Learning Phase
Participants learned price dynamics according to a similar procedure as described for Experiment 1. Here only two products were used, to simplify the neuroimaging analysis. For one of the two products (‘Stable’) the market price was held constant at £25, and participants were explicitly informed about this; the market price of the other product (‘Volatile’) evolved according to a Gaussian random walk, with zero mean drift and volatility σ=3.5, upper bounded at £50 and lower bounded at £0. We used two predefined sequences of outcomes sampled from a random walk with these properties; participants were then allocated at random to one of the two sequences.
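A bounded Gaussian random walk of this kind might be simulated as follows. This is a minimal sketch: the source does not say how the bounds were enforced, so clipping is an assumption, and the function name, seed and run length of 60 trials (the mini-block length used in this experiment) are chosen only for illustration. Emission noise on the displayed bids is not modelled here.

```python
import random

def simulate_market_price(n_trials, sigma, start, lower, upper, rng):
    """Zero-drift Gaussian random walk for the underlying market price."""
    price, path = start, []
    for _ in range(n_trials):
        path.append(price)
        step = rng.gauss(0.0, sigma)  # zero mean drift, volatility sigma
        # Assumption: bounds are enforced by clipping; the source does not
        # specify whether prices were clipped, reflected or resampled.
        price = min(upper, max(lower, price + step))
    return path

rng = random.Random(42)  # seeded only to make the illustration repeatable
volatile = simulate_market_price(60, sigma=3.5, start=25.0,
                                 lower=0.0, upper=50.0, rng=rng)
stable = simulate_market_price(60, sigma=0.0, start=25.0,
                               lower=0.0, upper=50.0, rng=rng)
print(min(volatile), max(volatile))
print(set(stable))
```

Setting `sigma=0.0` reproduces the Stable product, whose market price stays at its starting value throughout.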
Participants first passively observed the price of each product, displayed on a linear scale ranging from £0 to £50, as it evolved over the course of 240 trials. Over a further 240 trials they were asked to predict upcoming prices. Prices for the two products were displayed in randomly ordered mini-blocks of 60 trials in length; at the start of each block the market price was reset to £25. Price predictions followed the same procedure as in Experiment 1. For the Stable product, participants were instructed that the future market price would remain constant at £25, whereas for the Volatile product the future market price would drift according to the same process they had previously observed. In both conditions, future prices were also subject to the same degree of emission noise.
Description of Emission Noise in the Learning Phase
During the learning phase, participants were explicitly instructed about two sources of variability in prices: an irreducible Gaussian noise (η=2) applying equally to both items, which we described as ‘variability in online bidding’, and drift in the underlying ‘market price’. To facilitate this explanation, in a practice phase participants first observed a series of trials in which the market price was displayed as a horizontal bar on the price scale, together with a dot indicating the highest bid. This was shown first for the Stable item, where the market price was constant, to familiarize participants with ‘variability in the bidding’, before they observed a drifting market price for the Volatile item, with the same level of variability in bidding. The horizontal bar denoting the market price was not shown during the experiment itself. Participants were told that their task during the experiment would be to predict the price on subsequent weeks and that the best way to do so is to estimate the underlying market price.
The underlying market price of the Volatile item was not revealed to participants but they were told that the market price would ‘drift up and down over time’, and that they needed to keep track of this. Participants were also instructed that due to emission noise in prices, a given price provided imprecise information about actual market price and to make good predictions they should accumulate information over a number of recent prices.
Intertemporal Choice Phase
After observing price evolution for both products, participants entered a separate intertemporal choice phase of the experiment. Here, participants made a series of binary choices about when to sell each product, either immediately for a guaranteed (i.e. riskless) price (less than £25), or in the marketplace after a stated delay (0, 1, 4 or 17 weeks) from a starting price of £25. We included a delay of zero weeks to allow the intercept of the discounting curve to be reliably estimated. We selected a set of guaranteed immediate prices that allowed a plausible range of discount factors to be estimated.
Participants were informed that one of their choices would be selected and realized after the experiment and, depending on their actual choice, participants would be paid either the guaranteed amount on the day of the experiment (if they opted for the immediate choice), or the simulated future price of the product at the stated number of weeks in the future (if they opted for the delayed choice). Here, since delays were real, rather than embedded within the timescale of the task, we expected a significant degree of discounting even in the Stable condition, due to the influence of effects outside of the task.
Participant Recruitment, Sample Size and Power Calculation
We conducted a behavioral pilot experiment with 11 participants, using the above design. In this pilot experiment we observed an effect size of d=0.76, based on the mean difference in log K between Stable and Volatile products, suggesting that a medium effect size was a plausible assumption. Sample size for the imaging experiment was therefore determined so as to achieve at least 80% power to detect a medium effect size (Cohen’s d=0.5), based on a paired t-test, indicating a required sample size of 34 participants. We aimed to recruit at least 34 participants for the main experiment, in addition to the pilot sample. 36 participants were recruited from the UCL Institute of Cognitive Neuroscience subject database, and underwent MRI scanning. Including the behavioral pilot, a total of 47 participants (mean age 28.0 years, s.d. 8.4 years, 32 female) completed the experiment.
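The required sample size can be checked approximately with a normal-approximation formula for a paired t-test. The sketch below assumes a two-sided test at α=0.05, which the text does not state, and uses a standard small-sample correction term rather than an exact noncentral-t calculation; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def paired_t_sample_size(d, power=0.80, alpha=0.05):
    """Approximate n for a two-sided paired t-test with effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value, two-sided (assumed)
    z_beta = z.inv_cdf(power)           # quantile for the target power
    # Normal approximation plus a correction for using t rather than z.
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)

print(paired_t_sample_size(0.5))  # 34
```

Under these assumptions the approximation agrees with the 34 participants reported for a medium effect size of d=0.5.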
Baseline Risk Aversion
Before participants were introduced to the market behavior, we measured their risk preferences for lotteries to be paid out on the same day. Each lottery had prices drawn from a Gaussian distribution with mean £25, and one of four standard deviations (ranging from £5 to £13). For each lottery participants observed 36 outcomes drawn from the relevant distribution, before making a series of choices between receiving a guaranteed amount (between £11 and £25) or accepting a one-off play of the lottery.
Subjective Future Uncertainty
Following the learning phase, we elicited participants’ predictions about each product’s future price, for a scenario in which the current market price was stated to be £25. Using a graphical interface, participants were also asked to indicate the lower and upper bounds of an interval in which they were 90% certain the future price would lie, described as ‘highest and lowest reasonable estimates’.
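For comparison with the reported intervals, an objective 90% interval can be worked out from the generative process. The sketch below is only a benchmark under stated assumptions: it treats one delay unit as one random-walk step, treats the volatility (3.5) and emission noise (2) given for this experiment as standard deviations, ignores the price bounds, and uses example delays chosen for illustration. It is not necessarily the comparison the authors made.

```python
import math
from statistics import NormalDist

def objective_interval(current, delay, sigma, eta, coverage=0.90):
    """Central interval for a future price under an unbounded random walk."""
    # Variance grows linearly with delay; emission noise adds a fixed amount.
    sd = math.sqrt(delay * sigma ** 2 + eta ** 2)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return current - z * sd, current + z * sd

for delay in (1, 4, 17):  # example delays, assumed for illustration
    low, high = objective_interval(25.0, delay, sigma=3.5, eta=2.0)
    print(delay, round(low, 2), round(high, 2))
```

The interval widens with the square root of the delay, so a participant who has learned the volatility well should report wider bounds at longer horizons for the Volatile product but not for the Stable one.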
Experiment 3
Experiment 3 tested for replicability of effects in Experiments 1 and 2 in a larger, online sample. A replication test was motivated by findings that estimated correlation coefficients are unstable in smaller sample sizes (Schönbrodt & Perugini, 2013). The design followed that of Experiment 1.
Methods
Sample Size and Power Calculation
Budget constraints limited our sample size to approximately 100 participants. Statistical simulations indicate that this sample size allows the Pearson correlation coefficient r to be estimated within an interval of ±0.15 with 80% confidence, for a true correlation of r=0.3 (Schönbrodt & Perugini, 2013). Participants were recruited from Prolific.co, an online subject database.
Baseline Delay Discounting
Prior to starting the task, participants (N=101, mean age 28.9 years, s.d. 9.7 years; 52 female) made a series of binary choices between a monetary reward of magnitude £23, £23.50, £25 or £26.50, delayed by 3, 7, 12 or 18 weeks respectively, and a smaller quantity of money available immediately. Choices were selected according to an adaptive procedure (see Supporting Material). Participants were informed that we would select one choice from every twenty participants to be paid for real as a bonus in Prolific.
Learning Price Dynamics
Market prices evolved according to a Gaussian random walk, upper bounded at £40 and lower bounded at £0. For one of the three products (‘No Volatility’) the market price was held constant at £20.00; the market price of the other two products (‘Low Volatility’ and ‘High Volatility’) evolved with volatility σ=1.5 and σ=3.5 respectively. All prices were subject to Gaussian emission noise, with standard deviation η=3. Price profiles were selected so that the market price on the final trial was equal to the long-run mean of £20.00.
Intertemporal Choice Procedure
After observing the price dynamics of each product, participants were asked to predict each product’s future price, and report a subjective confidence interval, as described for Experiment 2. Eight out of 101 participants did not adjust the subjective confidence interval from its starting value in at least one of the three conditions, suggesting inattention to the task; these participants were therefore excluded. Participants subsequently chose when to sell the product, either immediately for a guaranteed price, or for the market price after a stated delay (1, 4, 9 or 13 weeks). As for Experiment 1, one choice was selected to be realized, and was paid out as a bonus at the end of the experiment. An adaptive procedure was used to estimate indifference points at each delay by adjusting the immediate price.
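One common way to implement an adaptive procedure of this kind is bisection on the immediate price. The sketch below is an assumption for illustration only: the source does not specify the algorithm, and the function name, starting bracket, number of steps and the simulated participant are all hypothetical.

```python
def estimate_indifference(prefers_immediate, low, high, n_steps=6):
    """Bisection staircase: narrow a bracket around the indifference price."""
    for _ in range(n_steps):
        offer = (low + high) / 2
        if prefers_immediate(offer):
            # Immediate offer accepted: indifference point is at or below it.
            high = offer
        else:
            # Delayed market price preferred: indifference point is above it.
            low = offer
    return (low + high) / 2

# Hypothetical noiseless participant with a true indifference price of 14.0.
est = estimate_indifference(lambda offer: offer >= 14.0, low=0.0, high=20.0)
print(est)
```

Each binary choice halves the bracket, so a handful of choices per delay is enough to localise the indifference point to within a fraction of a pound.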
Usage notes
MATLAB (MathWorks, Natick, MA)