Measuring semantic memory using associative and dissociative retrieval tasks
Data files
Jan 28, 2024 version files 751.08 KB
Abstract
Recent theoretical advances have highlighted the need for novel means of assessing semantic cognition. Here, we introduce the Associative-Dissociative Retrieval Task (ADT), a novel way to test inhibitory control over semantic memory retrieval by contrasting the efficacy of associative (automatic) and dissociative (controlled) retrieval on a standard set of verbal stimuli. All ADT measures achieved excellent reliability, homogeneity, and short-term temporal stability. Moreover, in-depth stimulus-level analyses showed that associating is easier for words evoking few but strong associates, yet this propensity hampers inhibition. Finally, we provide critical support for the construct validity of the ADT measures, demonstrating reliable correlations with domain-specific measures of semantic memory functioning (semantic fluency and associative combination) but negligible correlations with domain-general capacities (processing speed and working memory). Together, we show that the ADT provides simple yet potent and psychometrically sound measures of semantic memory retrieval and offers noteworthy advantages over the currently available assessment methods.
README: Measuring semantic memory using associative and dissociative retrieval tasks
https://doi.org/10.5061/dryad.vdncjsz1f
We provide two sets of data files: 1) raw data files containing the unprocessed data of individual participants on the given cognitive tasks; 2) processed data files prepared directly for the statistical analyses conducted in the study. We also provide all code used to process and analyze the data.
The data come from behavioural testing conducted on a computer during individual testing sessions in the laboratory.
Description of the data and file structure
All raw data files (except the .zip file "ADT_StimWords") are structured as long-formatted data frames where columns represent individual variables and rows individual responses of each participant. Processed data files are mostly structured in wide data format.
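The difference between the two layouts can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline (which is implemented in the supplied R scripts). The example rows are made up, the helper name `long_to_wide` is hypothetical, and the latency column is spelled "Latecny" as in the column descriptions in this README.

```python
from collections import defaultdict
from statistics import mean

# Made-up long-format rows: one row per response of each participant.
long_rows = [
    {"participant": 1, "Condition": "Associate", "Latecny": 1.2},
    {"participant": 1, "Condition": "Associate", "Latecny": 1.4},
    {"participant": 1, "Condition": "Dissociate", "Latecny": 2.0},
    {"participant": 2, "Condition": "Associate", "Latecny": 1.0},
    {"participant": 2, "Condition": "Dissociate", "Latecny": 2.4},
]

def long_to_wide(rows, id_col, key_col, value_col):
    """Average value_col within each (id, key) cell; return one dict per id."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[id_col], row[key_col])].append(row[value_col])
    wide = defaultdict(dict)
    for (pid, key), values in cells.items():
        wide[pid][key] = mean(values)
    return dict(wide)

# Wide format: one entry per participant, one field per condition.
wide = long_to_wide(long_rows, "participant", "Condition", "Latecny")
```

In the wide result, each participant has a single record whose fields hold the per-condition averages, which is the general shape of most processed files described below.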
Raw data files:
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains unprocessed retrieval latency and error data from the Associative-dissociative retrieval task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Condition" column designates the rule (Associate or Dissociate) directing the participant's response.
- The "Trial" column codes the stimulus word in the Slovak language (note that you can match the Slovak word with the English word by referring to the Appendix in the article).
- The "Response" column codes the response to the stimulus word provided by each participant using the keyboard.
- The "Latecny" column codes the response time to a stimulus (in seconds), i.e., time measured from the onset of the stimulus word to the first keypress of the provided response.
- The "Error" column marks the responses judged as incorrect by two independent raters. This column contains three error codes: "1" marks a response that did not follow the response rule or the general rule (i.e., an unrelated response on an associate trial, a related response on a dissociate trial, or a proper noun). "2" marks a response that was unidentifiable due to typing errors. "3" marks a strategy error, i.e., the use of a deliberate strategy in responding (e.g., using the same initial letter or semantic category to cue three or more successive responses).
- The "Form" column designates the parallel form to which the trial belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers
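As a rough illustration of how the error codes described above might be used to separate valid from erroneous trials, consider the following Python sketch. It is not taken from the authors' R scripts. The example rows are invented, and the sketch assumes that any value other than 1, 2, or 3 in the "Error" column (here 0) denotes a correct response, which this README does not state explicitly.

```python
# Error codes as described for the "Error" column.
ERROR_LABELS = {
    1: "rule violation",
    2: "unidentifiable response (typing error)",
    3: "strategy error",
}

def split_trials(rows):
    """Return (valid, errors); a row is an error if its code is 1, 2, or 3."""
    valid = [r for r in rows if r["Error"] not in ERROR_LABELS]
    errors = [r for r in rows if r["Error"] in ERROR_LABELS]
    return valid, errors

# Invented rows; Error == 0 for a correct response is an assumption.
rows = [
    {"participant": 1, "Condition": "Associate", "Error": 0},
    {"participant": 1, "Condition": "Dissociate", "Error": 1},
    {"participant": 1, "Condition": "Dissociate", "Error": 3},
    {"participant": 2, "Condition": "Associate", "Error": 0},
]

valid, errors = split_trials(rows)
error_types = [ERROR_LABELS[r["Error"]] for r in errors]
```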
ACT.raw.txt
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains unprocessed retrieval latency and error data from the Associative combination task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Stim_A" and "Stim_B" columns code the presented stimulus word pair in the Slovak language
- The "Response" column codes the response to the stimulus word pair provided by each participant using the keyboard.
- The "Latecny" column codes the response time to a stimulus (in seconds), i.e., time measured from the onset of the stimulus word to the first keypress of the provided response.
- The "Error" column indicates a response judged as erroneous (i.e., not related to both stimulus words) by two independent raters.
- The "Form" column designates the parallel form to which the trial belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers.
VFT.raw.txt
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains unprocessed retrieval latency and error data from the Semantic verbal fluency task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Probe_1" column codes the semantic category for which the participant was instructed to name as many exemplars as possible in 90 seconds. The categories were "Animals" ("Zvieratá" in the Slovak language), "Occupations" ("Zamestnanie"), and "Tools" ("Náradie"). The category "Clothing" ("Oblečenie") was used as a practice trial.
- The "Response" column codes the individual entries provided by each participant to each category using the keyboard.
- The "Latecny" column codes the inter-response time, i.e., the elapsed time (in seconds) from the confirmation of the previous response to the first keypress of the next response (note: latency of the first response was measured from the onset of the category cue word).
- The "Error_2" column indicates whether a response was judged as erroneous (1) or not (0); erroneous responses are repetitions or out-of-category words.
- The "Form" column designates the parallel form to which the trial belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers.
PS_A.csv
- Contains unprocessed reaction time and error data from the Digit symbol substitution task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Block" column designates whether a trial was part of Practice or Test block.
- The "key_resp.rt" column codes the reaction time (in seconds) measured from the onset of the stimulus (i.e., symbol) to the press of the response button (i.e., number on a keyboard).
- The "key_resp.corr" column codes whether the participant responded correctly (1) or incorrectly (0) on a trial.
- The "Form" column designates the parallel form to which the task belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers.
PS_B.csv
- Contains unprocessed reaction time and error data from the Letter matching task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Block" column designates whether a trial was part of Practice or Test block.
- The "Targets" column codes the target letters displayed during the trial. The participant had to decide whether at least one of them appeared in the letter string coded in the "String" column.
- The "Includes" column codes whether at least one of the letters in the "Targets" column was also displayed within the letter string in the "String" column (1) or not (0).
- The "Response_txt.rt" column codes the reaction time (in seconds) measured from the onset of the stimulus (i.e., letter string and target letters) to the press of the response button (at least one or none of the target letters matches the letter in the string).
- The "Response_txt.corr" column codes whether the participant responded correctly (1) or incorrectly (0) on a trial.
- The "Form" column designates the parallel form to which the task belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers.
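The relation between the "Targets", "String", and "Includes" columns can be expressed as a small check. The Python sketch below is an illustration only; it assumes that both the targets and the string are stored as plain sequences of letters, and the helper name `includes` and the example values are hypothetical.

```python
def includes(targets, string):
    """Return 1 if at least one target letter occurs in the string, else 0."""
    return int(any(letter in string for letter in targets))

# Hypothetical trials: (targets, letter string).
match_trial = includes("AB", "XBQZ")    # "B" occurs in the string
no_match_trial = includes("AB", "XYQZ")  # neither "A" nor "B" occurs
```

A response on a trial would then count as correct when the participant's match/no-match decision agrees with this value.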
- Contains unprocessed reaction time and error data from the Choice response time task. The data include both training and testing trials. The variables of interest are coded in the following columns:
- The "Color" column codes the color in which the arrow stimulus was displayed.
- The "Response_txt.rt" column codes the reaction time (in seconds) measured from the onset of the stimulus (i.e., white or red arrow pointing up, down, left, or right) to the press of the response button.
- The "Response_txt.corr" column codes whether the participant responded correctly (1, i.e., pressing the arrow key with the corresponding direction for white arrows and with the opposing direction for red arrows) or incorrectly (0) on a trial.
- The "Form" column designates the parallel form to which the task belonged (A, B, or C).
- The "participant" column codes the participants' ID numbers.
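The response rule of the Choice response time task (same direction for white arrows, opposite direction for red arrows) can be sketched as follows. This is a minimal Python illustration, not the PsychoPy task code. The string values used for colors and directions and the function names are assumptions made for the example.

```python
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}

def expected_key(color, direction):
    """Arrow key that counts as correct for a given arrow stimulus."""
    if color == "white":
        return direction            # respond in the arrow's direction
    if color == "red":
        return OPPOSITE[direction]  # respond in the opposing direction
    raise ValueError(f"unexpected color: {color!r}")

def is_correct(color, direction, pressed_key):
    """Return 1 for a correct key press and 0 otherwise."""
    return int(pressed_key == expected_key(color, direction))
```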
- NOTE: The .zip file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- This zip file contains individual text files for each ADT stimulus word. These text files contain four word associations provided by each participant to a given stimulus word. Participants provided these associations via an online form after attending the testing session in the laboratory.
- Each text file contains "ID" and "text". These text files are processed and used in the calculations implemented in the R script "Associative typicality and topology calculations.R".
Processed data files:
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains processed response latency data for each condition and task form in long format (i.e., each response of each participant is listed). The latency data were processed by removing erroneous entries and outlier latency data points (3 standard deviations above or below the mean retrieval latency) and by winsorizing (10% two-sided trimming) the remaining latency data. The processed latency data are coded in the "wRT" column.
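The two cleaning steps described above (outlier removal and winsorization) can be sketched in Python as shown below. This is not the authors' implementation, which is in "Raw data processing.R". Two points are assumptions of this sketch: the mean and standard deviation are computed over whatever list of latencies is passed in (the README does not say whether this is done per participant, per condition, or overall), and "10% two-sided" is read as clamping values below the 10th and above the 90th percentile.

```python
from statistics import mean, stdev

def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * s]

def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def winsorize(values, trim=0.10):
    """Clamp values below the trim quantile and above the (1 - trim) quantile."""
    ordered = sorted(values)
    low, high = quantile(ordered, trim), quantile(ordered, 1 - trim)
    return [min(max(v, low), high) for v in values]

# Made-up latencies (in seconds) with one extreme value.
latencies = [1.0] * 20 + [50.0]
cleaned = drop_outliers(latencies)

# Winsorizing leaves the number of data points unchanged.
wrt = winsorize([float(x) for x in range(1, 12)])
```

Unlike outlier removal, winsorization keeps every data point and only pulls the extreme values in to the chosen quantiles.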
ADT.agg.txt
- Contains aggregated (averaged) response latency data (in seconds) for each participant across all trials in wide format. The columns represent averaged response latencies for the associate ("a"), dissociate ("d"), and inhibition cost ("ic") measures across the three (1, 2, 3) parallel forms. Note that inhibition cost is the difference between the dissociate and associate response latencies.
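As a worked example of the inhibition cost definition (dissociate latency minus associate latency), the Python sketch below computes it for each parallel form of one participant. The column names "a1", "d1", "ic1", and so on are an assumption about how the measure letters and form numbers are combined, and the values are made up.

```python
def add_inhibition_cost(row, forms=(1, 2, 3)):
    """Return a copy of row with ic<form> = d<form> - a<form> for each form."""
    out = dict(row)
    for form in forms:
        out[f"ic{form}"] = row[f"d{form}"] - row[f"a{form}"]
    return out

# Made-up averaged latencies (in seconds) for one participant.
example = {"a1": 1.25, "d1": 2.0, "a2": 1.5, "d2": 2.5, "a3": 1.0, "d3": 1.75}
with_ic = add_inhibition_cost(example)
```

A positive inhibition cost means that dissociative retrieval took longer than associative retrieval on that form.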
ADT.IC.txt
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains processed (10% two-sided winsorization) inhibition cost data (column "wIC" in seconds) by participant, stimulus, and form.
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains psycholinguistic data for each ADT stimulus word. The columns code the stimulus word in the Slovak language ("Item_SK"), word length ("Length", expressed in number of letters), word frequency ("Frequency", expressed as the logarithm of word frequency in the Slovak National Corpus database), and word polysemy ("Polysemy", expressed as the number of distinct meanings of a word found in the lexicon). The columns "Con", "Ima", "Conx", "Val", and "Aro" refer to word Concreteness, Imageability, Contextual availability, Emotional valence, and Arousal, respectively. The values of these variables are averaged ratings collected from an independent group of participants (i.e., not participating in the study). The rating scales were:
Concreteness (1 = Abstract ; 7 = Concrete)
Imageability (1 = Difficult to imagine ; 7 = Easy to imagine)
Contextual availability (1 = Low availability of context ; 7 = High availability of context)
Emotional valence (1 = negative ; 7 = positive)
Arousal (1 = calm ; 7 = arousing)
Finally, the columns "Vividness" and "Affective_tone" represent composite z-scores obtained from a principal component analysis of the psycholinguistic data. A higher Vividness score indicates a word that was rated as more concrete, more easily imaginable, and more contextually available. A higher Affective_tone score indicates a word evoking positive affect and low arousal, whereas a lower score indicates a word evoking negative affect and high arousal.
ADT.item(typ&pot).txt
- NOTE: This file is available only at the alternative repository at the Open Science Framework; url: https://osf.io/z98my/
- Contains data on the associative properties of each ADT stimulus word together with the calculated composite scores for Associative typicality and Associative topology. Note that the typicality and topology scores are expressed as standardized z-scores estimated from a principal component analysis.
ACT.proc.txt
- Contains aggregated/averaged response latencies for each participant across all trials within individual task forms (ac1, ac2, ac3).
PS.proc.txt
- Contains aggregated data for all participants and processing speed tasks (p1 = digit symbol substitution; p2 = letter matching; p3 = Choice response time). The aggregated data represent a ratio between average (10% winsorized) reaction time and overall accuracy (i.e., percent correct responses).
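The speed-accuracy ratio described above can be illustrated with a short Python sketch. It is not taken from the authors' scripts. It assumes that accuracy enters the ratio as a proportion between 0 and 1 (if it is expressed in percent, the score differs only by a constant factor of 100) and that the reaction times passed in have already been winsorized. The function name and the data are hypothetical.

```python
from statistics import mean

def speed_accuracy_ratio(reaction_times, correct):
    """Mean reaction time divided by the proportion of correct responses."""
    accuracy = sum(correct) / len(correct)
    if accuracy == 0:
        raise ValueError("accuracy is zero; the ratio is undefined")
    return mean(reaction_times) / accuracy

# Made-up test-block data: reaction times in seconds, 1 = correct, 0 = incorrect.
rts = [0.8, 1.0, 1.2, 1.0]
corr = [1, 1, 1, 0]
score = speed_accuracy_ratio(rts, corr)
```

Because the mean reaction time is divided by accuracy, a higher score reflects slower or less accurate performance.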
VFT.proc.txt
- Contains the number of correctly named category exemplars for each semantic category by individual participants (f1 = animals; f2 = occupations; f3 = tools).
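A fluency score of this kind can be obtained by counting, for each participant and category, the responses that were not marked as errors. The Python sketch below illustrates the idea with invented rows that use the column names of VFT.raw.txt. The exact category strings stored in "Probe_1" are an assumption based on the Slovak labels given in this README, and the practice category is ignored because it has no score column.

```python
from collections import Counter

# Assumed mapping from the Slovak category label to the processed column name.
CATEGORY_TO_COLUMN = {"Zvieratá": "f1", "Zamestnanie": "f2", "Náradie": "f3"}

def fluency_scores(rows):
    """Count error-free responses per (participant, score column)."""
    counts = Counter()
    for row in rows:
        column = CATEGORY_TO_COLUMN.get(row["Probe_1"])  # None for practice trials
        if column is not None and row["Error_2"] == 0:
            counts[(row["participant"], column)] += 1
    return counts

# Invented responses for one participant.
rows = [
    {"participant": 1, "Probe_1": "Zvieratá", "Response": "pes", "Error_2": 0},
    {"participant": 1, "Probe_1": "Zvieratá", "Response": "mačka", "Error_2": 0},
    {"participant": 1, "Probe_1": "Zvieratá", "Response": "pes", "Error_2": 1},
    {"participant": 1, "Probe_1": "Náradie", "Response": "kladivo", "Error_2": 0},
    {"participant": 1, "Probe_1": "Oblečenie", "Response": "kabát", "Error_2": 0},
]

scores = fluency_scores(rows)
```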
WM.proc.txt
- Contains the calculated average working memory span for each participant in each working memory task (w1 = alpha span task; w2 = operation span task; w3 = rotation span task).
Aggregated.data.txt
- Merges all processed data files (from the ADT, ACT, VFT, WM, and PS tasks) into one data frame.
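Merging wide-format files of this kind amounts to joining them on the participant ID. The Python sketch below shows one way to do this with plain dictionaries. It keeps only participants present in every table, which is an assumption of this sketch, since the README does not say how participants with missing data were handled. The tables and values are made up.

```python
def merge_by_participant(*tables):
    """Inner-join dicts of {participant_id: {column: value}} on participant_id."""
    shared_ids = set(tables[0]).intersection(*tables[1:])
    merged = {}
    for pid in sorted(shared_ids):
        row = {}
        for table in tables:
            row.update(table[pid])
        merged[pid] = row
    return merged

# Made-up excerpts of three processed files, keyed by participant ID.
adt = {1: {"a1": 1.25, "d1": 2.0}, 2: {"a1": 1.5, "d1": 2.5}}
vft = {1: {"f1": 24}, 2: {"f1": 19}}
wm = {1: {"w1": 5.5}}  # participant 2 has no working memory data here

merged = merge_by_participant(adt, vft, wm)
```

In this example only participant 1 appears in the merged result, because participant 2 is missing from one of the tables.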
Sharing/Access information
Data, code, and methods are also openly available at the Open Science Framework repository: https://osf.io/z98my/
Code/Software
We provide fully annotated R scripts used to process and analyze the data. Please read all the comments included in the respective R scripts carefully to understand each step in the processing and analysis pipeline. Make sure to start the scripts with an empty (cleared) global environment. We list all R scripts with a brief description below:
Raw data processing.R
- This script was used to process the raw data related to all cognitive tasks employed in the study.
ADT Internal consistency & Reliability.R
- This script relates to Part 1 of the study, which analyzed the internal consistency and short-term temporal stability of the Associative-dissociative task measures (associative retrieval latency, dissociative retrieval latency, and inhibition cost).
Associative typicality and topology calculations.R
- This script was used to process ADT stimulus word association data and to calculate Associative typicality and Associative topology variables for each ADT stimulus word.
Stimulus-level effects.R
- This script includes linear mixed-effect models and correlations reported in Part 2 of the study.
ADT correlations with other tasks.R
- This script includes the correlation analyses between the ADT measures and the other cognitive tasks reported in Part 3 of the study. Note that this script also contains the principal component analyses conducted on the cognitive performance data.
Finally, we provide all tasks and parallel forms employed in the current study together with the materials used. The tasks were programmed in PsychoPy (version 3.2.4). We recommend running the tasks on that specific version (standalone PsychoPy 3.2.4), as newer versions may result in errors.
Methods
All datasets were collected via behavioural testing in a laboratory using a computer. Data referenced in electronic supplementary material were collected via online forms. Data processing is described in the manuscript and supplementary material, and detailed in the supplied R scripts. Details for each dataset and script are provided in the README file.
Usage notes
Supplied data are saved in .csv and .txt format. All data can be accessed via freely available software, including R (for the scripts to process and analyze the data) or JASP. If you download the individual data files, we recommend placing them on the C: drive; otherwise, adjust the corresponding lines (with paths to files) in the respective sections of the R scripts. The individual behavioural tasks used in the current study can also be inspected in the free stand-alone version of the PsychoPy software. Note that running the tasks on PsychoPy versions newer than v3.2.4 may result in errors; we therefore recommend running them specifically on version 3.2.4.