Reranking partisan animosity in algorithmic social media feeds alters affective polarization
Data files
Dec 02, 2025 version files 3.61 MB
- _constants.R (1.46 KB)
- _qvalue_function.R (2.45 KB)
- _utils.R (1.39 KB)
- 01_attrition.R (10.45 KB)
- 02_balance.R (12 KB)
- 03_polarizaton_main.R (7.31 KB)
- 04_emotions_main.R (10.34 KB)
- 05_att_main.R (6.38 KB)
- 06_polarization_hte.R (11.12 KB)
- 07_engagement_analysis.R (11.51 KB)
- 08a_pate_raking.R (8.23 KB)
- 08b_pate_polarizaton.R (8.05 KB)
- 08c_pate_emotions.R (7.46 KB)
- 09_experiment_stats.R (1.52 KB)
- 10_figs_daily_stats.R (3.66 KB)
- baseline_exposure_infeed.csv (71.56 KB)
- daily_stats.csv (5.12 KB)
- emotions_infeed.csv (1.46 MB)
- engagement.csv (640.65 KB)
- pew_survey_proportions.csv (283 B)
- political_infeed.csv (665.94 KB)
- post.csv (89.61 KB)
- pre.csv (320.68 KB)
- raking_user_att.csv (171.12 KB)
- raking_weights.csv (60.11 KB)
- README.md (18.60 KB)
- requirements.md (737 B)
Abstract
Today, social media platforms hold the sole power to study the effects of feed ranking algorithms. We developed a platform-independent method that reranks participants' feeds in real-time and used this method to conduct a preregistered 10-day field experiment with 1,256 participants on X during the 2024 U.S. presidential campaign. Our experiment used a large language model to rerank posts that expressed anti-democratic attitudes and partisan animosity (AAPA). Decreasing or increasing AAPA exposure shifted out-party partisan animosity by two points on a 100-point feeling thermometer, with no detectable differences across party lines, providing causal evidence that exposure to AAPA content alters affective polarization. This work establishes a method to study feed algorithms without requiring platform cooperation, enabling independent evaluation of ranking interventions in naturalistic settings.
Contact
If you have any questions, please feel free to reach out to:
- Martin Saveski (msaveski@uw.edu)
- Tiziano Piccardi (piccardi@jhu.edu)
Setup
Before running the scripts:
- Set the working directory to the code folder: setwd("/path-to-repository/code/")
- Configure the paths in _constants.R
- Make sure that you have installed all the packages listed in requirements.md
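A minimal sketch of a pre-flight check for the last step. The helper name missing_packages is hypothetical, and the package names in the example are placeholders (base R packages), not the contents of requirements.md; substitute the packages listed there.

```r
# Return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Placeholder list: replace with the packages named in requirements.md.
required <- c("stats", "utils")
not_installed <- missing_packages(required)
if (length(not_installed) > 0) {
  message("Install before running the scripts: ",
          paste(not_installed, collapse = ", "))
}
```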
Scripts summary
- 01_attrition.R: Attrition analysis
- 02_balance.R: Covariate balance analysis
- 03_polarizaton_main.R: Affective polarization analysis
- 04_emotions_main.R: Emotions analysis
- 05_att_main.R: Political attitudes analysis
- 06_polarization_hte.R: Affective polarization, heterogeneous treatment effects analysis
- 07_engagement_analysis.R: Engagement analysis
- 08a_pate_raking.R: Population Average Treatment Effects (PATE), initial step, raking
- 08b_pate_polarizaton.R: PATE analysis of the affective polarization outcomes
- 08c_pate_emotions.R: PATE analysis of the emotions outcomes
- 09_experiment_stats.R: Experiment summary statistics
- 10_figs_daily_stats.R: Figures demonstrating the change in exposure due to the intervention
- _constants.R: Common variables used across scripts
- _qvalue_function.R: False discovery rate (FDR) correction for multiple hypothesis testing
- _utils.R: Utility functions
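The exact FDR procedure is defined in _qvalue_function.R. As a point of reference only, here is a minimal sketch of one standard FDR adjustment (Benjamini-Hochberg) using base R's p.adjust; this is an assumption for illustration and may differ from what the script implements. The p-values are made up.

```r
# Illustrative p-values (made up, not results from the study).
p_values <- c(0.001, 0.012, 0.04, 0.2, 0.6)

# Benjamini-Hochberg adjusted p-values.
bh_adjusted <- p.adjust(p_values, method = "BH")

# Hypotheses that remain significant at a 5% false discovery rate.
significant <- bh_adjusted <= 0.05
```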
Data description
pre.csv
Participants' responses to the pre-experiment survey.
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| condition | Treatment arm (T: Treatment; C: Control). |
| platform | Platform from which the participant was recruited (CloudResearch or Bovitz). |
| gender | Self-reported gender. |
| age | Age in years. Excluded from the public release due to Dryad's requirements. |
| race_white | Race/Ethnicity indicator, White. |
| race_black | Race/Ethnicity indicator, Black. |
| race_hispanic | Race/Ethnicity indicator, Hispanic/Latino. |
| race_asian | Race/Ethnicity indicator, Asian. |
| race_other | Race/Ethnicity indicator, Other. |
| education | Education level. |
| ladder | Subjective socioeconomic ladder. |
| income | Household income. |
| party | Party identification. |
| party_strength | Strength of party identification (_1_weak: selected "Not very strong" or initially reported being Independent; _2_strong: selected "Strong"). |
| party_identity | Self-reported importance of party identity (scale: 0–100). |
| fth_outparty_pre | Response to the out-party feeling thermometer (scale: 0–100). |
| fth_inparty_pre | Response to the in-party feeling thermometer (scale: 0–100). |
| att_suc_pre | Support for antidemocratic candidates (scale: 0–100; index average of 4 questions). |
| att_ada_pre | Support for antidemocratic practices (scale: 0–100; index average of 4 questions). |
| att_spv_pre | Support for partisan violence (scale: 0–100; index average of 4 questions). |
| att_bepf_pre | Biased evaluation of politicized facts (scale: 0–100; index average of 4 questions). |
| att_sup_bip_pre | Opposition to bipartisanship (scale: 0–100; index average of 2 questions). |
| att_soc_dis_pre | Social distance (scale: 0–100; index average of 2 questions). |
| att_soc_tru_pre | Social trust (scale: 0–100; single question). |
| emo_enthusiastic_pre | Rating of feeling "Enthusiastic" (scale: 1–5). |
| emo_happy_pre | Rating of feeling "Happy" (scale: 1–5). |
| emo_still_pre | Rating of feeling "Still" (scale: 1–5). |
| emo_lonely_pre | Rating of feeling "Lonely" (scale: 1–5). |
| emo_sad_pre | Rating of feeling "Sad" (scale: 1–5). |
| emo_nervous_pre | Rating of feeling "Nervous" (scale: 1–5). |
| emo_satisfied_pre | Rating of feeling "Satisfied" (scale: 1–5). |
| emo_calm_pre | Rating of feeling "Calm" (scale: 1–5). |
| emo_relaxed_pre | Rating of feeling "Relaxed" (scale: 1–5). |
| emo_tired_pre | Rating of feeling "Tired" (scale: 1–5). |
| emo_fearful_pre | Rating of feeling "Fearful" (scale: 1–5). |
| emo_aroused_pre | Rating of feeling "Aroused" (scale: 1–5). |
| emo_excited_pre | Rating of feeling "Excited" (scale: 1–5). |
| emo_bored_pre | Rating of feeling "Bored" (scale: 1–5). |
| emo_angry_pre | Rating of feeling "Angry" (scale: 1–5). |
post.csv
Participants' responses to the post-experiment survey.
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| fth_outparty_post | Response to the out-party feeling thermometer (scale: 0–100). |
| att_suc_post | Support for antidemocratic candidates (scale: 0–100; index average of 4 questions). |
| att_ada_post | Support for antidemocratic practices (scale: 0–100; index average of 4 questions). |
| att_spv_post | Support for partisan violence (scale: 0–100; index average of 4 questions). |
| att_bepf_post | Biased evaluation of politicized facts (scale: 0–100; index average of 4 questions). |
| att_sup_bip_post | Opposition to bipartisanship (scale: 0–100; index average of 2 questions). |
| att_soc_dis_post | Social distance (scale: 0–100; index average of 2 questions). |
| att_soc_tru_post | Social trust (scale: 0–100; single question). |
| emo_sad_post | Rating of feeling "Sad" (scale: 1–5). |
| emo_calm_post | Rating of feeling "Calm" (scale: 1–5). |
| emo_excited_post | Rating of feeling "Excited" (scale: 1–5). |
| emo_angry_post | Rating of feeling "Angry" (scale: 1–5). |
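pre.csv and post.csv share the user_id column, so the two surveys can be linked per participant. A minimal sketch, assuming toy data frames in place of the real files (with the real data, read.csv("pre.csv") and read.csv("post.csv") would replace them). All values are made up, and the derived column fth_outparty_change is a hypothetical name, not part of the dataset.

```r
# Toy stand-ins for pre.csv and post.csv (values are made up).
pre <- data.frame(
  user_id = c("a", "b", "c", "d"),
  condition = c("T", "T", "C", "C"),
  fth_outparty_pre = c(20, 30, 25, 35)
)
post <- data.frame(
  user_id = c("a", "b", "c"),        # "d" has no post-survey row
  fth_outparty_post = c(26, 34, 25)
)

# Inner join on the anonymized ID keeps participants with both surveys.
panel <- merge(pre, post, by = "user_id")

# Pre-to-post change in the out-party feeling thermometer (hypothetical column).
panel$fth_outparty_change <- panel$fth_outparty_post - panel$fth_outparty_pre

# Mean change by treatment arm.
change_by_condition <- tapply(panel$fth_outparty_change, panel$condition, mean)
```

This is a raw descriptive comparison only; the estimates reported in the manuscript come from the analysis scripts, not from this sketch.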
political_infeed.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| condition | Treatment arm (T: Treatment; C: Control). |
| platform | Platform from which the participant was recruited (CloudResearch or Bovitz). |
| baseline | Mean of the participant's in-feed responses to the out-party feeling thermometer question during the 3-day baseline period (imputed from the participant's pre-survey response if they did not complete any in-feed surveys during the baseline period). |
| answer | Response to the in-feed out-party feeling thermometer question (scale: 0–100). |
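The baseline column already contains the imputed values, so no recomputation is needed. To make the rule concrete, here is a minimal sketch of the fallback logic described above; the function name baseline_with_fallback and its arguments are hypothetical, and the numbers are made up.

```r
# Baseline = mean of a participant's in-feed responses during the baseline
# period, or their pre-survey response when no in-feed responses exist.
baseline_with_fallback <- function(infeed_responses, pre_response) {
  observed <- infeed_responses[!is.na(infeed_responses)]
  if (length(observed) > 0) mean(observed) else pre_response
}

# Participant who answered three in-feed surveys during the baseline period.
with_infeed <- baseline_with_fallback(c(20, 30, 40), pre_response = 50)

# Participant with no in-feed responses: falls back to the pre-survey value.
without_infeed <- baseline_with_fallback(numeric(0), pre_response = 50)
```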
emotions_infeed.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| condition | Treatment arm (T: Treatment; C: Control). |
| emo_name | Emotion measured in-feed (angry, sad, calm, or excited). |
| value | In-feed survey response (0–100). |
| platform | Platform from which the participant was recruited (CloudResearch or Bovitz). |
| baseline | Mean of the participant's responses for the given emotion during the 3-day baseline period (imputed from the participant's pre-survey response for the same emotion if they did not complete any in-feed surveys for this emotion during the baseline period). |
engagement.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| condition | Treatment arm (T: Treatment; C: Control). |
| platform | Platform from which the participant was recruited (CloudResearch or Bovitz). |
| outcome | Engagement metric, including: n_sessions_daily (number of sessions per day); timespent_daily (time spent on the platform per day, in seconds); n_views (number of posts viewed); n_views_1s (number of posts viewed for at least one second); n_likes (number of posts liked); n_reposts (number of posts reposted); likes_rate (fraction of posts liked among all viewed posts); reposts_rate (fraction of posts reposted among all viewed posts). |
| baseline_value | Value of the outcome measured during the 3-day baseline period. |
| experiment_value | Value of the outcome measured during the 7-day experimental period. |
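engagement.csv is in long format: one row per participant and outcome. A minimal sketch of how one metric could be selected and compared across arms while adjusting for its baseline value. The data frame below is a toy stand-in with made-up values, the treated column is a hypothetical helper, and the linear model is a simple illustration, not necessarily the specification used in 07_engagement_analysis.R.

```r
# Toy stand-in for engagement.csv (values are made up).
engagement <- data.frame(
  user_id = rep(c("a", "b", "c", "d", "e", "f"), times = 2),
  condition = rep(c("T", "T", "T", "C", "C", "C"), times = 2),
  outcome = rep(c("n_likes", "n_views"), each = 6),
  baseline_value = c(3, 5, 7, 4, 6, 8, 100, 150, 170, 120, 160, 180),
  experiment_value = c(5, 8, 9, 4, 7, 8, 210, 330, 360, 250, 340, 390)
)

# Keep the rows for a single engagement metric.
likes <- subset(engagement, outcome == "n_likes")

# Hypothetical 0/1 treatment indicator derived from the condition column.
likes$treated <- as.integer(likes$condition == "T")

# Baseline-adjusted comparison of the experimental-period value.
fit <- lm(experiment_value ~ treated + baseline_value, data = likes)
treated_coef <- coef(fit)[["treated"]]
```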
pew_survey_proportions.csv
| Column | Description |
|---|---|
| att | User attribute, including party ID (Democrat, Republican), education (with or without a college degree), and race (white or non-white). |
| value | Value for the given attribute. |
| Freq | Percentage of X users with the given value for the given attribute, as estimated by the Pew survey. |
raking_user_att.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| race_white | Indicator for whether the participant selected "White". |
| party | Party identification. |
| education | Education level. |
| rk_party_id | Coarsened version of the party variable used for raking. |
| rk_education | Coarsened version of the education variable used for raking. |
| rk_race_white | Coarsened version of the race_white variable used for raking. |
raking_weights.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| rk_wt | Raking weight. |
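The weights in raking_weights.csv are produced by 08a_pate_raking.R from raking_user_att.csv and pew_survey_proportions.csv. For readers unfamiliar with raking, here is a minimal base-R sketch of the general technique (iterative proportional fitting of weights to target margins). It is an illustration under stated assumptions, not the authors' implementation: the function rake_weights is hypothetical, and the category labels, sample, and target proportions are made up rather than taken from the Pew file.

```r
# Rake unit weights so that weighted sample margins match target proportions.
# `data`: data frame of categorical variables.
# `targets`: named list (one entry per variable) of named proportion vectors.
rake_weights <- function(data, targets, max_iter = 100, tol = 1e-8) {
  w <- rep(1, nrow(data))
  for (iter in seq_len(max_iter)) {
    max_gap <- 0
    for (v in names(targets)) {
      groups <- as.character(data[[v]])
      # Current weighted share of each category.
      current <- vapply(split(w, groups), sum, numeric(1)) / sum(w)
      target <- targets[[v]][names(current)]
      max_gap <- max(max_gap, abs(current - target))
      # Rescale each unit by its category's target-to-current ratio.
      w <- unname(w * (target / current)[groups])
    }
    if (max_gap < tol) break
  }
  w / mean(w)  # normalize so the weights average to 1
}

# Toy sample and made-up targets (not the Pew estimates).
toy_sample <- data.frame(
  rk_party_id = c("Democrat", "Democrat", "Democrat", "Democrat",
                  "Republican", "Republican"),
  rk_education = c("college", "college", "no_college", "no_college",
                   "no_college", "college")
)
toy_targets <- list(
  rk_party_id = c(Democrat = 0.5, Republican = 0.5),
  rk_education = c(college = 0.4, no_college = 0.6)
)
toy_weights <- rake_weights(toy_sample, toy_targets)
```

After convergence, the weighted share of each category in toy_sample matches the corresponding entry in toy_targets.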
baseline_exposure_infeed.csv
| Column | Description |
|---|---|
| user_id | Anonymized participant ID. |
| b_f_political_tweets | Fraction of political posts the participant was exposed to during the 3-day baseline period. |
daily_stats.csv
| Column | Description |
|---|---|
| experiment | Experiment arm (R: Reduced Exposure; I: Increased Exposure). |
| condition | Treatment arm (T: Treatment; C: Control). |
| day_index | Study day index (10 days, indexed 0–9; 0–2 baseline, 3–9 intervention period). |
| f_politics_mean | Mean fraction of political posts per participant on the given experiment day. |
| f_politics_ci | 95% confidence interval for the mean fraction of political posts per participant on the given experiment day. |
| f_aapa_mean | Mean fraction of AAPA posts per participant on the given experiment day. |
| f_aapa_ci | 95% confidence interval for the mean fraction of AAPA posts per participant on the given experiment day. |
| m_aapa_score_mean | Mean AAPA score across all political posts per participant on the given experiment day. |
| m_aapa_score_ci | 95% confidence interval for the mean AAPA score across all political posts per participant on the given experiment day. |
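When plotting or summarizing daily_stats.csv, it can help to map day_index to its study period. A minimal sketch following the indexing given above (0–2 baseline, 3–9 intervention); the function name label_period is hypothetical.

```r
# Map a study day index (0-9) to its period.
label_period <- function(day_index) {
  stopifnot(all(day_index %in% 0:9))
  ifelse(day_index <= 2, "baseline", "intervention")
}

periods <- label_period(0:9)
```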
Note on the survey questions
The emotion questions (with the prefix emo_ in pre.csv and post.csv) asked participants: "Please indicate how often you felt each of the following emotions while reading social media like Twitter/X last week." Participants could select: "Never" (1), "A small amount of the time" (2), "Half the time" (3), "Most of the time" (4), or "All the time" (5).
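For tables or figures, the numeric codes can be mapped back to the response labels quoted above. A minimal sketch; the names emotion_labels and decode_emotion are hypothetical.

```r
# Response labels in coding order (1 through 5), as quoted in the survey note.
emotion_labels <- c(
  "Never",
  "A small amount of the time",
  "Half the time",
  "Most of the time",
  "All the time"
)

# Convert numeric codes (1-5) into an ordered factor of response labels.
decode_emotion <- function(codes) {
  factor(codes, levels = 1:5, labels = emotion_labels, ordered = TRUE)
}

decoded <- decode_emotion(c(1, 3, 5))
```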
The questions related to political attitudes (with the prefix att_ in pre.csv and post.csv) were borrowed from Voelkel et al. (2024). The exact phrasing of the questions is provided in Section S0.3 of their Supplementary Materials.
Note on excluding the age attribute from the public release of the data
To minimize the risk of deanonymization, Dryad requires that at most three demographic attributes per participant be included in the public release of the data. Accordingly, we excluded the age attribute, originally included in pre.csv, from the public release. As a result, the following scripts will not exactly replicate some of the results reported in the manuscript: 01_attrition.R, 02_balance.R, and 06_polarization_hte.R. The comments in these scripts note which analyses are affected and include the results of running the scripts with and without the age attribute. The substantive conclusions of all analyses remain the same.
Human subjects data
We take the original ID, concatenate it with a secret salt string, and hash the resulting string. Hashing ensures that the original IDs can’t be easily recovered, and adding the salt protects against dictionary-based attacks, where an attacker may have a list of Bovitz or CloudResearch IDs to hash and compare against the anonymized ones. We received user consent to publish the data in de-identified form.
The dataset was collected with custom instrumentation: a browser extension, web surveys, and in-feed surveys added to the participants' feeds on X.
