Twitter data reveal six distinct environmental personas

Chang, Charlotte; Armsworth, Paul; Masuda, Yuta

Published May 09, 2022; Updated Oct 12, 2023 on Dryad. https://doi.org/10.5061/dryad.79cnp5ht0

Data files

May 09, 2022 version files 19.83 MB

Oct 12, 2023 version files 19.83 MB

Abstract

Effective digital environmental communication is integral to galvanizing public support for conservation in the age of social media. Environmental advocates require messaging strategies suited to social media platforms, including ways to identify, target, and mobilize distinct audiences. Here, we provide – to the best of our knowledge – the first systematic characterization of environmental personas on social media. Beginning with 1 million environmental nongovernmental organization (NGO) followers on Twitter, of which 500,000 users met data quality criteria, we identified six personas that differ in their expression of 21 environmental issues. General consistency in the proportional composition of personas was detected across 14 countries with sufficiently large samples. Within the US, although the six personas varied in their mean political ideology, we did not observe that the personas split along political party lines. Our results pave the way for environmental advocates – including NGOs, public agencies, and researchers – to use audience segmentation methods like the one discussed here to target and tailor messages to distinct constituencies at speed and scale. This repository contains several tabular files that can be used to query user data from Twitter or to reproduce the main results of the article.

Replication materials documentation for "Twitter data reveal six distinct environmental personas"

This replication code and dataset accompany the manuscript at https://doi.org/10.1002/fee.2510. I describe the replication datasets below and include a SHA256 checksum for each file so that you can verify the integrity of your download (run shasum -a 256 FILENAME on the command line, or use any other utility that computes SHA256 checksums).
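For readers without shasum, the same check can be sketched in Python using only the standard library. This is an illustrative helper, not part of the replication code: the function names and the assumption that the files sit in the current directory are mine, and the digests are the ones listed with each dataset in this README.

```python
import hashlib
from pathlib import Path

# Expected digests copied from the dataset descriptions in this README.
EXPECTED_SHA256 = {
    "MeanViewpoints.tsv": "5d540edcb39c8d7a14db315b5eaeed83689021ec43cbe16a1c7eb4467c943098",
    "UserTweetIDs.txt": "fbe1da240a5ab9d9aebac0aabbde247e6eeebfa77c5471cdb6136f45110b1111",
    "EnvironmentalPundits.tsv": "a6a987d934dea75e8ba2329820d6cfe354af0991f2bdbd4746b0f83ad6dafaa3",
    "Persona_PoliticalIdeology.tsv": "e568d9737cbd7c0b1b1ce61a6c9c8294f14a62d934446cc0d618ebf091bf1a13",
    "US_geography.tsv": "8ba7e0ca437639656e25a473c4aec281e828e59941af847d8865bf4eddf1371d",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to 'ok', 'MISMATCH', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        if not path.exists():
            results[name] = "missing"
        elif sha256_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "MISMATCH"
    return results


if __name__ == "__main__":
    # Assumes the downloaded files are in the current working directory.
    for name, status in verify_downloads().items():
        print(f"{name}: {status}")
```

A "MISMATCH" most likely means the download was incomplete or altered and should be fetched again.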

Datasets

MeanViewpoints.tsv contains the mean and standard error of the mean (SEM) of the issue viewpoints shown in Figure 1 for the six personas, in "long" format (one row per combination of persona and issue)
- Columns:
  - variable: name of the environmental issue
  - mean: persona-level mean viewpoint value for that issue
  - SEM: standard error of the mean
  - Persona: abbreviated name for the six personas (SMA: Smart alecks, GEN: Generalists, STE: Stewards, CLC: Climate concerned, TEC: Technocrats, RES: Reserved)
- SHA256: 5d540edcb39c8d7a14db315b5eaeed83689021ec43cbe16a1c7eb4467c943098
UserTweetIDs.txt contains one tweet ID for each of the 1+ million users in our sample. These tweet IDs can be "hydrated" and used to find the users sampled in our study.
- Columns:
  - TweetID: single column listing one tweet ID per user
- SHA256: fbe1da240a5ab9d9aebac0aabbde247e6eeebfa77c5471cdb6136f45110b1111
EnvironmentalPundits.tsv contains the user names and IDs for the environmental pundits whose timelines were used as the data source to train the probabilistic latent Dirichlet allocation topic model.
- Columns:
  - Screenname: User name (e.g. GretaThunberg, which you can use to navigate to twitter.com/GretaThunberg)
  - ID: User ID
- SHA256: a6a987d934dea75e8ba2329820d6cfe354af0991f2bdbd4746b0f83ad6dafaa3
Persona_PoliticalIdeology.tsv provides the mean political ideology score for the six personas
- Columns:
  - mean: mean political ideology score
  - SEM: standard error of the mean
  - Persona: abbreviated name for the six personas
- SHA256: e568d9737cbd7c0b1b1ce61a6c9c8294f14a62d934446cc0d618ebf091bf1a13
US_geography.tsv shows the state-level ranks for each persona
- Columns:
  - name: State name
  - Persona: abbreviated name for the six personas
  - Rank: Rank among the 50 states plus Washington, DC
- SHA256: 8ba7e0ca437639656e25a473c4aec281e828e59941af847d8865bf4eddf1371d
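As an example of working with the long-format files, the sketch below reads MeanViewpoints.tsv into a nested dictionary keyed by persona and then by issue. It assumes the file is tab-delimited with a header row whose column names match the ones listed above (variable, mean, SEM, Persona); the function names are hypothetical and are not taken from Plotting.R.

```python
import csv
from collections import defaultdict


def load_mean_viewpoints(path):
    """Read the long-format table into {persona: {issue: (mean, sem)}}.

    Assumes a tab-delimited file whose header row contains the columns
    variable, mean, SEM, and Persona, as described in this README.
    """
    table = defaultdict(dict)
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            table[row["Persona"]][row["variable"]] = (
                float(row["mean"]),
                float(row["SEM"]),
            )
    return dict(table)


def issues_by_mean(table, persona, n=3):
    """Return the n issue names with the largest mean value for a persona."""
    issues = table[persona]
    return sorted(issues, key=lambda issue: issues[issue][0], reverse=True)[:n]
```

With the file downloaded, issues_by_mean(load_mean_viewpoints("MeanViewpoints.tsv"), "STE") would list the three issues with the largest mean values for the Stewards persona. How to interpret the scale of those values is described in the article, not here.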

Code

Scraper.py provides code that can be used to obtain user information from the UserTweetIDs.txt data file above to reproduce the user set in our analysis.
Plotting.R provides code to reproduce the plots in the main text.
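Before hydrating, the tweet IDs need to be read and split into request-sized groups. The sketch below shows one way to do that; it is not a reproduction of Scraper.py. The helper names are hypothetical, the default batch size of 100 is an assumption based on the per-request limit that Twitter's tweet lookup endpoints have used, and the network call itself is left out because API access and credentials vary.

```python
def read_tweet_ids(path):
    """Return the tweet IDs in the file as strings, one per input line.

    Non-numeric lines (blank lines or a possible header row) are skipped.
    """
    ids = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            token = line.strip()
            if token.isdigit():
                ids.append(token)
    return ids


def batched(ids, batch_size=100):
    """Yield consecutive slices of at most batch_size IDs.

    batch_size=100 is an assumed per-request limit; adjust it to whatever
    the lookup endpoint you use actually allows.
    """
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


if __name__ == "__main__":
    tweet_ids = read_tweet_ids("UserTweetIDs.txt")
    for batch in batched(tweet_ids):
        # Placeholder: pass ",".join(batch) to the hydration request of your choice.
        pass
```

IDs are kept as strings so that they can be joined directly into a request parameter without any risk of numeric conversion altering them.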
