Data from: Dynamics and biases of online attention: the case of aircraft crashes

García-Gavilanes R, Tsvetkova M, Yasseri T

Date Published: September 13, 2016



Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0; Open Data).

Title: Dynamics and Biases of Online Attention: The Case of Aircraft Crashes
Description: This is a text file named "Readme.txt" in the dataset folder submitted to Royal Society Open Science in May 2016. It describes the dataset used for the article "Dynamics of Online Attention: The case of Airline Crashes" by Ruth García-Gavilanes, Milena Tsvetkova and Taha Yasseri.

Version of Paper 2.0 Release
============================================

Dataset description
===================
The dataset covers articles classified as aircraft incidents or accidents in English and Spanish Wikipedia. The articles belong to the categories “Aviation accidents and incidents by country” and “Aviation accidents and incidents by year”, which in principle cover all airline accidents and incidents, in different countries and throughout history, available in Wikipedia as of December 2015. In total we obtained 1496 articles in English Wikipedia and 488 articles in Spanish Wikipedia.

File description
===================

flights_en.txt and flights_es.txt have the following columns:
================================================================
These files contain information about the articles and the events associated with them.
The files have the following columns:

flight: Name of the article in Wikipedia (be aware that Wikipedia can redirect these articles to other names)
langs: The number of languages containing a version of the article
flight.en/: The translated name in English (for flights_es.txt) or Spanish (for flights_en.txt)
date: The date the “event” occurred
The date the Wikipedia page was first edited
The date of maximum views in the timeline of the article
deaths: Number of deaths resulting from the accident or incident
longitude/latitude: The longitude and latitude of the accident or incident (when not available we made an approximation)
company_long/company_lat: The coordinates of the country where the airline headquarters is located
aircraft_company_continent: The continent where the airline headquarters is located
aircraft_company_country: The country where the airline headquarters is located
region_event: The continent where the event/incident occurred
country_event: The country where the event/incident occurred

Folder pageviews
================================================================
The folder has two subfolders: enwiki and eswiki. Each subfolder contains one subfolder per article in the dataset, named after the article title, in English (1496) and Spanish (488). Each article folder contains files named in the format title_2008_2015.txt, where *title* is either the name of a redirect or the current name of the article. For example, the directory pageviews/enwiki/1912_Brooklands_Flanders_Monoplane_crash contains two files: Monoplane_Committee_2008_2015.txt and 1912_Brooklands_Flanders_Monoplane_crash_2008_2015.txt. The file Monoplane_Committee_2008_2015.txt contains the page views of the Wikipedia article Monoplane_Committee from 2008 to 2015.
Each file has the following columns:

rd.views: Number of views
date: Date (yyyy-mm-dd)

Contact and scripts
================================================================
Source code in R to extract page views of articles and redirects:
The pageviews are extracted from
Data availability starts from Dec-2007.
Since 2015 there is an API in R for extracting page views
@ruthygarciag
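
Because each article folder holds one file per redirect plus one for the current title, a natural first step is to add up the daily views across all files in a folder. The Python sketch below illustrates this under stated assumptions; it is not the authors' R code. The README does not state the field delimiter, so the sketch assumes a header row naming rd.views and date, fields separated by tabs, commas, or spaces, and no extra row-name column. The function names read_views and article_views are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path


def _split(line):
    # Assumption: fields are separated by tabs, commas, or spaces and
    # may be wrapped in double quotes.
    return [field.strip('"') for field in re.split(r"[,\t ]+", line.strip())]


def read_views(path):
    """Return {date: views} for one title_2008_2015.txt file."""
    text = Path(path).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]
    header = _split(lines[0])
    # Column names come from the README; their order is not assumed.
    i_views, i_date = header.index("rd.views"), header.index("date")
    views = {}
    for line in lines[1:]:
        fields = _split(line)
        views[fields[i_date]] = int(fields[i_views])
    return views


def article_views(article_dir):
    """Sum daily views over every redirect/title file in one article folder."""
    total = Counter()
    for path in sorted(Path(article_dir).glob("*_2008_2015.txt")):
        total.update(read_views(path))  # Counter.update adds the counts
    return dict(total)
```

For the example directory named above, the call would be article_views("pageviews/enwiki/1912_Brooklands_Flanders_Monoplane_crash"), which combines the views of both files into a single date-indexed mapping.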
Download (66.91 Mb)

When using this data, please cite the original publication:

García-Gavilanes R, Tsvetkova M, Yasseri T (2016) Dynamics and biases of online attention: the case of aircraft crashes. Royal Society Open Science 3(10): 160460.

Additionally, please cite the Dryad data package:

García-Gavilanes R, Tsvetkova M, Yasseri T (2016) Data from: Dynamics and biases of online attention: the case of aircraft crashes. Dryad Digital Repository.