ORCID wasn't intended as a massive longitudinal survey of the global population of scientists, but with 3 million profiles and growing, it is becoming just that. So far, a quarter of those researchers have voluntarily added personal information to their public ORCID profiles, including the years, locations, and descriptions of their education and employment histories. As this voluntary sample grows, the demographic and migration patterns of the scientific workforce are coming into focus. The biases are also apparent: ORCID users skew young, and certain countries are over- and underrepresented. The code for processing and analyzing the raw profile data is offered here to help researchers explore ORCID, the largest open repository of scientific careers. [NOTE: further context for this data package is provided in Bohannon (2017) 'Restless Minds' at http://dx.doi.org/10.1126/science.356.6339.690].
ORCID public profiles
This file contains 2.8 million public ORCID profiles in both XML and JSON formats. You will need 300 GB of free disk space to decompress and work with the data.
public_profiles.tar
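Because of the disk space requirement, it may help to check the free space on the target drive before extracting, or to read profiles straight from the archive one at a time. The following is a minimal sketch using only the Python standard library, not the notebook's own method. It assumes that the 300 GB figure means decimal gigabytes and that profiles are stored as individual archive members whose names end in .json or .xml; neither detail is documented here, and the function names are hypothetical.

```python
import shutil
import tarfile

# The 300 GB requirement, read as decimal gigabytes (an assumption).
REQUIRED_BYTES = 300 * 10**9


def has_enough_space(target_dir, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding target_dir has at least required_bytes free."""
    return shutil.disk_usage(target_dir).free >= required_bytes


def iter_members(tar_path, suffix=".json"):
    """Yield (name, raw bytes) for each regular file in the archive whose name ends with suffix.

    Members are read one at a time, so nothing is extracted to disk.
    The suffix filter reflects an assumed archive layout.
    """
    with tarfile.open(tar_path, "r:*") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(suffix):
                handle = archive.extractfile(member)
                if handle is not None:
                    yield member.name, handle.read()
```

For example, has_enough_space(".") reports whether the current drive can hold the decompressed data, and iter_members("public_profiles.tar") walks the JSON profiles without unpacking the whole archive. If the archive turns out to contain nested compressed files rather than individual profiles, the suffix filter would need to be adjusted.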
IPython Notebook to process ORCID profiles
This IPython Notebook provides code for processing the data from public_profiles.tar into manageable data files for analysis.
Process ORCID profiles.html
Process ORCID profiles.ipynb
ORCID migrations
This file contains all affiliations (education and employment) and associated data. It is part of the output of the IPython Notebook above.
ORCID_migrations_2016_12_16.csv
ORCID migrations by person
This file is an aggregation of affiliation data for each person. It is part of the output of the IPython Notebook above.
ORCID_migrations_2016_12_16_by_person.csv
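The column names of the two CSV files are not listed in this description, so a first look at either file might simply report its header and row count. The sketch below does that with the standard library only. It assumes the files have a single header row and are UTF-8 encoded; the function name is hypothetical.

```python
import csv


def summarize_csv(path, encoding="utf-8"):
    """Return (column_names, row_count) for a CSV file with one header row.

    The file is streamed record by record, so memory use stays small
    even for a large file.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Calling summarize_csv("ORCID_migrations_2016_12_16_by_person.csv") would return the column names and the number of data rows, which is a reasonable sanity check before opening the analysis notebook.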
IPython Notebook to analyze migrations
This IPython Notebook provides analysis of the CSV files above.
ORCID data analysis.html
ORCID data analysis.ipynb
ResearchGate - researcher movements from 2006
This file was provided to Science magazine by ResearchGate. It was described as being based on "over one million of our members who display education and employment affiliation on their profiles" with the data representing "percentages of total movements of researchers from country to country since 2006."
20170110_Science_Researcher movements from 2006.xlsx