Zillow property-level data panel for select California cities – before and after 2020
Data files
Jul 14, 2024 version files (3.48 MB total):
- DataCode.zip (3.33 MB)
- DataDryad_DataDescription_Petersen_Zillow.pdf (147.34 KB)
- README.md (1.60 KB)
Abstract
This dataset provides codebooks for analyzing property (house, condo, flat, etc.) listing data for 10 select regions in the Bay Area megaregion of California, USA (SAN JOSE, MODESTO, FRESNO, TURLOCK, LIVINGSTON, ATWATER, MERCED, MADERA, MARIPOSA, OAKHURST). The listing data were obtained from Zillow Inc. on a monthly basis between March 2018 and May 2019 (denoted as the period before 2020) and between May 2020 and September 2021 (after 2020). Combined, the total number of observations (unique listed properties) is N = 57,414. For each month, we obtained a set of unique listing identifiers (ZPID) by manually scanning the entire Zillow.com directory for a given region and property type (“For Sale” and “Rent”). See the enclosed document DataDryad_DataDescription_Petersen_Zillow.pdf for a description of the data and of the output of the provided supporting code. Contact the corresponding author for the raw property-level data files, in which the property address and property identifier (ZPID) fields are anonymized.
Brief summary
This dataset provides codebooks for analyzing property (house, condo, etc.) listing data for 10 select regions. The listing data were obtained from Zillow Inc. on a monthly basis between March 2018 and May 2019 (denoted as the period before 2020) and between May 2020 and September 2021 (after 2020). For each month, we obtained a set of unique listing identifiers (ZPID) by manually scanning the entire Zillow.com directory for a given region and property type (“For Sale” and “Rent”). Combined, the total number of observations (unique listed properties) is N = 57,414.
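The two observation windows can be made explicit in code. The following is a minimal Python sketch, assuming each record carries the year and month in which it was collected. The helper name `collection_period` and its label strings are illustrative assumptions and are not part of the deposited Mathematica or Stata code.

```python
from datetime import date

# Observation windows stated in the summary (month granularity, inclusive).
BEFORE_2020 = (date(2018, 3, 1), date(2019, 5, 1))
AFTER_2020 = (date(2020, 5, 1), date(2021, 9, 1))


def collection_period(year: int, month: int) -> str:
    """Label a collection month as 'before 2020', 'after 2020', or 'outside'.

    Hypothetical helper: the label strings are illustrative only.
    """
    d = date(year, month, 1)
    if BEFORE_2020[0] <= d <= BEFORE_2020[1]:
        return "before 2020"
    if AFTER_2020[0] <= d <= AFTER_2020[1]:
        return "after 2020"
    return "outside"
```

Months that fall in the gap between the two windows (June 2019 through April 2020) are labeled `outside`, which makes it easy to confirm that no record is assigned to the wrong period.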
Description of the Data and file structure
All data files are provided in CSV (comma-separated values) format. See DataDryad_DataDescription_Petersen_Zillow.pdf for details of the data file structure. Contact the corresponding author for the raw property-level data files, in which the property address and property identifier (ZPID) fields are anonymized.
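For readers who want to inspect the CSV files outside Mathematica or Stata, the sketch below shows one way to read a file with a header row using Python's standard library. The column names in the in-memory sample (`region`, `listing_type`, `price`) are assumptions made for illustration; the actual fields are documented in the PDF description.

```python
import csv
import io
from typing import Dict, Iterable, List


def read_listing_rows(stream: Iterable[str]) -> List[Dict[str, str]]:
    """Parse CSV text with a header row into a list of dicts keyed by column name."""
    return list(csv.DictReader(stream))


# Illustrative in-memory sample; the column names and values are assumptions,
# not the documented schema or real data.
sample = io.StringIO(
    "region,listing_type,price\n"
    "MERCED,For Sale,350000\n"
    "FRESNO,Rent,1800\n"
)
rows = read_listing_rows(sample)
```

To read a file from disk instead, pass an open file handle, for example `open(path, newline="")`, where `path` points to one of the CSV files extracted from DataCode.zip. All values are returned as strings, so numeric fields need explicit conversion.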
Sharing/access Information
Source data locations:
- https://www.zillow.com/
- https://www.zillow.com/howto/api/GetSearchResults.htm (this link has been deactivated; an archived copy is available at https://web.archive.org/web/20200629170042/https://www.zillow.com/howto/api/GetSearchResults.htm)
- https://fred.stlouisfed.org/series/MORTGAGE30US
We used the open-access Zillow Inc. GetSearchResults API to sample house data for each ZPID, in accordance with daily API call limits. For more information on the API, see the official documentation page (https://www.zillow.com/howto/api/GetSearchResults.htm; now deactivated, with the archived copy listed above). We anonymized the property address and ZPID fields.
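Sampling under a daily call limit amounts to splitting the list of ZPIDs into batches, one batch per day. The sketch below illustrates that idea only; it makes no network requests, and the function name `daily_batches`, the placeholder identifiers, and the limit of 3 are all assumptions. The source does not state the actual daily limit or how the authors scheduled their calls.

```python
from typing import List, Sequence


def daily_batches(zpids: Sequence[str], daily_limit: int) -> List[List[str]]:
    """Split ZPIDs into consecutive batches of at most `daily_limit` items,
    one batch per day of API calls."""
    if daily_limit <= 0:
        raise ValueError("daily_limit must be positive")
    return [
        list(zpids[i:i + daily_limit])
        for i in range(0, len(zpids), daily_limit)
    ]


# Placeholder identifiers; real ZPIDs are anonymized in the deposited data.
example_ids = [f"zpid_{n}" for n in range(7)]
# A limit of 3 is arbitrary and chosen only to keep the example small.
schedule = daily_batches(example_ids, daily_limit=3)
```

Each inner list in `schedule` holds the identifiers to query on one day, and the order of the original list is preserved across batches.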
Programs required: Mathematica 11 (or later) and Stata 13 (or later). To run a Mathematica notebook, press Shift+Enter to execute the commands in a given cell. The initial cells load the data files; the remaining cells should then be executed in linear order from start to end.