
Yards, block groups, and vegetation cover measures

Cite this dataset

Locke, Dexter; Ossola, Alessandro; Minor, Emily; Lin, Brenda (2021). Yards, block groups, and vegetation cover measures [Dataset]. Dryad.


Residential yards are a significant component of urban socio-ecological systems; residential land covers 11% of the United States and is often the dominant land use within urban areas. Residential yards also play an important role in the sustainability of urban socio-ecological systems, affecting biogeochemical cycles, water, and the climate via individual- and household-level behaviors. Vegetation, such as trees and grasses, is unevenly distributed across front and back yards, and we used aerial imagery to understand how similar yards are to their neighboring yards and neighborhoods. There are many ways to measure yard similarity, and we compared several measures to account for different definitions of ‘neighborness’. We examined the spatial autocorrelation of several yard vegetation characteristics in both front and back yards in Boston, MA, USA. Our study area included 1,027 Census block groups (sub-neighborhood areas) and 175,576 parcels with matched front-backyard pairings (n = 351,152 yards in total) across Boston’s metropolitan area. This data package contains 1) 351,152 spatially referenced yard polygons with five measures of vegetation summarized, 2) the containing block groups, and 3) an *.R script that replicates the analyses reported in Locke, D. H., Ossola, A., Minor, E., & Lin, B. B. (2021). Spatial contagion structures urban vegetation from parcel to landscape. People and Nature, 00, 1–15.
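As a rough illustration of what "spatial autocorrelation" means for a yard attribute, the sketch below computes a global Moran's I with binary neighbor weights. It is a minimal example under stated assumptions, not the analysis in the accompanying script: the function name `morans_i`, the toy canopy values, and the "adjacent yards are neighbors" rule are all hypothetical.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary (0/1) weights.

    values    -- list of numeric observations, one per yard
    neighbors -- dict mapping a yard index to the indices of its neighbors
                 (hypothetical neighbor definition; list both directions)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)

    weight_sum = 0      # total number of directed neighbor links
    cross = 0.0         # sum of deviation products over neighbor links
    for i, js in neighbors.items():
        for j in js:
            weight_sum += 1
            cross += dev[i] * dev[j]

    if weight_sum == 0 or denom == 0:
        raise ValueError("need at least one neighbor link and non-constant values")
    return (n / weight_sum) * (cross / denom)


# Five hypothetical yards along one street; each yard neighbors the adjacent ones.
street = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

gradient = morans_i([10, 20, 30, 40, 50], street)      # similar neighbors
alternating = morans_i([10, 50, 10, 50, 10], street)   # dissimilar neighbors
print(gradient, alternating)
```

Positive values indicate that neighboring yards resemble each other, and negative values indicate that they differ. Changing the `neighbors` mapping is one way to express a different definition of ‘neighborness’.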


1. Study Area
This study focused on the Boston, MA, metropolitan region (42°21′29″N 71°03′49″W), an area of approximately 703 km2. The region has a humid continental climate (mean annual temperature = 9.6 °C; mean annual precipitation = 1233 mm) (PRISM Climate Group 2015) and was historically covered with mesic forests. Forty-four percent of the land area is residential (Ossola et al., 2019a), which is consistent with other urban areas in western countries such as Baltimore, MD (Avolio et al., 2020), Chicago, IL (Lewis et al., 2019), Adelaide (Australia) (Ossola et al., 2021), Edinburgh (Scotland), Belfast (Northern Ireland), Cardiff (Wales), and Leicester and Oxford (England) (Loram et al., 2007), and represents more than twice as much land area as parks and open spaces (18.43%) (Ossola et al., 2019b). Backyards compose 14% of all urban land area and contain ~21% of all tree canopy cover; front yards cover ~8% of the area and have ~8% of the study area’s tree canopy cover (Ossola et al., 2019b).

2. Open Data
Classified LiDAR point cloud data (year 2014) were obtained from the US Geological Survey (“MA Post-Sandy CMPG 2013–14”, NPS = 0.7 m, vertical and horizontal accuracy = 0.05 m and 0.35 m, respectively). High-resolution RGB-NIR imagery (1 m ground resolution, year 2014) was obtained from the National Agriculture Imagery Program (NAIP, USDA). Residential parcel polygons, building footprints, and road centerline data were downloaded from the open data portals of the Commonwealth of Massachusetts (2017) and the City of Boston (2017).

3. Geospatial analyses
All front, corner, and backyards contained in all residential parcels with a house were located and classified in ArcGIS Desktop 10.5 (ESRI, Redlands, CA) by using the workflow described in Ossola and others (2019a, 2019b). Briefly, each house centroid was identified to fit an offset line perpendicular to the closest street centerline. Front and backyards were then located by splitting each parcel polygon with a dividing segment, perpendicular to the offset line, passing through the house centroid, and extending to the parcel’s border. Yards were classified by designating the unit closest to the respective road centerline as the front yard. Corner yards, which lack clear front/back sides, were assigned to all parcels located within 15 m of street intersections and were excluded from analyses. The workflow used to locate and classify yards exceeded 98% accuracy (Ossola et al., 2019a).
Vegetation maps detailing tree height, canopy volume, and tree and grass covers were modelled and validated for their accuracy based on the LiDAR and RGB-NIR imagery as detailed in previous papers (Ossola et al., 2019a, 2019b). Briefly, tree canopy height was extracted from a canopy height model (1.5 m ground resolution) interpolated from the LiDAR data in ArcGIS Desktop 10.5 (ESRI, Redlands, CA). Tree and grass covers were modelled at 1.5 m resolution by using maximum likelihood supervised classification of ~100,000 pixels manually attributed to one of three land cover classes (i.e., tree, grass and non-vegetated cover), and based on the tree canopy height map and the RGB-NIR imagery (Singh et al., 2012). The average vertical accuracy of the tree height data, as recorded by the LiDAR point cloud, is 5.3 cm. The accuracy of the grass and tree canopy cover classification is 91.7% and 98.9%, respectively (Ossola & Hopton, 2018a).
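The front/back split described above can be sketched with plain coordinate geometry. The original workflow split parcel polygons in ArcGIS; the point-wise version below only illustrates the idea, and the function names, the single straight road segment, and the coordinates are all hypothetical.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to point p."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a  # degenerate segment
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment ends
    return (ax + t * dx, ay + t * dy)


def offset_distance(centroid, road_a, road_b):
    """Distance from the house centroid to the closest point on the road."""
    foot = nearest_point_on_segment(centroid, road_a, road_b)
    return math.hypot(foot[0] - centroid[0], foot[1] - centroid[1])


def classify_point(p, centroid, road_a, road_b):
    """Label p as 'front' or 'back' relative to the dividing line.

    The dividing line passes through the house centroid and is perpendicular
    to the offset line (centroid -> road). Points on the road side of that
    line are 'front'. Points exactly on the line are labelled 'back' here,
    which is an arbitrary convention of this sketch.
    """
    foot = nearest_point_on_segment(centroid, road_a, road_b)
    ox, oy = foot[0] - centroid[0], foot[1] - centroid[1]
    side = (p[0] - centroid[0]) * ox + (p[1] - centroid[1]) * oy
    return "front" if side > 0 else "back"


# Hypothetical parcel: road runs along y = 0, house centroid sits 20 m back.
road_a, road_b = (0.0, 0.0), (100.0, 0.0)
house = (50.0, 20.0)

print(offset_distance(house, road_a, road_b))            # distance to the road
print(classify_point((50.0, 10.0), house, road_a, road_b))
print(classify_point((50.0, 35.0), house, road_a, road_b))
```

The sign of the dot product between the offset direction and the vector from the centroid to a point tells which side of the dividing line the point falls on, so no explicit line equation is needed.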
Canopy volume was calculated as the product of tree canopy cover and height within each pixel, assuming this volume to be completely occupied by vegetation (Ossola & Hopton, 2018a, 2018b), which overestimates total volume. Because these remotely sensed data view the earth from above, and tree canopy overhangs turf, the turf estimates are plausibly underestimates (Akbari et al., 2003). 
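The canopy volume calculation can be illustrated with a few pixels. This is a sketch under assumptions: each pixel is taken to be 1.5 m x 1.5 m, a tree pixel is treated as fully covered by canopy, and the per-area figure is assumed to be total volume divided by yard area. The function names and the pixel values are hypothetical.

```python
PIXEL_SIZE_M = 1.5  # ground resolution of the canopy height model


def canopy_volume_m3(heights_m, is_tree, pixel_size_m=PIXEL_SIZE_M):
    """Sum of (pixel area x canopy height) over pixels classified as tree.

    Treats the whole column under the canopy surface as vegetation,
    so it overestimates the true volume.
    """
    pixel_area = pixel_size_m ** 2
    return sum(h * pixel_area for h, tree in zip(heights_m, is_tree) if tree)


def percent_canopy_cover(is_tree):
    """Share of pixels classified as tree, in percent."""
    return 100.0 * sum(1 for tree in is_tree if tree) / len(is_tree)


# Hypothetical four-pixel yard: three tree pixels and one non-tree pixel.
heights = [10.0, 4.0, 0.0, 6.0]
tree_mask = [True, True, False, True]

volume = canopy_volume_m3(heights, tree_mask)
yard_area = len(heights) * PIXEL_SIZE_M ** 2
volume_per_area = volume / yard_area  # m3 of canopy per m2 of yard
cover = percent_canopy_cover(tree_mask)
print(volume, volume_per_area, cover)
```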


Akbari, H., Rose, L. S., & Taha, H. (2003). Analyzing the land cover of an urban environment using high-resolution orthophotos. Landscape and Urban Planning, 63(1), 1–14.
Avolio, M. L., Blanchette, A., Sonti, N. F., & Locke, D. H. (2020). Time Is Not Money: Income Is More Important Than Lifestage for Explaining Patterns of Residential Yard Plant Community Structure and Diversity in Baltimore. Frontiers in Ecology and Evolution, 8(April), 1–14.
Lewis, A. D., Bouman, M. J., Winter, A. M., Hasle, E. A., Stotz, D. F., Johnston, M. K., Klinger, K. R., Rosenthal, A., & Czarnecki, C. A. (2019). Does nature need cities? Pollinators reveal a role for cities in wildlife conservation. Frontiers in Ecology and Evolution, 7(JUN), 1–8.
Loram, A., Tratalos, J., Warren, P. H., & Gaston, K. J. (2007). Urban domestic gardens (X): The extent & structure of the resource in five major cities. Landscape Ecology, 22(4), 601–615.
Ossola, A., & Hopton, M. E. (2018a). Climate differentiates forest structure across a residential macrosystem. Science of the Total Environment, 639, 1164–1174.
Ossola, A., & Hopton, M. E. (2018b). Measuring urban tree loss dynamics across residential landscapes. Science of The Total Environment, 612, 940–949.
Ossola, A., Jenerette, G. D., McGrath, A., Chow, W., Hughes, L., & Leishman, M. R. (2021). Small vegetated patches greatly reduce urban surface temperature during a summer heatwave in Adelaide, Australia. Landscape and Urban Planning, 209.
Ossola, A., Locke, D. H., Lin, B., & Minor, E. (2019a). Greening in style: Urban form, architecture and the structure of front and backyard vegetation. Landscape and Urban Planning, 185(November 2018), 141–157.
Ossola, A., Locke, D. H., Lin, B., & Minor, E. S. (2019b). Yards increase forest connectivity in urban landscapes. Landscape Ecology, 7(12).
Singh, K. K., Vogler, J. B., Shoemaker, D. A., & Meentemeyer, R. K. (2012). LiDAR-Landsat data fusion for large-area assessment of urban land cover: Balancing spatial resolution, data volume and mapping accuracy. ISPRS Journal of Photogrammetry and Remote Sensing, 74(November), 110–121.

Usage notes

This package includes 1) a shapefile of yard polygons, and 2) an R Markdown file:

  1. bf_chm.shp is a polygon layer (n=360,846) containing yards' morphological characteristics. Each field is described below.
  2. Boston_residential_autocorrelation.Rmd is the R Markdown file used to perform all of the analyses. All of the paper's contents can be replicated with that file, which also supports additional exploratory work and analyses not included in the paper.

   YARD_ID is a unique identifier for each yard
   YARD describes whether a yard is in the "front" or "back" of a house in each parcel
   YARD_AREA is the yard area (m2) of each parcel part (i.e., within each YARD_ID)

   BACK_AREA is the back-yard area (m2)
   FRONT_AREA is the front-yard area (m2)

   canBACK is the percent canopy cover in the back-yard
   canFRONT is the percent canopy cover in the front-yard
   volBACK is the vegetation volume on a per area basis (m3/m2) in the back-yard
   volFRONT is the vegetation volume on a per area basis (m3/m2) in the front-yard
   AhgtBACK is the Average woody vegetation height in the back-yard
   AhgtFRONT is the Average woody vegetation height in the front-yard
   MhgtBACK is the Maximum woody vegetation height in the back-yard
   MhgtFRONT is the Maximum woody vegetation height in the front-yard
   turfBACK is the percent turf cover in the back-yard
   turfFRONT is the percent turf cover in the front-yard

   PARCEL_ID is a unique identifier for each parcel
   PARCELAREA is the total parcel area (m2)
   T_YARDAREA is the total yard area (m2) in each parcel
   BUILD_AREA is the area of all building footprints within a parcel (m2)

   TYPEHH is the type of residential household (1, 2 or 3 families) from parcel data

   Offset is the perpendicular distance of the house centroid from the closest road centerline
   Downtown is the distance from the house centroid to the City of Boston Town Hall Building (downtown)
   POINT_X is the horizontal coordinate (NAD_1983_2011_UTM_Zone_19N, WKID: 6348 Authority: EPSG)
   POINT_Y is the vertical coordinate (NAD_1983_2011_UTM_Zone_19N, WKID: 6348 Authority: EPSG)

   ID_CBG is a unique identifier for Census Block Groups
   NAME is the name of Census Block Groups
   THHBASE is the base number of households used in ESRI's Tapestry classification system of block groups
   TADULTBASE is the number of adults used in ESRI's Tapestry data
   Shape_Leng is the polygon perimeter length
   Shape_Le_1 is the polygon perimeter length (redundant with Shape_Leng)
   Shape_Area is the polygon area
   TSEGNUM is the Tapestry Segment Number
   TSEGCODE is the Tapestry Segment Code
   TSEGNAME is the Tapestry Segment Name
   TLIFECODE is the Tapestry LifeMode Code
   TLIFENAME is the Tapestry LifeMode Name
   TURBZCODE is the Tapestry Urbanization Code
   TURBZNAME is the Tapestry Urbanization Name
   MEDHINC_FY is the Median Household Income from the American Community Survey 2014 5-year estimate
   sample is a field used for random subsetting and cartography
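
Because each record describes one yard, analyses that compare front and back yards need the two records for a parcel matched up. The sketch below shows one way to do that with the PARCEL_ID, YARD_ID, and YARD fields listed above. It assumes the attribute table has already been read into a list of dictionaries and that YARD holds the lowercase strings "front" and "back"; the function name `pair_yards` and the sample records are hypothetical.

```python
from collections import defaultdict


def pair_yards(records):
    """Map each PARCEL_ID to its (front YARD_ID, back YARD_ID) pair.

    Parcels that lack either a front or a back record are dropped,
    so only matched pairs are returned.
    """
    by_parcel = defaultdict(dict)
    for rec in records:
        by_parcel[rec["PARCEL_ID"]][rec["YARD"]] = rec["YARD_ID"]
    return {
        parcel_id: (yards["front"], yards["back"])
        for parcel_id, yards in by_parcel.items()
        if "front" in yards and "back" in yards
    }


# Hypothetical records; real values come from the shapefile's attribute table.
records = [
    {"PARCEL_ID": "P1", "YARD_ID": "Y1", "YARD": "front"},
    {"PARCEL_ID": "P1", "YARD_ID": "Y2", "YARD": "back"},
    {"PARCEL_ID": "P2", "YARD_ID": "Y3", "YARD": "front"},  # no matching back yard
]

pairs = pair_yards(records)
print(pairs)
```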


National Science Foundation, Award: DBI-1639145