
Global database of cement production assets and upstream suppliers

Cite this dataset

Tkachenko, Nataliya et al. (2023). Global database of cement production assets and upstream suppliers [Dataset]. Dryad.


Cement producers and their investors are navigating evolving risks and opportunities as the sector's climate and sustainability implications become more prominent. While many companies now disclose greenhouse gas emissions, the majority from carbon-intensive industries appear to delegate emissions to less efficient suppliers. Recognizing this, we underscore the need for a globally consolidated asset-level dataset that accounts for the provenance of production inputs. Our approach not only consolidates data from established sources such as development banks and governments but also integrates the age of plants and the sourcing patterns of raw materials as two foundational variables of the asset-level data. These variables are instrumental in modeling cement production utilization rates, which in turn critically influence a company's greenhouse gas emissions. Our method combines geospatial computer vision and Large Language Modelling techniques to build a comprehensive picture of global cement production dynamics.

README: Global database of cement production assets and upstream suppliers

Description of the data

This database has been created as part of the GeoAsset programme under the Spatial Finance Initiative (UK Centre for Greening Finance and Investment/University of Oxford) in collaboration with Astraea Inc. For more information about this work, visit

Suggested citation: Tkachenko, N., Tang, K., McCarten, M., Reece, S., Kampmann, D., Hickey, C., Bayaraa, M., Foster, P., Layman, C., Rossi, C., Scott, K., Yoken, D., Christiaen, C. and Caldecott, B. (2023) Global database of cement production assets and upstream suppliers [Dataset]. Dryad.

Please contact to inform us about any errors, omissions or other feedback.

The dataset consists of two files: [1] SFI-Global-Cement-Database-assets.csv, [2] SFI-Global-Cement-Database-suppliers.csv

Layout and field descriptions of file [1]:

  1. uid: Unique identifier for the cement plant
  2. city: City in which the plant is located
  3. state: State or province in which the plant is located
  4. country: Country in which the plant is located
  5. iso3: Three-letter country code defined in ISO 3166-1 alpha-3
  6. country_code: Three-digit country code defined in ISO 3166-1 numeric
  7. region: Region in which the plant is located
  8. sub_region: Subregion in which the plant is located
  9. latitude: Latitude for the geolocation of the plant (based on WGS84 (EPSG:4326))
  10. longitude: Longitude for the geolocation of the plant (based on WGS84 (EPSG:4326))
  11. accuracy: The accuracy of the latitude and longitude
  12. status: Current plant operating status
  13. plant_type: The type of cement plant (Integrated or Grinding)
  14. production_type: The production process used to produce the clinker at Integrated plants (Wet or Dry)
  15. confdnc: Accuracy of production capacity (in cases where numerous values are reported)
  16. capacity: Total cement production capacity (millions of tons)
  17. capacity_source: Source used to obtain the capacity estimate (news media, company website or company disclosure reports)
  18. year: Year the plant started production
  19. owner_permid: PermID of the primary owner of the plant*
  20. owner_name: Name of the primary owner of the plant
  21. owner_source: Source reporting the ownership link between the plant and owner
  22. parent_permid: PermID of the ultimate parent of the owner of the plant*
  23. parent_name: Name of the ultimate parent of the owner of the plant
  24. ownership_stake: The percentage ownership attributed to the parent company if the plant is a joint venture. If the plant is majority owned by a single parent company then this column will be blank ('n/a')
  25. parent_lei: Legal Entity Identifier (LEI) of the ultimate parent of the owner of the plant
  26. parent_holding_status: The holding status of the ultimate parent (Private or Public)
  27. parent_ticker: The primary ticker for the ultimate parent, if the company is publicly traded
  28. parent_exchange: The primary exchange for the ultimate parent, if the company is publicly traded
  29. parent_permid_2: PermID of the 2nd ultimate parent of the owner of the plant*
  30. parent_name_2: Name of the 2nd ultimate parent of the owner of the plant
  31. ownership_stake_2: The percentage ownership attributed to the 2nd parent company if the plant is a joint venture
  32. parent_lei_2: Legal Entity Identifier (LEI) of the 2nd ultimate parent
  33. parent_holding_status_2: The holding status of the 2nd ultimate parent (Private or Public)
  34. parent_ticker_2: The primary ticker for the 2nd ultimate parent, if the company is publicly traded
  35. parent_exchange_2: The primary exchange for the 2nd ultimate parent, if the company is publicly traded
  36. sourcing: Locally sourced, imported or hybrid supply of input production materials
  37. raw_mtrl: Typology of raw input materials (limestone, clay, gypsum, sand, coal)
  38. clinker: Whether clinker was used as an input material
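
As a minimal sketch of how file [1] might be read, the snippet below parses a few rows with Python's standard `csv` module, maps the 'n/a' placeholder to `None`, casts the numeric fields, and sums capacity by `iso3`. The sample rows, the `PLANT000x` identifiers and the helper names (`clean_row`, `load_assets`, `capacity_by_country`) are hypothetical and only a subset of the 38 columns is shown; it also assumes numeric cells parse as plain decimal numbers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows (not real records); only a few columns are shown.
SAMPLE = """uid,country,iso3,latitude,longitude,plant_type,capacity,year
PLANT0001,Exampleland,EXA,10.5,20.25,Integrated,1.2,1998
PLANT0002,Exampleland,EXA,11.0,21.0,Grinding,n/a,n/a
PLANT0003,Otherland,OTH,-5.0,30.0,Integrated,0.8,2005
"""

FLOAT_FIELDS = ("latitude", "longitude", "capacity")
INT_FIELDS = ("year",)


def clean_row(row):
    """Map 'n/a' and empty cells to None and cast the numeric fields."""
    out = {}
    for key, value in row.items():
        text = (value or "").strip()
        if text in ("", "n/a"):
            out[key] = None
        elif key in FLOAT_FIELDS:
            out[key] = float(text)
        elif key in INT_FIELDS:
            # Assumption: years are whole numbers, possibly written as "1998.0".
            out[key] = int(float(text))
        else:
            out[key] = text
    return out


def load_assets(handle):
    """Read every row from an open CSV handle into a list of cleaned dicts."""
    return [clean_row(row) for row in csv.DictReader(handle)]


def capacity_by_country(assets):
    """Sum reported capacity (millions of tons) per iso3, skipping missing values."""
    totals = defaultdict(float)
    for asset in assets:
        if asset["capacity"] is not None:
            totals[asset["iso3"]] += asset["capacity"]
    return dict(totals)


assets = load_assets(io.StringIO(SAMPLE))
totals = capacity_by_country(assets)
```

For the real file, the same `load_assets` function could be given a handle from `open("SFI-Global-Cement-Database-assets.csv", newline="")`; the file encoding is not stated in this README, so it may need to be set explicitly.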

Layout and field descriptions of file [2]:

  1. uid: Unique identifier for the cement plant
  2. Country in which the plant is located
  3. Country in which facility-supplier (mine) is located
  4. supplier.latitude: Latitude for the geolocation of the facility-supplier (based on WGS84 (EPSG:4326))
  5. supplier.longitude: Longitude for the geolocation of the facility-supplier (based on WGS84 (EPSG:4326))
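
Because both files share `uid` and report WGS84 coordinates, one possible use is to estimate the straight-line distance between a plant and each of its suppliers. The sketch below uses the haversine formula on a spherical Earth, which is an approximation chosen for illustration and not a method described in this README. The sample coordinates, the `plants` mapping and the helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def supplier_distances(plants, suppliers):
    """Return (uid, distance_km) for each supplier row with usable coordinates.

    plants:    dict mapping uid -> (latitude, longitude) taken from file [1]
    suppliers: iterable of dict rows from file [2]
    """
    results = []
    for row in suppliers:
        uid = row["uid"]
        lat, lon = row["supplier.latitude"], row["supplier.longitude"]
        # Skip rows with the 'n/a' placeholder or with no matching plant.
        if uid not in plants or "n/a" in (lat, lon):
            continue
        plant_lat, plant_lon = plants[uid]
        results.append((uid, haversine_km(plant_lat, plant_lon, float(lat), float(lon))))
    return results


# Hypothetical example data (not real records).
plants = {"PLANT0001": (10.0, 20.0)}
suppliers = [
    {"uid": "PLANT0001", "supplier.latitude": "10.0", "supplier.longitude": "21.0"},
    {"uid": "PLANT0001", "supplier.latitude": "n/a", "supplier.longitude": "n/a"},
    {"uid": "PLANT9999", "supplier.latitude": "0.0", "supplier.longitude": "0.0"},
]
distances = supplier_distances(plants, suppliers)
```

Straight-line distance ignores roads, ports and terrain, so it is at best a rough proxy for actual transport distance.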

*PermID is a unique open source identifier for entities, which is provided by Refinitiv. For more details on the open source PermID database please visit

Note #1: the 2nd ultimate parent information is only provided if the plant is a joint venture, otherwise no 2nd ultimate parent information is provided.

Note #2: where it was not possible to identify relevant entries, datacells are saved as 'n/a'.
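
The two notes above suggest one way to attribute a plant's capacity to its ultimate parents. The following is a sketch under stated assumptions, not a procedure prescribed by the dataset authors: it assumes `ownership_stake` and `ownership_stake_2` hold a percentage written as a number (optionally followed by `%`), and that a plant with no reported stake is attributed entirely to its first parent. The function names and sample rows are hypothetical.

```python
def is_missing(value):
    """True for None, empty strings and the 'n/a' placeholder."""
    return value is None or str(value).strip() in ("", "n/a")


def parse_stake(value):
    """Return an ownership stake as a fraction, or None if not reported.

    Assumption: stakes are percentages such as "60" or "60%".
    """
    if is_missing(value):
        return None
    return float(str(value).strip().rstrip("%").strip()) / 100.0


def attribute_capacity(row):
    """Split a plant's capacity between parent_name and parent_name_2."""
    if is_missing(row.get("capacity")):
        return {}
    capacity = float(row["capacity"])
    shares = {}

    first = parse_stake(row.get("ownership_stake"))
    if not is_missing(row.get("parent_name")):
        # Assumption: no reported stake means the first parent holds the whole plant.
        shares[row["parent_name"]] = capacity * (first if first is not None else 1.0)

    second = parse_stake(row.get("ownership_stake_2"))
    if not is_missing(row.get("parent_name_2")) and second is not None:
        shares[row["parent_name_2"]] = capacity * second
    return shares


# Hypothetical rows (not real records).
single_owner = {
    "capacity": "2.0", "parent_name": "A Corp", "ownership_stake": "n/a",
    "parent_name_2": "n/a", "ownership_stake_2": "n/a",
}
joint_venture = {
    "capacity": "2.0", "parent_name": "A Corp", "ownership_stake": "60",
    "parent_name_2": "B Ltd", "ownership_stake_2": "40",
}
single_shares = attribute_capacity(single_owner)
jv_shares = attribute_capacity(joint_venture)
```

If the stakes in a joint-venture row do not sum to 100, the attributed shares will not sum to the plant capacity; whether to rescale in that case is a modelling choice left to the user.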

Sharing/Access information

Link to related publicly accessible dataset:


Children's Investment Fund Foundation