Google Research’s “Open Buildings” is an open, large-scale geospatial dataset that maps where buildings are (and what their footprints look like) across much of the Global South. Practically, it gives you machine-generated building outlines (polygons) and centroids (points) derived from satellite imagery, plus a confidence score per detection so you can choose how strict you want to be about “likely buildings” versus “maybe buildings”. It is intended for planning and “social good” applications—especially where authoritative building registers, address systems, or recent cadastral data are incomplete or unavailable. (sites.research.google)
What it is (in plain terms)
Think of Open Buildings as a massive “building layer” you can overlay on a map—similar in spirit to building footprints in OpenStreetMap, but produced at very large scale using deep learning on satellite imagery.
The core Open Buildings dataset (currently Version 3) contains about 1.8 billion building detections over an inference area of roughly 58 million km², spanning Africa, South Asia, Southeast Asia, Latin America and the Caribbean. (sites.research.google)
For each building, the dataset provides:
- A footprint polygon (the outline of the building as a geometry)
- A confidence score (how sure the model is that this detection is a building)
- A Plus Code for the building’s centroid (a short digital location code that can function like an address where street addressing is weak)
- A few derived attributes like area and centroid coordinates (sites.research.google)
Just as importantly, it does not provide:
- building names, land parcel boundaries, ownership, use class, street address, or occupancy type
In short, it is geometry, a confidence score, a location code, and a few derived attributes, nothing more. (sites.research.google)
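To make the Plus Code field less abstract, here is a minimal sketch of how a 10-digit code can be derived from a centroid. It is a simplified reading of the Open Location Code scheme (no short codes, no extra grid-refinement digits) and is not the official library; the function name `encode_plus_code` and the test coordinate are illustrative choices, not part of the dataset.

```python
# Simplified 10-digit Plus Code (Open Location Code) encoder, for illustration only.
ALPHABET = "23456789CFGHJMPQRVWX"  # the 20 symbols used by Open Location Code


def encode_plus_code(lat: float, lng: float) -> str:
    """Encode a point as a full 10-digit Plus Code (a cell of 1/8000 degree)."""
    # Clamp latitude just below the pole and wrap longitude into [-180, 180).
    lat = min(max(lat, -90.0), 90.0 - 1e-9)
    lng = (lng + 180.0) % 360.0 - 180.0

    # Work in integer units of 1/8000 degree, the size of the finest pair digit.
    lat_units = int((lat + 90.0) * 8000)
    lng_units = int((lng + 180.0) * 8000)

    pairs = []
    for _ in range(5):  # five lat/lng digit pairs, least significant first
        lat_units, lat_digit = divmod(lat_units, 20)
        lng_units, lng_digit = divmod(lng_units, 20)
        pairs.append(ALPHABET[lat_digit] + ALPHABET[lng_digit])

    code = "".join(reversed(pairs))
    return code[:8] + "+" + code[8:]


print(encode_plus_code(20.3700625, 2.7821875))  # prints 7FG49QCJ+2V
```

In practice you would read `full_plus_code` straight from the dataset; the sketch only shows why nearby buildings share a code prefix.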
What it can do (capabilities)
Because it is a consistent building footprint layer over huge regions, Open Buildings becomes a “base ingredient” you can combine with other datasets. Typical capabilities include:
- Settlement and population analytics
Building footprints are often used as a proxy for where people live, especially when census data is old or spatially coarse. You can estimate settlement extent, relative density, and (with care) support population models. (sites.research.google)
- Infrastructure and service planning
Utilities and public agencies can use building density patterns to prioritize electrification, water, sanitation, clinics, schools, and transport interventions, particularly in fast-growing peri-urban areas. (Google highlights real partner use in energy planning, humanitarian sampling, and urbanization analysis.) (sites.research.google)
- Humanitarian and disaster response
Overlay flood lines, wildfire perimeters, landslide susceptibility, or earthquake shakemaps with building footprints to estimate exposed structures, affected households, and response logistics. (sites.research.google)
- Environmental and climate work
Built-up footprint and settlement density help model human pressure on ecosystems, emissions, energy demand, and land-use change. (sites.research.google)
- Digital addressing support
Where formal addressing is limited, building centroids paired with Plus Codes can help structure service delivery databases and field operations. (sites.research.google)
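The hazard-overlay idea above can be sketched with nothing but the standard library: test each building centroid against a hazard polygon and count the ones inside. The centroids, confidence values, flood polygon, and the 0.75 cutoff below are all synthetic assumptions for illustration; a real workflow would use a GIS library and proper projected geometries.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test: is pt inside the ring of (lon, lat) vertices?"""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_exposed(buildings: List[Dict[str, float]], hazard: List[Point],
                  min_confidence: float) -> int:
    """Count centroids inside the hazard polygon at or above the cutoff."""
    return sum(
        1
        for b in buildings
        if b["confidence"] >= min_confidence
        and point_in_polygon((b["lon"], b["lat"]), hazard)
    )


# Synthetic centroids and a synthetic rectangular flood zone.
buildings = [
    {"lon": 36.80, "lat": -1.30, "confidence": 0.92},  # inside, confident
    {"lon": 36.82, "lat": -1.29, "confidence": 0.71},  # inside, below cutoff
    {"lon": 36.90, "lat": -1.25, "confidence": 0.88},  # outside the zone
]
flood_zone = [(36.78, -1.32), (36.85, -1.32), (36.85, -1.27), (36.78, -1.27)]

print(count_exposed(buildings, flood_zone, min_confidence=0.75))  # prints 1
```

Counting centroids rather than intersecting full footprints is a deliberate simplification: it is fast and usually adequate for screening, but it will miss buildings that only partly overlap the hazard area.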
A second “companion” product: the 2.5D Temporal Dataset (change + heights)
Alongside the footprint polygons, Google also publishes the Open Buildings 2.5D Temporal Dataset. This is different in two big ways:
- It is raster data (gridded pixels), not polygons.
- It provides annual snapshots from 2016–2023, including building presence, fractional building counts, and estimated building heights, across roughly the same broad Global South coverage. (sites.research.google)
This is designed for questions like: Where is the city expanding year by year? Where is vertical growth happening? What changed before vs after a disaster? (sites.research.google)
Data structure you should understand before using it
Open Buildings is enormous, so it’s distributed in a tiled way:
- Polygons and points are stored as spatially sharded, gzip-compressed CSV files, one per level-4 S2 cell (S2 is a hierarchical grid that partitions the Earth’s surface into nested cells). (sites.research.google)
- Each polygon row includes fields such as: latitude, longitude, area, confidence, WKT geometry (for polygons), and full_plus_code. (sites.research.google)
- There is also a score threshold table that suggests confidence cutoffs per tile to target ~80/85/90% precision—useful because model performance varies by geography and settlement form. (sites.research.google)
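As a rough illustration of how the tile rows and the threshold table fit together, the sketch below reads a tiny synthetic tile and applies that tile's cutoff for a chosen precision target. The column names (`area_in_meters`, `geometry`, `s2_token`, `confidence_threshold_90%_precision`, and so on) are assumptions about the published headers and should be checked against the real files; `tile_a` and `tile_b` are placeholders, not real S2 tokens, and all values are made up.

```python
import csv
import io

TILE_TOKEN = "tile_a"  # placeholder, not a real S2 token

# Synthetic stand-in for one tile's polygon CSV (assumed column names).
TILE_CSV = """latitude,longitude,area_in_meters,confidence,geometry,full_plus_code
-1.2921,36.8219,495.6,0.91,"POLYGON((36.8218 -1.2922, 36.8220 -1.2922, 36.8220 -1.2920, 36.8218 -1.2920, 36.8218 -1.2922))",6GCRPR5C+5Q
-1.2926,36.8226,123.9,0.78,"POLYGON((36.8225 -1.2927, 36.8226 -1.2927, 36.8226 -1.2926, 36.8225 -1.2926, 36.8225 -1.2927))",6GCRPR4F+X2
-1.2931,36.8231,123.9,0.66,"POLYGON((36.8230 -1.2932, 36.8231 -1.2932, 36.8231 -1.2931, 36.8230 -1.2931, 36.8230 -1.2932))",6GCRPR4F+Q6
"""

# Synthetic stand-in for the score threshold table (assumed column names).
THRESHOLDS_CSV = """s2_token,confidence_threshold_80%_precision,confidence_threshold_85%_precision,confidence_threshold_90%_precision
tile_a,0.70,0.74,0.80
tile_b,0.68,0.72,0.77
"""


def load_thresholds(text: str, precision: int = 90) -> dict:
    """Map each tile token to its confidence cutoff for the given precision."""
    column = f"confidence_threshold_{precision}%_precision"
    return {
        row["s2_token"]: float(row[column])
        for row in csv.DictReader(io.StringIO(text))
    }


def filter_tile(text: str, min_confidence: float) -> list:
    """Keep only rows whose confidence is at or above the cutoff."""
    return [
        row
        for row in csv.DictReader(io.StringIO(text))
        if float(row["confidence"]) >= min_confidence
    ]


cutoff = load_thresholds(THRESHOLDS_CSV, precision=90)[TILE_TOKEN]
kept = filter_tile(TILE_CSV, cutoff)
print(cutoff, [row["full_plus_code"] for row in kept])
```

Looking the cutoff up per tile, rather than hard-coding one number, is what lets the same pipeline behave sensibly in both dense urban and sparse rural tiles.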
Accuracy, limitations, and professional cautions
This dataset is powerful, but it’s not ground truth.
Key limitations Google explicitly notes include:
- False positives and missed buildings (commission and omission errors)
- Challenging contexts: dense contiguous roofs, arid geology that looks “built”, very small structures, and high-rise roof shift effects
- Offset/misalignment between the imagery used for detection and the basemap imagery you might be viewing today
- Freshness varies by location depending on availability of high-resolution source imagery
- Sensitive areas (including some conflict zones) are omitted to protect at-risk populations (sites.research.google)
For professional use, treat Open Buildings as:
- excellent for strategic planning, screening, prioritization, and analysis at scale
- not suitable as a sole source for legal boundaries, compliance, cadastral decisions, or property-level enforcement
Licensing and compliance
Open Buildings v3 is offered under two licenses—you can choose either:
- CC BY 4.0, or
- ODbL v1.0 (useful for compatibility with OpenStreetMap workflows)
Google’s rationale is to let both OSM-aligned and non-OSM users adopt the data without license incompatibility headaches. (sites.research.google)
How to access and use it (practical “How To” paths)
Path A — Quick exploration (no heavy downloading)
Best when you want to see what’s there and do light analysis.
- Get access to Google Earth Engine
Earth Engine is the easiest way to explore this dataset without moving hundreds of gigabytes around. Google’s Earth Engine catalog page for Open Buildings includes dataset IDs and an “Open in Code Editor” link. (Google for Developers)
- Load the dataset in Earth Engine
For polygons, you’ll use the FeatureCollection for Open Buildings v3 polygons; for temporal change and heights, you’ll use the ImageCollection for the 2.5D Temporal dataset. (Google for Developers)
- Filter by confidence
A common professional workflow is to start conservative (higher confidence) to reduce false positives, then relax thresholds if you are under-detecting in rural/informal contexts.
- Overlay with your context layers
Bring in admin boundaries, hazard zones, service areas, or project polygons (AOIs), and compute counts/areas/exposure.
Why this path works: you get planetary-scale processing, and you can export only the subset you need.
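The Path A steps could look roughly like this with the Earth Engine Python client. This is a sketch under several assumptions: the asset ID and the `confidence` property name are taken from the Earth Engine catalog as I understand it and should be verified there, the 0.75 default is an arbitrary starting point, and the call only works once the `earthengine-api` package is installed, authenticated, and tied to a Cloud project.

```python
# Assumed catalog ID for the v3 polygons; confirm on the Earth Engine catalog page.
POLYGONS_ASSET = "GOOGLE/Research/open-buildings/v3/polygons"


def count_buildings(west: float, south: float, east: float, north: float,
                    min_confidence: float = 0.75) -> int:
    """Count detections in a lon/lat bounding box at or above a confidence cutoff."""
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")

    import ee  # imported lazily: requires the earthengine-api package

    # Assumes credentials and a default project are already configured.
    ee.Initialize()

    aoi = ee.Geometry.Rectangle([west, south, east, north])
    buildings = (
        ee.FeatureCollection(POLYGONS_ASSET)
        .filterBounds(aoi)                                    # spatial filter
        .filter(ee.Filter.gte("confidence", min_confidence))  # attribute filter
    )
    # size() is evaluated server-side; getInfo() fetches the resulting number.
    return buildings.size().getInfo()
```

A call such as `count_buildings(36.78, -1.32, 36.85, -1.27)` would return the number of detections in that box; rerunning it with a lower `min_confidence` is a quick way to see how sensitive your counts are to the threshold.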
Path B — Download for GIS (QGIS/ArcGIS) by country/region
Best when you want to use it as a layer in desktop GIS and your AOI isn’t enormous.
- Choose your area of interest
Decide whether you need polygons (outlines) or points (centroids). Polygons are richer but heavier.
- Use the official “download by region” notebook
The Open Buildings site links a Colab notebook showing how to download data for a specific country/region. (sites.research.google)
- Understand the file format before you import
Downloads are CSV with geometries stored as WKT (POLYGON/MULTIPOLYGON). (sites.research.google)
In QGIS, you’ll typically import the file as delimited text (point coordinates for centroids, or the WKT column for polygons), or convert it with processing tools or a script first, then save the result as a GeoPackage.
- Apply a confidence threshold (recommended)
Use either a global threshold you choose (simple) or the per-tile recommended thresholds (more rigorous) to target 80–90% precision. (sites.research.google)
- Clip to your AOI and build indexes
Once imported, clip to your project boundary and build spatial indexes; the result is a performant building layer for analysis.
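If you prefer to convert before importing, the CSV+WKT step can be sketched with the standard library alone by turning each row into a GeoJSON feature, which desktop GIS tools open directly. The column names are assumed, the sample rows are synthetic, and the tiny WKT reader handles only plain 2D POLYGON and MULTIPOLYGON text; a production script would more likely rely on a geometry library.

```python
import csv
import io
import json
import re

NUM = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
PAIR = re.compile(rf"({NUM})\s+({NUM})")
GEOJSON_TYPES = {"POLYGON": "Polygon", "MULTIPOLYGON": "MultiPolygon"}


def wkt_to_geometry(wkt: str) -> dict:
    """Convert 2D POLYGON/MULTIPOLYGON WKT into a GeoJSON geometry dict."""
    kind, _, body = wkt.strip().partition("(")
    kind = kind.strip().upper()
    if kind not in GEOJSON_TYPES:
        raise ValueError(f"unsupported WKT type: {kind!r}")
    # Rewrite "x y" pairs as JSON arrays, then parentheses as brackets.
    as_json = PAIR.sub(r"[\1, \2]", "(" + body)
    as_json = as_json.replace("(", "[").replace(")", "]")
    return {"type": GEOJSON_TYPES[kind], "coordinates": json.loads(as_json)}


def rows_to_feature_collection(csv_text: str, min_confidence: float = 0.0) -> dict:
    """Build a GeoJSON FeatureCollection from CSV rows that pass the cutoff."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        confidence = float(row["confidence"])
        if confidence < min_confidence:
            continue
        features.append({
            "type": "Feature",
            "geometry": wkt_to_geometry(row["geometry"]),
            "properties": {
                "confidence": confidence,
                "area_in_meters": float(row["area_in_meters"]),
                "full_plus_code": row["full_plus_code"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Synthetic rows: one POLYGON and one MULTIPOLYGON (assumed column names).
SAMPLE_CSV = """latitude,longitude,area_in_meters,confidence,geometry,full_plus_code
-1.2921,36.8219,495.6,0.91,"POLYGON((36.8218 -1.2922, 36.8220 -1.2922, 36.8220 -1.2920, 36.8218 -1.2920, 36.8218 -1.2922))",6GCRPR5C+5Q
-1.2926,36.8226,247.8,0.72,"MULTIPOLYGON(((36.8225 -1.2927, 36.8226 -1.2927, 36.8226 -1.2926, 36.8225 -1.2926, 36.8225 -1.2927)),((36.8226 -1.2926, 36.8227 -1.2926, 36.8227 -1.2925, 36.8226 -1.2925, 36.8226 -1.2926)))",6GCRPR4F+X2
"""

collection = rows_to_feature_collection(SAMPLE_CSV)
print(len(collection["features"]), "features,", len(json.dumps(collection)), "bytes")
```

Writing `json.dumps(collection)` to a `.geojson` file gives you something QGIS or ArcGIS can load as a vector layer, which you can then save as GeoPackage and clip to your AOI.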
Path C — Bulk / national scale (cloud-first workflow)
Best when you’re doing country-scale analytics, national dashboards, or repeatable pipelines.
- Use Google Cloud Storage + gsutil
Google provides a direct cloud bucket containing the full dataset. The Open Buildings site lists the bucket paths and total sizes (polygons are large—on the order of hundreds of GB). (sites.research.google)
Example commands shown on the site (for full copies):
gsutil cp -R gs://open-buildings-data/v3/polygons_s2_level_4_gzip .
gsutil cp -R gs://open-buildings-data/v3/points_s2_level_4_gzip .
gsutil cp gs://open-buildings-data/v3/score_thresholds_s2_level_4.csv .
- Process “tile-first”
Because the data is sharded by S2 tile, your pipeline should download only the tiles intersecting your AOI, filter by confidence, convert CSV+WKT into an analysis-ready geospatial format (e.g., GeoParquet, FlatGeobuf, or GeoPackage, depending on your stack), and then run analytics.
- Join with other national layers
This is where Open Buildings shines: combine it with electrification networks, health facility catchments, flood return period maps, school locations, etc., to produce decision-ready indicators.
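For the tile-first step, a streaming filter keeps memory use flat no matter how large a tile is. The sketch below reads a gzipped CSV row by row and keeps only centroids inside a bounding box that also pass a confidence cutoff. The column names follow the assumed points schema, the data is synthetic and compressed in memory purely so the example runs on its own, and working out which S2 tiles intersect your AOI is left out because it needs an S2 library.

```python
import csv
import gzip
import io
from typing import Iterable, Iterator, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


def filter_rows(lines: Iterable[str], bbox: BBox,
                min_confidence: float) -> Iterator[dict]:
    """Yield rows whose centroid lies in bbox and whose confidence passes."""
    west, south, east, north = bbox
    for row in csv.DictReader(lines):
        lon, lat = float(row["longitude"]), float(row["latitude"])
        if not (west <= lon <= east and south <= lat <= north):
            continue
        if float(row["confidence"]) >= min_confidence:
            yield row


# Synthetic stand-in for one gzipped points tile (assumed column names).
raw = (
    "latitude,longitude,area_in_meters,confidence,full_plus_code\n"
    "-1.2921,36.8219,495.6,0.91,6GCRPR5C+5Q\n"   # in bbox, confident
    "-1.2926,36.8226,247.8,0.70,6GCRPR4F+X2\n"   # in bbox, below cutoff
    "-1.3501,36.9001,310.0,0.95,6GCRJWX2+X2\n"   # outside bbox
)
tile_gz = gzip.compress(raw.encode("utf-8"))

# With a real download, pass the file path to gzip.open instead of BytesIO.
with gzip.open(io.BytesIO(tile_gz), "rt", encoding="utf-8", newline="") as f:
    kept = list(filter_rows(f, bbox=(36.80, -1.30, 36.83, -1.28),
                            min_confidence=0.75))

print(len(kept))  # prints 1
```

Because `filter_rows` is a generator, you can pipe its output straight into a writer for your target format instead of collecting rows in a list as done here.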
When to use the Temporal (2.5D) dataset instead of polygons
Use the 2.5D Temporal dataset when your question is about:
- change over time (annual growth patterns, new development fronts)
- approximate height signals (relative vertical intensity)
- tracking settlement evolution in places where high-res imagery may be stale
It can be explored via an Earth Engine app/script and can also be downloaded from Google Cloud Storage for AOIs and timeframes. (sites.research.google)
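The year-over-year questions above boil down to comparing two rasters cell by cell. Here is a minimal sketch on small synthetic grids standing in for building-presence values in two years; the 0.5 threshold is an arbitrary assumption rather than a recommended value, and real work would use array or raster libraries on the actual bands.

```python
from typing import List

Grid = List[List[float]]


def new_builtup_mask(early: Grid, late: Grid, threshold: float = 0.5) -> List[List[bool]]:
    """True where a cell is below threshold early and at/above it later."""
    return [
        [before < threshold <= after for before, after in zip(row_e, row_l)]
        for row_e, row_l in zip(early, late)
    ]


# Synthetic 3x3 presence grids for two years (values are made up).
presence_2016 = [
    [0.1, 0.2, 0.9],
    [0.0, 0.4, 0.8],
    [0.0, 0.1, 0.3],
]
presence_2023 = [
    [0.1, 0.7, 0.9],
    [0.6, 0.8, 0.9],
    [0.0, 0.2, 0.55],
]

mask = new_builtup_mask(presence_2016, presence_2023)
new_cells = sum(cell for row in mask for cell in row)
print(new_cells)  # prints 4
```

The same pattern extends to height: subtracting an early height grid from a later one highlights where vertical growth, rather than outward expansion, dominates.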
Bottom line
Open Buildings is best understood as a foundational “built footprint layer” for the Global South—massive coverage, consistent structure, and designed to be combined with other datasets for planning and analysis. If you adopt it professionally, the winning approach is: start in Earth Engine to validate coverage and confidence behaviour in your region, then export or download only what you need, and always document your thresholds and limitations alongside results. (sites.research.google)
For architects, engineers, and planners in Africa, Open Buildings is more than a dataset—it’s a practical shortcut to “first truth” in contexts where records are fragmented, growth is fast, and the ground reality changes quicker than official maps. A consistent building-footprint layer allows teams to begin projects with a credible picture of settlement extent, density, and development direction, even before detailed surveys are funded. This is especially valuable in peri-urban expansion zones, secondary cities, and informal areas where the biggest risks are not only design risks, but information risks: wrong demand assumptions, mis-sized infrastructure, misaligned routes, and under-planned services.
Used well, Open Buildings becomes a shared reference layer across disciplines. Architects can use it to read urban grain and block morphology, test housing and facility placement, and support precinct briefs with real settlement patterns rather than approximations. Engineers can rapidly estimate service demand proxies (household counts, roof area, density gradients), prioritize network upgrades, and screen corridors for water, sewer, power, and roads against actual built-up footprints. Planners can map growth fronts, identify underserved clusters, compare settlement change over time (especially using the temporal dataset), and build evidence-led phasing strategies for municipal capital investment.
In the African context, the highest value often comes from combining Open Buildings with local knowledge and municipal layers—roads, topography, floodplains, land-use schemes, substations, pipe networks, clinic/school locations—so that decisions become both faster and more defensible. The key professional stance is to treat Open Buildings as an analytical basemap (excellent for prioritisation and early-stage planning), not as a legal cadastre or final authority at property scale. Used with that clarity, it can materially improve feasibility studies, masterplans, infrastructure sizing, climate-risk screening, and post-disaster assessments—helping teams shift from reactive service delivery to proactive, data-backed urban management across African cities and regions.

