Developer & API May 6, 2026 · 8 min read

How to Scrape UK Property Listings (And Why You Probably Shouldn't)

A look at what a Python requests + BeautifulSoup scraper of UK property listings actually produces — and why we recommend using Homedata or another licensed property data provider instead.

Homedata Team · Updated

Homedata does not endorse scraping property portals.

This article exists because thousands of developers Google "how to scrape property listings" every month — and the honest answer is that scraping rarely works in practice and doesn't get you the data you actually need. For any production use case, use a licensed UK property data provider — Homedata, or another professional data provider.

The naive scraper — and why it usually doesn't work

Every developer trying to build a UK property tool eventually writes some version of this:

import requests
from bs4 import BeautifulSoup

url = "https://www.example-portal.co.uk/properties-for-sale/london"
headers = {"User-Agent": "Mozilla/5.0"}

resp = requests.get(url, headers=headers, timeout=10)
soup = BeautifulSoup(resp.text, "html.parser")

for card in soup.select(".property-card"):
    title = card.select_one("h2").get_text(strip=True)
    price = card.select_one(".price").get_text(strip=True)
    addr  = card.select_one("address").get_text(strip=True)
    print(price, "—", addr, "—", title)

On most UK property portals, this code will not return what you expect. Common outcomes from a single requests.get():

  • HTTP 403 / 429 — bot detection blocks the request before HTML is served.
  • An almost-empty HTML shell — listings are JavaScript-rendered, so requests never sees them. resp.text contains a loading spinner and no data.
  • A Cloudflare or DataDome challenge page — your scraper sees a CAPTCHA, not listings.
  • Geo-blocked or rate-limited — works once, fails on call ten.
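
The outcomes above can be told apart with a quick check on the status code and body. The sketch below is a rough diagnostic, not a complete detector: classify_response is a hypothetical helper, the challenge-page marker strings are assumptions (real challenge pages vary), and the "property-card" check simply reuses the placeholder class from the snippet above.

```python
def classify_response(status_code: int, html: str) -> str:
    """Label a fetched page with its most likely failure mode."""
    text = html.lower()

    # Bot detection or rate limiting: no listing HTML was served at all.
    if status_code in (403, 429):
        return "blocked"

    # Assumed marker strings for a challenge page; real ones differ by vendor.
    if "captcha" in text or "just a moment" in text:
        return "challenge"

    # JavaScript-rendered page: the markup exists but holds no listing cards.
    # "property-card" is the placeholder class from the naive scraper.
    if "property-card" not in text:
        return "empty_shell"

    return "listings_present"


if __name__ == "__main__":
    print(classify_response(429, ""))
    print(classify_response(200, "<html><div id='root'></div></html>"))
```

A label other than "listings_present" means the selectors in the naive scraper never had anything to match, which is why the loop silently prints nothing.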

Even when the basic request succeeds, the HTML structure changes regularly, the selectors break, and you're back to maintenance. This is the easy lesson. The hard one comes next.

What you actually get back

Look closely at a typical scraped listing. Here's what it really contains:

  • A marketing address — "Quiet residential street, SW18" or "Off Lavender Hill". Not a real, postable, normalised address.
  • A rounded asking price — sometimes "Offers in Excess of", sometimes "Guide Price", sometimes "POA".
  • Bed/bath counts — fine, but already available in any open listing feed.
  • Estate agent name and listing photos — both copyrighted, both off-limits for redistribution.
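
To make the price point concrete, here is a minimal sketch of what parsing a scraped price string involves. parse_asking_price is a hypothetical helper, and the qualifier list covers only the phrases named above; real listings use more variants than this.

```python
import re
from typing import Optional, Tuple

# Only the qualifiers mentioned in the list above (an incomplete set).
QUALIFIERS = ("Offers in Excess of", "Guide Price")


def parse_asking_price(raw: str) -> Tuple[Optional[str], Optional[int]]:
    """Split a scraped price string into (qualifier, amount in pounds)."""
    text = raw.strip()

    # "POA" (price on application) carries no number at all.
    if text.upper() == "POA":
        return ("POA", None)

    qualifier = None
    for candidate in QUALIFIERS:
        if text.lower().startswith(candidate.lower()):
            qualifier = candidate
            text = text[len(candidate):]
            break

    # Take the first pound amount, dropping thousands separators.
    match = re.search(r"£\s*(\d[\d,]*)", text)
    amount = int(match.group(1).replace(",", "")) if match else None
    return (qualifier, amount)


if __name__ == "__main__":
    for sample in ("Guide Price £875,000", "Offers in Excess of £450,000", "POA"):
        print(sample, "->", parse_asking_price(sample))
```

Even when the parse succeeds, the number is an asking figure with a qualifier attached, not a transaction price.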

What's not there:

  • The UPRN — the only stable identifier that links a property to Land Registry, EPC, planning, council tax, or flood-risk data. Portals strip UPRNs deliberately.
  • The full real address with house number, flat, and full postcode. You see the marketing string, not the property.
  • The last sold price, EPC rating, tenure, floor area, listing history, or whether this property has been listed before under a different agent.

In other words: even if your scraper survives the WAF, the IP rotation, and the legal exposure, the data you extract is marketing copy. It's not a property record. It can't be matched, joined, deduped, or trusted.
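
A quick way to see the gap is to test scraped strings for the two things a postable address needs: a house or flat number and a full postcode. The sketch below is a simplified heuristic under stated assumptions; the regular expressions are illustrative rather than a complete UK address validator, and looks_like_full_address is a hypothetical name.

```python
import re

# Full postcode = outward code + space + inward code, e.g. "SW18 4AA".
# Simplified pattern for illustration; it does not cover every valid format.
FULL_POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s+\d[A-Z]{2}\b")

# Starts with a house number, optionally preceded by "Flat N, ".
LEADING_NUMBER = re.compile(r"^\s*(flat\s+\w+,\s*)?\d+[A-Z]?\b", re.IGNORECASE)


def looks_like_full_address(text: str) -> bool:
    """True if the string has both a leading number and a full postcode."""
    has_postcode = FULL_POSTCODE.search(text.upper()) is not None
    has_number = LEADING_NUMBER.search(text) is not None
    return has_postcode and has_number


if __name__ == "__main__":
    for candidate in (
        "Quiet residential street, SW18",
        "Off Lavender Hill",
        "10 LAVENDER HILL, LONDON, SW18 4AA",
    ):
        print(candidate, "->", looks_like_full_address(candidate))
```

The marketing strings fail on both counts: "SW18" is only an outward code, and there is no property number to anchor a match.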

Why even "successful" scrapers fail at scale

Problem                        What it costs you
IP bans / CAPTCHAs             Need residential proxies (~£200–£2k/mo), still get blocked.
HTML structure changes         Selectors break weekly. Permanent maintenance burden.
JavaScript-rendered listings   Forces headless browsers: 50× slower, 100× more expensive.
No UPRN, no real address       Can't join to any official dataset. Marketing strings only.
No history                     Re-listings, agent changes, price reductions are invisible.
Portal ToS                     Every major UK portal forbids automated extraction; Homedata doesn't endorse working around that.

The data you literally cannot get without a licensed source

This is the part most scraping tutorials skip. The data that actually moves the needle in a property product — UPRN-matched listings, full normalised addresses, EPC, Land Registry sold price chains, listing-event history (Added → Reduced → Sold STC → Withdrawn), tenure, and floor area — is not exposed on portal pages. It's licensed. The portals don't sell it directly. The Land Registry sells partial pieces of it. Open-data projects don't have it.

There are essentially two ways to get it: pay a portal a substantial fee for a private feed, or use a licensed UK property data API.

The Homedata alternative — one call, real data

Homedata is a UK property data API covering 29 million properties. Every listing is matched to a real UPRN, every address is normalised, every price has its history, and every property carries EPC, environmental risk, and last-sold-price context. No scraping, no proxies, no CAPTCHAs, no legal exposure.

import requests

API_KEY = "your_homedata_api_key"

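# Real listings, real UPRNs, real addresses in one call
resp = requests.get(
    "https://api.homedata.co.uk/api/listings/search/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"postcode": "SW18", "type": "for-sale", "limit": 50},
    timeout=10,  # don't hang forever on a stalled connection
)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body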

for listing in resp.json()["results"]:
    print(
        listing["uprn"],
        listing["address_full"],
        listing["price"],
        listing["epc_current"],
        listing["last_sold_price"],
        listing["last_sold_date"],
    )

Sample response (a single listing, abridged):

{
  "uprn": 100023336956,
  "address_full": "10 LAVENDER HILL, LONDON, SW18 4AA",
  "postcode": "SW18 4AA",
  "price": 875000,
  "price_qualifier": "Guide Price",
  "bedrooms": 3,
  "property_type": "Terraced",
  "tenure": "Freehold",
  "epc_current": "C",
  "epc_potential": "B",
  "floor_area_m2": 98,
  "last_sold_price": 612000,
  "last_sold_date": "2018-06-14",
  "listing_history": [
    {"event": "Added",   "date": "2026-04-12"},
    {"event": "Reduced", "date": "2026-04-19", "price": 899950},
    {"event": "Reduced", "date": "2026-04-30", "price": 875000}
  ]
}
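
With fields like these on one record, comparisons that a scraper cannot support become a few lines of arithmetic. The following is a minimal sketch: the field names come from the sample response above, while summarise_listing and the keys it returns are hypothetical, and the input is trimmed to only the fields the function reads.

```python
from datetime import date


def summarise_listing(listing: dict) -> dict:
    """Derive a few comparison figures from one listing record."""
    asking = listing["price"]
    last_sold = listing["last_sold_price"]

    # Percentage change from the last sold price to the current asking price.
    uplift_pct = round((asking - last_sold) / last_sold * 100, 1)

    # Keep only history events that carry a price, oldest first.
    priced = sorted(
        (event for event in listing["listing_history"] if "price" in event),
        key=lambda event: date.fromisoformat(event["date"]),
    )
    # Drop between the earliest and latest recorded prices (0 if none recorded).
    reduction = priced[0]["price"] - priced[-1]["price"] if priced else 0

    return {
        "uplift_pct": uplift_pct,
        "reduction": reduction,
        "latest_price": priced[-1]["price"] if priced else asking,
    }


# Trimmed copy of the sample record: only the fields used above.
sample = {
    "price": 875000,
    "last_sold_price": 612000,
    "listing_history": [
        {"event": "Reduced", "date": "2026-04-30", "price": 875000},
        {"event": "Reduced", "date": "2026-04-19", "price": 899950},
    ],
}

if __name__ == "__main__":
    print(summarise_listing(sample))
```

Sorting by date rather than trusting array order keeps the result stable if events arrive newest-first.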

What Homedata gives you that no scraper can

  • Real UPRNs — the foundational identifier for every UK property dataset. Not stripped, not derived, not guessed.
  • Normalised addresses — house number, flat, full postcode, BS7666-conformant. Joinable to any OS, ONS, or government dataset.
  • Confirmed matches — every listing is reconciled against Land Registry, EPC, and our property characteristics graph. You're not guessing.
  • Listing history — Added, Reduced, Sold STC, Withdrawn, Re-listed events going back 30 years. Portals show you "today"; we show you the chain.
  • Sold price + EPC + tenure on every record — three calls collapse into one.
  • No IP bans, no proxy budget, no CAPTCHA solving, no legal exposure.
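
The practical value of the UPRN is that it works as a join key. Here is a rough sketch of that join on plain dictionaries: join_on_uprn is a hypothetical helper, the EPC row and the scraped record are made-up illustrative data, and only the UPRN value is taken from the sample response above.

```python
def join_on_uprn(listings: list, other_rows: list) -> list:
    """Inner-join two lists of records on their 'uprn' field."""
    rows_by_uprn = {row["uprn"]: row for row in other_rows}
    joined = []
    for listing in listings:
        # Records without a UPRN (e.g. scraped ones) have nothing to match on.
        match = rows_by_uprn.get(listing.get("uprn"))
        if match is not None:
            joined.append({**listing, **match})
    return joined


# Illustrative data only: one licensed-style record, one scraped-style record.
listings = [
    {"uprn": 100023336956, "price": 875000},
    {"address": "Quiet residential street, SW18", "price": 650000},
]
epc_rows = [{"uprn": 100023336956, "epc_current": "C"}]

joined = join_on_uprn(listings, epc_rows)

if __name__ == "__main__":
    print(joined)
```

The scraped-style record drops out of the result because a marketing string gives the join nothing to key on.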

Our recommendation

For any project where the data is going to be used commercially, shown to customers, sold to clients, or put in front of an investor — do not scrape. Use Homedata, or another professional UK property data provider. Pick the one that fits your stack and budget. Any of them will cost you less than a single legal letter, and you'll get genuinely matchable data instead of marketing strings.

Scraping a few public pages to learn HTTP and HTML for a personal project is a different conversation — but at the point where the output leaves your laptop, you should be on a licensed feed.

Try it

The free tier covers 100 calls/month, no credit card. Most hobby projects fit comfortably under it.
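
One way to stay inside a fixed monthly allowance is to cache repeated queries and count only real calls. The sketch below assumes you supply your own fetch function; BudgetedClient is a hypothetical wrapper, not part of any Homedata SDK, and it neither resets the counter each month nor persists the cache between runs.

```python
class BudgetedClient:
    """Cache results and refuse new calls once the allowance is used up."""

    def __init__(self, fetch, monthly_limit=100):
        self._fetch = fetch          # callable that performs the real request
        self._limit = monthly_limit
        self._cache = {}
        self.calls_made = 0

    def get(self, **params):
        # Parameter values must be hashable to serve as a cache key.
        key = tuple(sorted(params.items()))
        if key in self._cache:
            return self._cache[key]  # repeat query: costs nothing
        if self.calls_made >= self._limit:
            raise RuntimeError("monthly call budget exhausted")
        self.calls_made += 1         # count the attempt even if it fails
        result = self._fetch(**params)
        self._cache[key] = result
        return result


if __name__ == "__main__":
    # Stand-in fetch function so the example runs without network access.
    def fake_fetch(**params):
        return {"results": [], "query": params}

    client = BudgetedClient(fake_fetch, monthly_limit=100)
    client.get(postcode="SW18", type="for-sale")
    client.get(postcode="SW18", type="for-sale")  # served from cache
    print(client.calls_made)
```

In real use the fetch argument would wrap the requests.get call shown earlier, so identical searches during development do not eat into the allowance.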


FAQ

Does Homedata recommend scraping property portals?

No. Homedata does not endorse scraping. The output is unstable, the data is incomplete (no UPRN, no real address, no joined records), and licensed providers exist for a fraction of what scraping infrastructure costs to build and maintain.

Can I get UPRNs by scraping property listings?

No. Portals deliberately strip UPRNs from public listings. Without a UPRN you cannot reliably match a listing to Land Registry, EPC, planning, or council tax records. UPRN-resolved listings are only available through licensed providers.

What's the difference between a marketing address and a real address?

A marketing address is a string an estate agent writes, such as "Lovely Victorian terrace, SW18" or "Off Northcote Road". A real address is the BS7666-normalised representation with house number, building name, postcode, and a stable UPRN. You can join real addresses to government data; you cannot join marketing strings.

What is the alternative to scraping property listings?

Use a licensed UK property data provider. Homedata gives you UPRN-matched listings, real addresses, EPC, sold price history, and full listing event history through a single REST call. Several other professional providers cover similar ground — pick whichever fits your stack and budget, but don't scrape.