
Signals & Data

THE DEAL ENGINE doesn't guess. It reads. Every property in America leaves a paper trail — who owns it, what they paid, what they owe, whether they're behind. Public records expose most of it for free. The edge isn't access; it's knowing which signals predict a sale, at what price, and stacking them into a score.

This doc covers two things: the signal catalog (what we mine and what each thing tells us) and the data ladder (free first, paid only once a market earns it).

See also: the-deal-engine · the-intelligence-layer · the-plays · kpis-and-reports · following-the-money · Research Pipeline


The four questions

Every signal answers one of four questions. Stack the answers and a deal falls out.

The signal stack

01. WHERE to hunt: Market signals. Pick growing, cashflowing metros before you ever look at a house.
02. WHAT it is: Property signals. Physicals, value, and the rent-vs-price math.
03. WHO'LL sell: Owner/motivation signals. The human reason a deal exists.
04. WHO'LL sell CHEAP: Distress signals. The pressure that prices below market.

Fig. 1: Each layer narrows the funnel. Alt data sharpens every layer.

MARKET signals — WHERE to hunt

You can't out-underwrite a dying market. Pick the metro first.

| Signal | What it tells us | Read |
| --- | --- | --- |
| Job & population growth | Demand for housing is rising | Up = tailwind |
| In-migration (USPS change-of-address, Census) | People physically moving in | Net inflow = pricing power |
| Rent-to-price ratio | How much cashflow a dollar of price buys | Higher = better yield |
| Building-permit supply | New units coming online | Oversupply caps rent growth |
| Median income | Affordability ceiling & rent durability | Rising income = sticky rent |
| Path-of-progress | New transit, rezoning, planned development | Buy ahead of the wave |

The point: a B-property in an A-trajectory market beats an A-property in a flat one. Market signals come first because they're the cheapest mistake to avoid.
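
A minimal sketch of the rent-to-price read, assuming the ratio is monthly rent divided by purchase price. The metro names and medians are made up for illustration, and `rent_to_price` is a hypothetical helper, not part of any existing pipeline.

```python
def rent_to_price(monthly_rent: float, price: float) -> float:
    """Monthly rent as a fraction of purchase price (higher = better yield)."""
    if price <= 0:
        raise ValueError("price must be positive")
    return monthly_rent / price


# Hypothetical metros with made-up medians, for illustration only.
metros = {
    "Metro A": {"median_rent": 1500.0, "median_price": 200_000.0},
    "Metro B": {"median_rent": 2400.0, "median_price": 600_000.0},
}

# Compute the ratio per metro, then rank best yield first.
ratios = {
    name: rent_to_price(m["median_rent"], m["median_price"])
    for name, m in metros.items()
}
ranked = sorted(ratios, key=ratios.get, reverse=True)

for name in ranked:
    print(f"{name}: {ratios[name]:.2%}")
```

Metro B has the higher rent but the lower ratio, which is the point of the signal: yield per dollar of price, not headline rent.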

PROPERTY signals — WHAT it is

Once the market's chosen, score the asset.

| Signal | What it tells us |
| --- | --- |
| Assessed vs market value | Tax basis vs reality; gap hints at mispricing |
| Last-sale date & price | Cost basis, hold length, likely equity |
| Beds / baths / sqft | Comparability & unit economics |
| Condition (age, permit history) | Reno scope and risk |
| Rent estimate vs price | The cashflow verdict: does it pencil? |
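
Two of these reads reduce to simple arithmetic. The sketch below assumes the gap is measured against the market estimate and that hold length is counted from the last-sale date; both function names and all figures are hypothetical. Assessment ratios vary by jurisdiction, so a raw gap only compares cleanly within one county.

```python
from datetime import date


def value_gap(assessed_value: float, market_estimate: float) -> float:
    """Share of the market estimate not reflected in the assessed value."""
    if market_estimate <= 0:
        raise ValueError("market_estimate must be positive")
    return (market_estimate - assessed_value) / market_estimate


def years_held(last_sale: date, today: date) -> float:
    """Approximate hold length in years since the last recorded sale."""
    return (today - last_sale).days / 365.25


# Made-up numbers for one hypothetical parcel.
gap = value_gap(assessed_value=180_000.0, market_estimate=240_000.0)
held = years_held(date(2009, 6, 1), date(2024, 6, 1))

print(f"gap={gap:.0%}, held={held:.1f} years")
```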

OWNER / MOTIVATION signals — WHO'LL sell

A great house isn't a deal until someone wants out. This is where margin lives.

| Signal | Why it predicts a sale |
| --- | --- |
| Absentee / out-of-state owner | Tired of remote management; emotionally detached |
| Length of ownership (long hold) | "Tired landlord": decades in, ready to exit |
| Estimated equity position | High equity = room to discount and still walk happy |
| Portfolio owner | Trades in bulk, thinks in spreadsheets, sells rationally |
| Owner age | Probate-adjacent; estate/lifecycle transitions |
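
Three of these signals can be derived from fields a typical assessor record carries. A rough sketch, assuming out-of-state is detected by comparing mailing state to property state; the thresholds (15 years, 50% equity) are placeholders rather than tuned values, and `owner_flags` is a hypothetical helper.

```python
def owner_flags(
    property_state: str,
    mailing_state: str,
    years_owned: float,
    est_value: float,
    est_loan_balance: float,
    long_hold_years: float = 15.0,   # placeholder threshold, not a tuned value
    high_equity_ratio: float = 0.5,  # placeholder threshold, not a tuned value
) -> dict:
    """Derive boolean owner/motivation flags from basic record fields."""
    if est_value <= 0:
        raise ValueError("est_value must be positive")
    equity_ratio = (est_value - est_loan_balance) / est_value
    return {
        # Catches out-of-state owners only; in-state absentees need
        # a full mailing-vs-situs address comparison.
        "absentee": mailing_state != property_state,
        "long_hold": years_owned >= long_hold_years,
        "high_equity": equity_ratio >= high_equity_ratio,
    }


# Made-up record: Ohio property, California mailing address, 22-year hold.
flags = owner_flags("OH", "CA", 22.0, 200_000.0, 40_000.0)
print(flags)
```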

DISTRESS signals — WHO'LL sell CHEAP

Pressure pushes prices below market. These signals are the discount engine.

| Signal | The pressure |
| --- | --- |
| Tax delinquency | Owes the county; clock is ticking |
| Pre-foreclosure / lis pendens | Lender has filed; motivated and time-boxed |
| Code violations | Fines stacking; can't or won't fix |
| Eviction filings | Landlord fatigue at a breaking point |
| Vacancy (USPS vacant flag) | No income, all carry; bleeding monthly |
| Liens | Encumbrances forcing resolution |

The compound play: one distress flag is noise. Absentee + high-equity + long-hold + a distress flag is a phone call that closes. Stacking is the whole game; see Play A in the-plays.
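
The stacking rule can be sketched as a simple predicate plus a count. This is an illustration only: the flag names are hypothetical labels for the signals in the tables above, and equal weighting is an assumption, not the engine's actual scoring.

```python
from typing import Iterable

# Hypothetical flag names mirroring the owner and distress tables.
OWNER_FLAGS = ("absentee", "high_equity", "long_hold")
DISTRESS_FLAGS = (
    "tax_delinquent",
    "pre_foreclosure",
    "code_violation",
    "eviction_filing",
    "vacant",
    "lien",
)


def stack_score(flags: Iterable[str]) -> int:
    """Count how many known signals are present (equal weights assumed)."""
    present = set(flags)
    return sum(1 for f in OWNER_FLAGS + DISTRESS_FLAGS if f in present)


def is_compound_play(flags: Iterable[str]) -> bool:
    """True when all three owner flags stack with at least one distress flag."""
    present = set(flags)
    has_all_owner = all(f in present for f in OWNER_FLAGS)
    has_distress = any(f in present for f in DISTRESS_FLAGS)
    return has_all_owner and has_distress


lone = ["tax_delinquent"]
stacked = ["absentee", "high_equity", "long_hold", "tax_delinquent"]

print(stack_score(lone), is_compound_play(lone))
print(stack_score(stacked), is_compound_play(stacked))
```

A lone distress flag scores but does not qualify; the same flag on top of the three owner signals does.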

ALT DATA — the real edge

Anyone can pull an assessor record. The differentiated alpha is in data most operators never touch.

| Source | Edge it buys |
| --- | --- |
| Aerial / satellite imagery | Roof age, lot size, deferred maintenance: condition without a visit |
| Street-view imagery | Curb condition, occupancy cues, neighborhood feel at scale |
| Short-term-rental yields (AirDNA) | True income ceiling for STR-viable assets |
| Permit velocity | Where capital is actually flowing, block by block |
| Business openings / closings | Leading edge of neighborhood momentum |

Why alt data wins
EDGE: Condition + income + momentum signals most operators never pull
COMMODITY: Assessor + sale records everyone already has

Fig. 2: Commodity data gets you to the table. Alt data wins the hand.

The data ladder — free first, paid later

This is the cost-discipline spine of the whole engine. County-level alpha is real and free. Paid data is an amplifier you earn into — never a starting cost.

$0 to start: public data first
3,000+ U.S. counties publishing records
1 market proven before any paid feed
Two tiers: free/public (start here) and paid (layer in once ROI is proven; needs approval).

FREE / public — start here

| Source | What it gives us |
| --- | --- |
| County assessor + recorder | Ownership, sales history, assessed value, beds/baths/sqft |
| Tax-delinquency lists | The cleanest distress signal, published by the county |
| Court foreclosure / lis pendens filings | Pre-foreclosure pipeline, straight from the docket |
| City code-violation & building-permit open data | Distress + reno activity, often via open-data portals |
| Census / ACS | Demographics, income, in-migration |
| HUD Fair Market Rents | Free rent baseline for underwriting |
| USPS vacancy | Vacant-property flag; carry-cost distress |
| BLS | Jobs & employment trend by metro |

PAID: layer in once ROI is proven (needs approval)

| Source | What it adds | Note |
| --- | --- | --- |
| ATTOM / CoreLogic / Reonomy | Bulk, cleaned, national property data | Speed & coverage, not new signal |
| MLS feed (RESO / RETS) | Live on-market listings | Needs a license |
| PropStream / BatchLeads | Pre-built motivated-seller lists | Convenience layer over public data |
| AirDNA | STR yield data | Only where STR is the thesis |
| HouseCanary | AVM / valuation model | Sharpens, doesn't replace, our model |
| Rentometer | Rent comps | Validates the HUD baseline |

FREE: County assessor, tax-delinquency, foreclosure docket, code/permits, Census, HUD FMR, USPS, BLS. The full signal stack, $0.
PAID: ATTOM/CoreLogic, MLS, PropStream, AirDNA, HouseCanary. Faster and broader, but an amplifier you earn into.

The point: starting on free data is a constraint, and the constraint is a feature. Building on free, fragmented public records forces sharper models, cleaner pipelines, and a real underwriting edge before a dollar of feed spend. Prove a market on free data; let the ROI buy the upgrade.
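
As a rough sketch of what building on free records looks like in practice, the example below joins an assessor extract to a tax-delinquency list on parcel ID. The column names (`parcel_id`, `owner`, `assessed_value`, `amount_due`) and every row are invented for illustration; real county files vary widely in format and usually need cleaning first.

```python
import csv
import io

# Invented sample extracts; real county files differ in layout and quality.
ASSESSOR_CSV = """parcel_id,owner,assessed_value
001,Smith,180000
002,Jones,95000
003,Lee,210000
"""

DELINQUENT_CSV = """parcel_id,amount_due
002,3400.50
003,1200.00
"""


def load_by_parcel(text: str) -> dict:
    """Index CSV rows by their parcel_id column."""
    return {row["parcel_id"]: row for row in csv.DictReader(io.StringIO(text))}


assessor = load_by_parcel(ASSESSOR_CSV)
delinquent = load_by_parcel(DELINQUENT_CSV)

# Keep only delinquent parcels that also appear in the assessor extract,
# and carry the amount owed onto the ownership record.
flagged = [
    {**assessor[pid], "amount_due": float(row["amount_due"])}
    for pid, row in delinquent.items()
    if pid in assessor
]

for rec in flagged:
    print(rec["parcel_id"], rec["owner"], rec["amount_due"])
```

The same parcel-ID join extends to foreclosure dockets, code violations, and vacancy flags, which is how a stack of free sources becomes one scored record per property.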

Next: the-plays turns these signals into five concrete find-X strategies — and makes the call on which one to run first.