The Intelligence Layer
This is the architecture behind the-deal-engine: the technical heart, kept readable. One principle governs everything below it: models estimate, and deterministic math decides.
That single rule is why the system can be trusted with capital. Estimates are allowed to be fuzzy and improvable. The economics that decide whether money moves are exact, auditable, and the same every time.
The six layers
Read in order, 1 through 6. Each layer feeds the next; the learning loop in layer 6 feeds all of them back.
1 — Canonical property graph
Every property is a node, linked to its owners, its transaction history, its market, and its signals. The hard, valuable part is entity resolution: the same house shows up across county records, listing feeds, and tax rolls under slightly different names and addresses. We deduplicate all of it into one clean record with one owner graph. It's the same idea as a knowledge graph, applied to real estate.
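To make entity resolution concrete, here is a minimal sketch that groups records from different sources by a normalized address key. The abbreviation table, the record shape, and the function names are assumptions for illustration, not the system's actual matcher, which would likely also need fuzzy matching and owner-name resolution.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a production system would use a fuller standard list.
_ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "road": "rd", "drive": "dr",
    "north": "n", "south": "s", "east": "e", "west": "w",
}

def normalize_address(raw: str) -> str:
    """Lowercase, drop punctuation, and collapse common word variants."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [_ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)

def resolve(records: list[dict]) -> dict[str, list[dict]]:
    """Group source records that normalize to the same address key."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        groups[normalize_address(rec["address"])].append(rec)
    return dict(groups)

records = [
    {"source": "county", "address": "123 North Main Street"},
    {"source": "listing", "address": "123 N. Main St"},
    {"source": "tax_roll", "address": "123 N MAIN ST."},
    {"source": "county", "address": "45 Oak Avenue"},
]
resolved = resolve(records)
for key, group in resolved.items():
    print(key, [r["source"] for r in group])
```

Exact-key grouping like this only catches formatting variants; typos and unit numbers would need a similarity score on top.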
2 — Feature store
Computed signals per property: equity %, hold-time, rent-gap, distress flags, comp set, demographics. Versioned and reusable, so a signal computed once feeds every model that needs it — and we can always reproduce exactly what a model saw. → signals-and-data
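One way to picture "versioned and reproducible" is an append-only store where every write creates a new version and reads can pin an old one. This is a sketch under assumptions: the class, field names, and in-memory storage are hypothetical, not the actual feature store.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class FeatureValue:
    property_id: str
    name: str
    version: int   # 1-based, increments on every write
    value: float
    as_of: date

class FeatureStore:
    """Append-only, in-memory store (hypothetical); nothing is ever overwritten."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str], list[FeatureValue]] = {}

    def put(self, property_id: str, name: str, value: float, as_of: date) -> FeatureValue:
        history = self._rows.setdefault((property_id, name), [])
        row = FeatureValue(property_id, name, len(history) + 1, value, as_of)
        history.append(row)
        return row

    def get(self, property_id: str, name: str, version: Optional[int] = None) -> FeatureValue:
        """Return the latest value, or the exact version a past model run saw."""
        history = self._rows[(property_id, name)]
        return history[-1] if version is None else history[version - 1]

store = FeatureStore()
store.put("prop-1", "equity_pct", 0.42, date(2024, 1, 1))
store.put("prop-1", "equity_pct", 0.47, date(2024, 6, 1))
latest = store.get("prop-1", "equity_pct")
pinned = store.get("prop-1", "equity_pct", version=1)
```

Pinning a version is what lets a past model run be replayed against exactly the inputs it originally saw.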
3 — The models (the brains)
A portfolio, not a monolith. Each model does one job well:
- AVM — estimated value
- Rent model — achievable rent
- Rehab / condition estimator — cost to make ready
- Motivation-to-sell classifier — how likely this owner sells, and why
- Mispricing detector — gap between price and worth
- Market opportunity index — where to hunt
4 — The underwriting engine
This is where the probabilistic stops and the deterministic begins. The models hand it estimates; it computes the economics (cap rate, cash-on-cash, DSCR, ARV, IRR) and applies kill-floors that pass or reject the deal. The math already lives in stored procedures, not in a vibe. → Research Pipeline
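The deterministic step can be sketched in a few lines. This is a Python illustration only, not the stored procedures themselves. It uses the standard definitions (cap rate = NOI / price, cash-on-cash = annual pre-tax cash flow / cash invested, DSCR = NOI / annual debt service); the floor values and all names are hypothetical, and floats are used for brevity where a production engine would likely use exact decimals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DealInputs:
    price: float
    noi: float                  # net operating income, annual
    annual_debt_service: float  # assumed > 0, i.e. a financed deal
    cash_invested: float

@dataclass(frozen=True)
class KillFloors:
    # Hypothetical thresholds; real floors would come from the buy-box config.
    min_cap_rate: float = 0.06
    min_cash_on_cash: float = 0.08
    min_dscr: float = 1.25

def underwrite(deal: DealInputs, floors: KillFloors = KillFloors()) -> dict:
    """Compute the economics, then apply every kill-floor. Same inputs, same output."""
    cap_rate = deal.noi / deal.price
    cash_on_cash = (deal.noi - deal.annual_debt_service) / deal.cash_invested
    dscr = deal.noi / deal.annual_debt_service

    failures = []
    if cap_rate < floors.min_cap_rate:
        failures.append("cap_rate")
    if cash_on_cash < floors.min_cash_on_cash:
        failures.append("cash_on_cash")
    if dscr < floors.min_dscr:
        failures.append("dscr")

    return {
        "cap_rate": cap_rate,
        "cash_on_cash": cash_on_cash,
        "dscr": dscr,
        "passed": not failures,
        "failures": failures,
    }

good = underwrite(DealInputs(price=200_000, noi=16_000,
                             annual_debt_service=10_000, cash_invested=50_000))
weak = underwrite(DealInputs(price=200_000, noi=11_000,
                             annual_debt_service=10_000, cash_invested=50_000))
```

Returning the list of failed floors, rather than a bare boolean, keeps each rejection auditable.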
5 — Ranking & decision
Combine deal quality, motivation, market strength, and confidence into a single ranked list, each row carrying a verdict — pursue / watch / pass — measured against a configurable buy-box. Different lanes and different investors run different buy-boxes, so the same property can be a "pursue" for one and a "pass" for another.
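A minimal sketch of how one property can earn different verdicts under different buy-boxes. The weights, thresholds, and field names here are assumptions chosen for illustration; the source does not specify how the four inputs are combined.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    property_id: str
    price: float
    deal_quality: float     # all four scores assumed to be on a 0..1 scale
    motivation: float
    market_strength: float
    confidence: float

@dataclass(frozen=True)
class BuyBox:
    name: str
    max_price: float
    pursue_score: float
    watch_score: float

# Hypothetical weights; they sum to 1.0.
WEIGHTS = {"deal_quality": 0.40, "motivation": 0.25,
           "market_strength": 0.20, "confidence": 0.15}

def score(c: Candidate) -> float:
    return sum(weight * getattr(c, field) for field, weight in WEIGHTS.items())

def verdict(c: Candidate, box: BuyBox) -> str:
    if c.price > box.max_price:
        return "pass"
    s = score(c)
    if s >= box.pursue_score:
        return "pursue"
    if s >= box.watch_score:
        return "watch"
    return "pass"

def rank(candidates: list[Candidate], box: BuyBox) -> list[tuple[str, float, str]]:
    """Highest score first; each row carries its verdict for this buy-box."""
    rows = [(c.property_id, score(c), verdict(c, box)) for c in candidates]
    return sorted(rows, key=lambda row: row[1], reverse=True)

house = Candidate("prop-1", 250_000, 0.8, 0.6, 0.7, 0.9)
box_a = BuyBox("investor-a", max_price=300_000, pursue_score=0.70, watch_score=0.50)
box_b = BuyBox("investor-b", max_price=200_000, pursue_score=0.70, watch_score=0.50)
print(verdict(house, box_a), verdict(house, box_b))
```

Here the property clears investor-a's box but exceeds investor-b's price ceiling, so the verdicts differ even though the score is identical.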
6 — Learning loop + explainability
Outcomes recalibrate every model, closing the loop that makes the whole system compound. And every recommendation ships with its why — the specific signals that drove the score. The loop is the moat; the explainability is the trust.
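The "why" attached to a recommendation could be as simple as the few signals that moved the score most. The sketch below assumes per-signal contributions are already available (for example, weight times value in a linear scorer); how the real models attribute a score is not specified here, and the names are hypothetical.

```python
def explain(contributions: dict[str, float], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k signals ranked by absolute contribution to the score."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]

# Hypothetical contributions for one property; negative values pulled the score down.
contributions = {
    "equity_pct": 0.21,
    "hold_time": 0.08,
    "rent_gap": -0.15,
    "distress_flag": 0.30,
}
why = explain(contributions)
```

Ranking by absolute value keeps the signals that lowered the score visible alongside those that raised it.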
Where AI fits — and where it doesn't
The quantitative core stays deterministic. The LLM layer (DSPy) handles language, not arithmetic:
- parsing listing descriptions and permit text into structured signals
- generating the narrative deal summary a human reads
- answering natural-language "find me X" queries against the graph
The math is never an LLM. The LLM is the interpreter and the narrator; the underwriting engine is the accountant.
Confidence and data quality
Every estimate carries a confidence score. Low-confidence deals don't get quietly buried or blindly trusted — they get flagged for human verification before any capital decision. The system knows what it doesn't know, and says so.
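The flagging rule can be sketched as a simple gate: if any estimate falls below a confidence threshold, the deal is routed to a human rather than dropped or auto-approved. The threshold value and names are assumptions for illustration.

```python
REVIEW_THRESHOLD = 0.7  # hypothetical cut-off

def route_for_review(confidences: dict[str, float],
                     threshold: float = REVIEW_THRESHOLD) -> dict:
    """Flag the deal for human verification if any estimate is low-confidence."""
    low = sorted(name for name, conf in confidences.items() if conf < threshold)
    return {"needs_human_review": bool(low), "low_confidence_estimates": low}

decision = route_for_review({"avm": 0.92, "rent": 0.55, "rehab": 0.61})
clean = route_for_review({"avm": 0.92, "rent": 0.88, "rehab": 0.81})
```

Listing which estimates tripped the gate tells the reviewer where to look first.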
Related
- the-deal-engine — the overview this architecture serves.
- kpis-and-reports — how model accuracy and the loop's gains are measured.
- signals-and-data · the-plays · deal-lifecycle · the-connector · tokenomics