The Intelligence Layer

This is the architecture behind the-deal-engine — the technical heart, kept readable. One principle governs everything below it.

The governing principle: Separate the probabilistic from the deterministic. Models guess value, rent, rehab, and motivation. The underwriting engine computes the returns from those guesses. Never blend the two — don't let an LLM do the math.

That single rule is why the system can be trusted with capital. Estimates are allowed to be fuzzy and improvable. The economics that decide whether money moves are exact, auditable, and the same every time.

Probabilistic: the models · Deterministic: the underwriting · Never blended

The six layers

Read bottom to top. Each layer feeds the one above it; the loop at the top feeds all of them back down.

The stack — bottom to top

Learning loop + explainability

Outcomes recalibrate every model. Every recommendation ships with its "why."

Ranking & decision

Deal quality + motivation + market + confidence → ranked list with a verdict against a buy-box.

Underwriting engine

Deterministic cap rate / cash-on-cash / DSCR / ARV / IRR + kill-floors. A stored procedure, not a vibe.

The models

A portfolio of single-job brains: value, rent, rehab, motivation, mispricing, market index.

Feature store

Versioned, reusable signals per property — the inputs the models eat.

Canonical property graph

Every property a node, linked to owners, transactions, market, and signals.

Fig. 1 — The intelligence layer. Layer 4 stays deterministic by design; the moat lives in layers 3 and 6.

1 — Canonical property graph

Every property is a node, linked to its owner(s), its transaction history, its market, and its signals. The hard, valuable part is entity resolution: the same house shows up across county records, listing feeds, and tax rolls under slightly different names and addresses. We dedup all of it into one clean record with one owner graph. It's the same idea as a knowledge graph, applied to real estate.
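A minimal sketch of the entity-resolution idea, assuming records are matched on a normalized address string alone. The abbreviation table, the `normalize_address` and `resolve` helpers, and the sample records are all hypothetical; a production resolver would likely also use parcel IDs, geocodes, and fuzzy matching.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be far larger.
_ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "road": "rd", "drive": "dr",
    "north": "n", "south": "s", "east": "e", "west": "w",
    "apartment": "unit", "apt": "unit",
}


def normalize_address(raw: str) -> str:
    """Lowercase, strip punctuation, and collapse common abbreviations."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(_ABBREVIATIONS.get(tok, tok) for tok in cleaned.split())


def resolve(records: list) -> dict:
    """Group raw records by normalized address; each group becomes one property node."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_address(rec["address"])].append(rec)
    return dict(groups)


# Made-up records standing in for county, listing, and tax-roll feeds.
records = [
    {"source": "county", "address": "123 North Main Street"},
    {"source": "listing", "address": "123 N. Main St."},
    {"source": "tax_roll", "address": "123 N MAIN ST"},
    {"source": "county", "address": "45 Oak Avenue, Apt 2"},
]
nodes = resolve(records)
for key, recs in nodes.items():
    print(key, [r["source"] for r in recs])
```

The three spellings of the Main Street address collapse into one node, while the Oak Avenue unit stays separate.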

2 — Feature store

Computed signals per property: equity %, hold-time, rent-gap, distress flags, comp set, demographics. Versioned and reusable, so a signal computed once feeds every model that needs it — and we can always reproduce exactly what a model saw. → signals-and-data
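One way to picture "versioned and reusable" is an append-only store keyed by property, signal name, and version. The `FeatureStore` class, its method names, and the sample values below are hypothetical and only illustrate the idea; the actual store may look quite different.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FeatureKey:
    property_id: str
    name: str
    version: int


class FeatureStore:
    """Append-only store: every write creates a new version (hypothetical API)."""

    def __init__(self) -> None:
        self._values = {}  # FeatureKey -> (value, as_of date)

    def latest_version(self, property_id: str, name: str) -> int:
        versions = [k.version for k in self._values
                    if k.property_id == property_id and k.name == name]
        return max(versions, default=0)

    def put(self, property_id: str, name: str, value: float, as_of: date) -> int:
        """Store a new version and return its number; old versions are never overwritten."""
        version = self.latest_version(property_id, name) + 1
        self._values[FeatureKey(property_id, name, version)] = (value, as_of)
        return version

    def get(self, property_id: str, name: str, version: Optional[int] = None) -> float:
        """Read the latest value, or the exact version a past model run saw."""
        if version is None:
            version = self.latest_version(property_id, name)
        return self._values[FeatureKey(property_id, name, version)][0]


# Made-up values for illustration.
store = FeatureStore()
v1 = store.put("prop-001", "equity_pct", 0.42, date(2024, 1, 1))
v2 = store.put("prop-001", "equity_pct", 0.45, date(2024, 2, 1))
print(store.get("prop-001", "equity_pct"))             # latest value
print(store.get("prop-001", "equity_pct", version=1))  # what an earlier run saw
```

Because writes never overwrite, a model run that records the versions it read can be replayed later against identical inputs.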

3 — The models (the brains)

A portfolio, not a monolith. Each model does one job well:

AVM — estimated value
Rent model — achievable rent
Rehab / condition estimator — cost to make ready
Motivation-to-sell classifier — how likely this owner sells, and why
Mispricing detector — gap between price and worth
Market opportunity index — where to hunt

4 — The underwriting engine

This is where the probabilistic stops and the deterministic begins. The models hand it estimates; it computes the economics (cap rate, cash-on-cash, DSCR, ARV, IRR) and applies kill-floors that pass or reject the deal. The math already lives in stored procedures, so the same inputs produce the same outputs every time: a stored procedure, not a vibe. → Research Pipeline
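To make the hand-off concrete, here is a sketch of three of those metrics in Python. It is an illustration only: the source implements this logic as stored procedures, and the field names, the `underwrite` function, the kill-floor thresholds, and the sample numbers are assumptions. The formulas are the standard definitions: cap rate = NOI / price, cash-on-cash = annual pre-tax cash flow / cash invested, DSCR = NOI / annual debt service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimates:
    """Model outputs and deal terms handed to the engine (hypothetical field names)."""
    purchase_price: float
    annual_rent: float
    annual_operating_expenses: float
    annual_debt_service: float
    cash_invested: float


@dataclass(frozen=True)
class KillFloors:
    """Illustrative thresholds only; real floors would come from the buy-box."""
    min_cap_rate: float = 0.06
    min_cash_on_cash: float = 0.08
    min_dscr: float = 1.25


def underwrite(e: Estimates, floors: KillFloors = KillFloors()) -> dict:
    """Pure function: identical inputs always yield identical outputs."""
    noi = e.annual_rent - e.annual_operating_expenses
    metrics = {
        "cap_rate": noi / e.purchase_price,
        "cash_on_cash": (noi - e.annual_debt_service) / e.cash_invested,
        "dscr": noi / e.annual_debt_service,
    }
    failures = []
    if metrics["cap_rate"] < floors.min_cap_rate:
        failures.append("cap_rate")
    if metrics["cash_on_cash"] < floors.min_cash_on_cash:
        failures.append("cash_on_cash")
    if metrics["dscr"] < floors.min_dscr:
        failures.append("dscr")
    return {**metrics, "passed": not failures, "failed_floors": failures}


# Made-up numbers for illustration.
good = underwrite(Estimates(200_000, 24_000, 8_000, 10_000, 50_000))
bad = underwrite(Estimates(200_000, 14_000, 8_000, 10_000, 50_000))
print(good)
print(bad)
```

The estimates can be wrong, but the function never is: a rejected deal can always be traced to the specific floor it failed.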

5 — Ranking & decision

Combine deal quality, motivation, market strength, and confidence into a single ranked list, each row carrying a verdict — pursue / watch / pass — measured against a configurable buy-box. Different lanes and different investors run different buy-boxes, so the same property can be a "pursue" for one and a "pass" for another.
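The sketch below shows how the same property can earn different verdicts under different buy-boxes. It assumes the four inputs are already scaled to 0..1 and combined by a weighted sum; the source does not specify the combination rule, so the weights, thresholds, `BuyBox` class, and sample properties are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuyBox:
    """Hypothetical buy-box: per-signal weights plus verdict thresholds."""
    name: str
    weights: dict
    pursue_at: float
    watch_at: float


def score(signals: dict, box: BuyBox) -> float:
    """Weighted sum of the signals named in the buy-box."""
    return sum(weight * signals[key] for key, weight in box.weights.items())


def verdict(signals: dict, box: BuyBox) -> str:
    s = score(signals, box)
    if s >= box.pursue_at:
        return "pursue"
    if s >= box.watch_at:
        return "watch"
    return "pass"


def rank(candidates: dict, box: BuyBox) -> list:
    """Return (property_id, score, verdict) rows, best score first."""
    rows = [(pid, round(score(sig, box), 4), verdict(sig, box))
            for pid, sig in candidates.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up candidates and buy-boxes for illustration.
candidates = {
    "A": {"deal_quality": 0.9, "motivation": 0.3, "market": 0.6, "confidence": 0.8},
    "B": {"deal_quality": 0.5, "motivation": 0.9, "market": 0.5, "confidence": 0.7},
}
flip = BuyBox("flip", {"deal_quality": 0.5, "motivation": 0.1,
                       "market": 0.2, "confidence": 0.2}, 0.70, 0.55)
off_market = BuyBox("off_market", {"deal_quality": 0.2, "motivation": 0.6,
                                   "market": 0.1, "confidence": 0.1}, 0.70, 0.55)
print(rank(candidates, flip))
print(rank(candidates, off_market))
```

Property A ranks first and is a "pursue" under the flip box, but falls to a "pass" under the off-market box, which weights owner motivation most heavily.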

6 — Learning loop + explainability

Outcomes recalibrate every model, closing the loop that makes the whole system compound. And every recommendation ships with its why — the specific signals that drove the score. The loop is the moat; the explainability is the trust.
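As a rough illustration of both halves, the sketch below recalibrates a value estimate with a simple multiplicative correction learned from realized outcomes, and explains a score by listing its largest weighted contributions. Both techniques are stand-ins chosen for brevity, not the system's actual method; the function names, weights, and numbers are hypothetical.

```python
from statistics import mean


def calibration_factor(outcomes: list) -> float:
    """Average ratio of actual to predicted over (predicted, actual) pairs."""
    return mean([actual / predicted for predicted, actual in outcomes])


def explain(weights: dict, signals: dict, top_n: int = 2) -> list:
    """Return the top_n signals by absolute contribution (weight * value)."""
    contributions = {name: weights[name] * signals[name] for name in weights}
    ordered = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ordered[:top_n]


# Made-up closed deals: the model predicted low by about 5% both times.
outcomes = [(200_000, 210_000), (300_000, 315_000)]
factor = calibration_factor(outcomes)
recalibrated = 250_000 * factor  # apply the correction to the next raw estimate
print(factor, recalibrated)

# Made-up linear score weights and one property's signals.
weights = {"equity_pct": 0.5, "hold_years": 0.02, "distress_flag": 0.3}
signals = {"equity_pct": 0.8, "hold_years": 12, "distress_flag": 1}
why = explain(weights, signals)
print(why)
```

The explanation is only as faithful as the scoring function is simple; a nonlinear model would need a different attribution method.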

Where AI fits — and where it doesn't

The quantitative core stays deterministic. The LLM layer (DSPy) handles language, not arithmetic:

parsing listing descriptions and permit text into structured signals
generating the narrative deal summary a human reads
answering natural-language "find me X" queries against the graph

The math is never an LLM. The LLM is the interpreter and the narrator; the underwriting engine is the accountant.

Confidence and data quality

Every estimate carries a confidence score. Low-confidence deals don't get quietly buried or blindly trusted — they get flagged for human verification before any capital decision. The system knows what it doesn't know, and says so.
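A minimal sketch of that gate, assuming each estimate arrives as a (value, confidence) pair with confidence in 0..1. The 0.7 threshold, the `route` function, and the status labels are hypothetical.

```python
def route(estimates: dict, min_confidence: float = 0.7) -> dict:
    """Flag a deal for human verification if any estimate falls below the threshold.

    estimates maps an estimate name to a (value, confidence) pair.
    """
    low = sorted(name for name, (_, conf) in estimates.items()
                 if conf < min_confidence)
    status = "needs_human_verification" if low else "auto_eligible"
    return {"status": status, "low_confidence": low}


# Made-up estimates for illustration.
deal = {
    "value": (240_000, 0.91),
    "rent": (1_900, 0.85),
    "rehab": (35_000, 0.52),  # sparse condition data, so low confidence
}
decision = route(deal)
print(decision)
```

The deal is neither dropped nor trusted: it is routed to a person along with the names of the estimates that need checking.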

The close: Let the models guess and improve. Let the underwriting compute and never lie. Let the loop make both better every month. That separation is the architecture — and the edge.

the-deal-engine — the overview this architecture serves.
kpis-and-reports — how model accuracy and the loop's gains are measured.
signals-and-data · the-plays · deal-lifecycle · the-connector · tokenomics