CS-001 / Consumer lending / Eval-first build

An underwriting copilot that loan officers actually trust.

Twelve underwriters were reviewing 800 applications a day. Average decision time was 14 minutes; senior staff were spending half their week on the easiest 60% of files.

14 weeks · 2025 · A US-based mid-market consumer lender with $4B in originations.

// problem

Twelve underwriters were reviewing 800 applications a day. Average decision time was 14 minutes, and senior staff were spending half their week on the easiest 60% of files. Two prior in-house attempts at automation had been pulled after they declined applications that should have been approved.

// constraints

Regulated environment — every decision must be explainable in writing
Adverse action notices need a citable model rationale
Cannot replace human judgment; must augment it

// approach

What changed.

Discovery

Spent two weeks shadowing senior underwriters, then built a 412-case eval suite drawn from two years of historical decisions, including the contested ones.

Foundations

Built the eval surface first. Every commit ran the suite, and merges were blocked on any regression in approval parity, false-decline rate, or reasoning quality.
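A merge gate of this kind might look like the sketch below. The case study does not say how the three metrics were computed or compared, so the metric definitions, the `SuiteMetrics` type, and the `tolerance` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SuiteMetrics:
    """Aggregate scores from one run of the eval suite (definitions assumed)."""
    approval_parity: float     # share of cases matching the reference decision; higher is better
    false_decline_rate: float  # share of should-approve cases recommended for decline; lower is better
    reasoning_quality: float   # rubric score from 0 to 1; higher is better


def regressions(baseline: SuiteMetrics, candidate: SuiteMetrics,
                tolerance: float = 0.0) -> List[str]:
    """Return the names of metrics on which the candidate is worse than the baseline."""
    failed = []
    if candidate.approval_parity < baseline.approval_parity - tolerance:
        failed.append("approval_parity")
    if candidate.false_decline_rate > baseline.false_decline_rate + tolerance:
        failed.append("false_decline_rate")
    if candidate.reasoning_quality < baseline.reasoning_quality - tolerance:
        failed.append("reasoning_quality")
    return failed


def merge_allowed(baseline: SuiteMetrics, candidate: SuiteMetrics,
                  tolerance: float = 0.0) -> bool:
    """A merge passes the gate only if no tracked metric regressed."""
    return not regressions(baseline, candidate, tolerance)
```

In this sketch a commit that improves two metrics but worsens the false-decline rate is still blocked, which matches the stated rule that a regression on any one of the three stops the merge.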

Build

Retrieval over policy documents, plus a structured-output reasoning trace. The output is a decision recommendation with cited policy clauses, never a final decision.
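One way to picture the structured output is the schema sketched below. The actual schema is not described in the case study; every type and field name here is hypothetical. The sketch only encodes the two stated properties: the output is a recommendation rather than a final decision, and each reasoning step cites policy clauses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Recommendation(Enum):
    # Only recommendation values exist; there is deliberately no "final" state.
    RECOMMEND_APPROVE = "recommend_approve"
    RECOMMEND_DECLINE = "recommend_decline"
    REFER_TO_UNDERWRITER = "refer_to_underwriter"


@dataclass(frozen=True)
class Citation:
    document: str  # policy document the clause was retrieved from
    clause: str    # clause identifier within that document
    excerpt: str   # retrieved text supporting the claim


@dataclass(frozen=True)
class ReasoningStep:
    claim: str
    citations: Tuple[Citation, ...]


@dataclass(frozen=True)
class CopilotOutput:
    application_id: str
    recommendation: Recommendation
    trace: Tuple[ReasoningStep, ...]


def validate(output: CopilotOutput) -> List[str]:
    """Return a list of problems; an empty list means the output can be shown."""
    problems = []
    if not output.trace:
        problems.append("reasoning trace is empty")
    for i, step in enumerate(output.trace):
        if not step.citations:
            problems.append(f"step {i} has no cited policy clause")
    return problems
```

Rejecting any step without a citation before it reaches the underwriter is one plausible way to keep the written rationale citable, as the adverse-action constraint requires.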

Handoff

Integrated into the existing case-management UI as a side panel. Underwriters can accept, modify, or override; every override becomes a new eval case.
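The override-to-eval-case loop could be sketched as follows. The event and case shapes, the `"override"` action label, and the case-id scheme are assumptions; the case study states only that every override becomes a new eval case.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ReviewEvent:
    """One underwriter action in the side panel (hypothetical shape)."""
    application_id: str
    copilot_recommendation: str
    underwriter_action: str    # "accept", "modify", or "override"
    underwriter_decision: str
    underwriter_note: str


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    application_id: str
    expected_decision: str
    rationale: str
    source: str


def add_override_cases(suite: List[EvalCase],
                       events: Iterable[ReviewEvent]) -> List[EvalCase]:
    """Return a new suite with one case appended per override not already present."""
    grown = list(suite)
    seen = {case.case_id for case in suite}
    for event in events:
        if event.underwriter_action != "override":
            continue
        case_id = f"override-{event.application_id}"
        if case_id in seen:
            continue  # keep the operation idempotent across repeated runs
        grown.append(EvalCase(
            case_id=case_id,
            application_id=event.application_id,
            # The underwriter's decision becomes the expected answer.
            expected_decision=event.underwriter_decision,
            rationale=event.underwriter_note,
            source="override",
        ))
        seen.add(case_id)
    return grown
```

Treating the underwriter's decision as the expected answer means the suite keeps encoding human judgment, not the copilot's earlier output.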

// results

Measured outcomes.

Avg. decision time

↓ 64%

14 min → 5 min

Senior-staff time on routine files

↓ 34 pts

52% → 18%

False-decline rate (vs. baseline)

↓ 9%

1.0× → 0.91×

Underwriter trust score (internal NPS)

↑ 35 pts

+12 → +47
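The figures above mix two kinds of change: relative change (decision time, false-decline rate) and percentage-point or score-point change (senior-staff time, trust score). A small sketch, with helper names chosen here for illustration, reproduces each headline number from its before and after values.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Change relative to the starting value, in percent."""
    return (after - before) / before * 100


def point_change(before: float, after: float) -> float:
    """Absolute difference, for values already expressed as percentages or scores."""
    return after - before


print(round(relative_change_pct(14, 5)))      # decision time, 14 min to 5 min: -64
print(point_change(52, 18))                   # senior-staff time, 52% to 18%: -34
print(round(relative_change_pct(1.0, 0.91)))  # false-decline rate, 1.0x to 0.91x: -9
print(point_change(12, 47))                   # trust score, +12 to +47: 35
```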

We came in with a list of things AI couldn't be trusted to do. They built around that list, not against it.
VP of Credit Risk

An eval suite is also a regulatory artifact. The 412-case suite is now part of the lender's model-risk-management documentation, which means it has to keep growing.