CS-001 / Consumer lending / Eval-first build
An underwriting copilot that loan officers actually trust.
// problem
Twelve underwriters reviewing 800 applications a day. Average decision time was 14 minutes; senior staff were spending half their week on the easiest 60% of files. Two prior in-house attempts at automation had been pulled after they declined applications they should have approved.
// constraints
- Regulated environment — every decision must be explainable in writing
- Adverse action notices need a citable model rationale
- Cannot replace human judgment; must augment it
// approach
What changed.
Discovery
Two weeks shadowing senior underwriters, then a 412-case eval suite drawn from two years of historical decisions, including the contested ones.
Foundations
Built the eval surface first. Every commit ran the suite, and merges were blocked on any regression in approval parity, false-decline rate, or reasoning quality.
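A minimal sketch of what such a merge gate could look like, assuming each suite run is summarized as three numbers. The names SuiteMetrics, find_regressions, and merge_allowed are hypothetical, as are the metric definitions in the comments; the case study does not describe the actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SuiteMetrics:
    """Summary of one eval-suite run (definitions are illustrative assumptions)."""
    approval_parity: float     # share of cases matching the reference decision; higher is better
    false_decline_rate: float  # share of should-approve cases recommended for decline; lower is better
    reasoning_quality: float   # rubric score in [0, 1]; higher is better


def find_regressions(baseline: SuiteMetrics, candidate: SuiteMetrics,
                     tolerance: float = 0.0) -> List[str]:
    """Return the names of metrics where the candidate is worse than the baseline."""
    regressions = []
    if candidate.approval_parity < baseline.approval_parity - tolerance:
        regressions.append("approval_parity")
    if candidate.false_decline_rate > baseline.false_decline_rate + tolerance:
        regressions.append("false_decline_rate")
    if candidate.reasoning_quality < baseline.reasoning_quality - tolerance:
        regressions.append("reasoning_quality")
    return regressions


def merge_allowed(baseline: SuiteMetrics, candidate: SuiteMetrics,
                  tolerance: float = 0.0) -> bool:
    """A merge passes only if no tracked metric regressed."""
    return not find_regressions(baseline, candidate, tolerance)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the engagement.
    main_branch = SuiteMetrics(0.90, 0.05, 0.80)
    pull_request = SuiteMetrics(0.92, 0.07, 0.80)
    print(find_regressions(main_branch, pull_request))
    print(merge_allowed(main_branch, pull_request))
```

Checking each metric against its own direction (higher or lower is better) keeps a gain in one metric from masking a loss in another, which is the failure mode the two earlier automation attempts ran into.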
Build
Retrieval over policy documents, plus a structured-output reasoning trace. The output is a decision recommendation with cited policy clauses, never a final decision.
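One way to picture the structured output is as a small schema with a validator that rejects anything uncited or anything posing as a final decision. This is a sketch under assumptions: the field names, the allowed recommendation labels, and the validate function are hypothetical and not taken from the delivered system.

```python
from dataclasses import dataclass
from typing import List

# Assumed label set; every value is a recommendation, none is a final decision.
ALLOWED_RECOMMENDATIONS = {"recommend_approve", "recommend_decline", "refer_to_senior"}


@dataclass
class Citation:
    document: str  # policy document the clause comes from
    clause: str    # clause identifier within that document
    excerpt: str   # retrieved text supporting the reasoning step


@dataclass
class Recommendation:
    application_id: str
    recommendation: str
    reasoning: List[str]           # ordered reasoning trace, one step per entry
    citations: List[Citation]
    is_final_decision: bool = False  # must stay False; the underwriter decides


def validate(rec: Recommendation) -> List[str]:
    """Return a list of problems; an empty list means the output is usable."""
    problems = []
    if rec.recommendation not in ALLOWED_RECOMMENDATIONS:
        problems.append("unknown recommendation label")
    if not rec.reasoning:
        problems.append("missing reasoning trace")
    if not rec.citations:
        problems.append("no policy clause cited")
    if rec.is_final_decision:
        problems.append("output must not be marked as a final decision")
    return problems


if __name__ == "__main__":
    # Illustrative values only; the document and clause names are placeholders.
    rec = Recommendation(
        application_id="APP-0001",
        recommendation="recommend_approve",
        reasoning=["Debt-to-income ratio is within the cited limit."],
        citations=[Citation("Example Credit Policy", "4.2", "placeholder excerpt")],
    )
    print(validate(rec))
```

Requiring at least one citation at validation time is what would let a written rationale, such as an adverse action notice, point back to a specific policy clause.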
Handoff
Integrated into the existing case-management UI as a side panel. Underwriters can accept, modify, or override; every override becomes a new eval case.
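The override-to-eval-case loop can be sketched as follows, assuming each side-panel review is logged with the action the underwriter took. Review, EvalCase, to_eval_case, and grow_suite are hypothetical names, and deduplicating by application ID is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Review:
    application_id: str
    recommended: str      # what the copilot recommended
    action: str           # "accept", "modify", or "override"
    final_decision: str   # what the underwriter actually decided
    underwriter_note: str = ""


@dataclass
class EvalCase:
    application_id: str
    expected_decision: str
    source: str
    note: str


def to_eval_case(review: Review) -> Optional[EvalCase]:
    """Turn an override into an eval case; other actions produce nothing."""
    if review.action != "override":
        return None
    return EvalCase(
        application_id=review.application_id,
        expected_decision=review.final_decision,
        source="underwriter_override",
        note=review.underwriter_note,
    )


def grow_suite(suite: List[EvalCase], reviews: List[Review]) -> int:
    """Append new override cases to the suite and return how many were added."""
    known = {case.application_id for case in suite}
    added = 0
    for review in reviews:
        case = to_eval_case(review)
        if case is not None and case.application_id not in known:
            suite.append(case)
            known.add(case.application_id)
            added += 1
    return added


if __name__ == "__main__":
    # Illustrative reviews, not real case data.
    suite: List[EvalCase] = []
    reviews = [
        Review("APP-0001", "recommend_decline", "override", "approve", "Income verified offline."),
        Review("APP-0002", "recommend_approve", "accept", "approve"),
    ]
    print(grow_suite(suite, reviews), len(suite))
```

The underwriter's final decision becomes the expected answer for that application, so the next suite run checks whether the copilot has learned from the disagreement.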
// results
Measured outcomes.
Avg. decision time
↓ 64%
14 min → 5 min
Senior-staff time on routine files
↓ 34 pts
52% → 18%
False-decline rate (vs. baseline)
↓ 9%
1.0× → 0.91×
Underwriter trust score (internal NPS)
↑ 35 pts
+12 → +47
We came in with a list of things AI couldn't be trusted to do. They built around that list, not against it.
VP of Credit Risk
An eval suite is also a regulatory artifact. The 412-case suite is now part of the lender's model-risk-management documentation, which means it has to keep growing.