TTFT · first token [PR-Λ verify-2]
192 ms p50
p95 241 ms · n 10 · baseline 4 612 ms · Δ 24×
src: claudedocs/bench/2026-05-06-latency-bench.json
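The Δ figure on this card appears to be the baseline latency divided by the current p50. That reading is an assumption, since the card does not say how Δ is computed, and the helper name `speedup` is hypothetical. A minimal sketch of the arithmetic:

```python
def speedup(baseline_ms: float, current_ms: float) -> float:
    """Return how many times faster the current latency is than the baseline."""
    if current_ms <= 0:
        raise ValueError("current_ms must be positive")
    return baseline_ms / current_ms


# Figures taken from the card: baseline 4 612 ms, p50 192 ms.
ratio = speedup(4612, 192)
print(f"{ratio:.1f}x")  # prints 24.0x
```

Under that assumption the reported 24× matches the card's own numbers.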
KV cache · disk-mediated [hit]
94.2% hit
cold loads 1 · warm reuses 17 · cache size 312 MB
src: PR #32 disk-mediated KV reuse · PR-Λ
Audio · STT engine [on-device]
idle phase
SFSpeech onDevice=true · locales ko_KR, en_US · buf 0/1024
src: AudioInputManager.swift · SPEC iter6
Viseme · frame budget [in-spec]
28.7 ms / frame
target ≤33.3 ms · fps 30 · drift 11 ms · RM-fps 12
src: VisemeDriver.swift · >50ms drift triggers alert
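The 33.3 ms target follows from the 30 fps rate (1000 ms / 30), and the source line gives 50 ms as the drift alert threshold. The sketch below restates both checks. The function names and the returned dictionary are hypothetical and are not taken from VisemeDriver.swift.

```python
def frame_budget_ms(fps: float) -> float:
    """Per-frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


def check_frame(frame_ms: float, drift_ms: float,
                fps: float = 30.0, drift_alert_ms: float = 50.0) -> dict:
    """Compare a measured frame time and drift against the card's limits."""
    budget = frame_budget_ms(fps)
    return {
        "budget_ms": budget,
        "in_spec": frame_ms <= budget,            # frame fits in the budget
        "drift_alert": drift_ms > drift_alert_ms  # alert only above 50 ms
    }


# Values from the card: 28.7 ms per frame, 11 ms drift.
status = check_frame(frame_ms=28.7, drift_ms=11.0)
```

With the card's values, `status["in_spec"]` is true and `status["drift_alert"]` is false, which agrees with the in-spec badge.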
Function dispatch · 24h [4 / 4 OK]
1 246 calls
mode_classify 612 · ask_back 518 · defer 94 · surface 22
src: function_call_contract.yaml · Gemma 4 native fc
Model weight integrity [verified]
SHA-256 match
variant gemma-4-e4b-it-4bit · size 3.97 GB · last 09:00
src: mlx-community via OS-mediated MLX cache
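A SHA-256 match on a multi-gigabyte weight file is usually computed by streaming the file in chunks rather than reading it whole. The sketch below shows one way to do that with Python's standard `hashlib`. The function names, the chunk size, and how the expected digest is obtained are all assumptions, and this is not the project's actual verification code.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so large weight files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: str, expected_hex: str) -> bool:
    """Return True when the file's digest equals the expected digest."""
    actual = sha256_of_file(path)
    # compare_digest avoids timing differences; case is normalized first.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Calling `verify_weights(path, expected_hex)` with the published digest of the weight file would return true for a "match" state like the one shown on the card.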
Entitlements audit [NO-CLOUD]
0 net entitlements
network.client absent · network.server absent · audio-input granted
src: HeWasSocrates.entitlements · CLAUDE.md invariant #1
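An audit like this can be reproduced by parsing the entitlements property list and checking that no network key is granted. The sketch below assumes the card's short names map to the standard `com.apple.security.*` entitlement keys. The function name and result layout are hypothetical.

```python
import plistlib

# Assumed full key names for the card's "network.client" / "network.server".
NETWORK_KEYS = (
    "com.apple.security.network.client",
    "com.apple.security.network.server",
)
AUDIO_KEY = "com.apple.security.device.audio-input"


def audit_entitlements(plist_bytes: bytes) -> dict:
    """Report granted network entitlements and the audio-input grant."""
    entitlements = plistlib.loads(plist_bytes)
    granted_net = [k for k in NETWORK_KEYS if entitlements.get(k) is True]
    return {
        "net_entitlements": granted_net,
        "no_cloud": not granted_net,  # true only when no network key is set
        "audio_input": entitlements.get(AUDIO_KEY) is True,
    }
```

Feeding it the bytes of the `.entitlements` file would give an empty `net_entitlements` list and `no_cloud` set to true for the state shown on the card.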
Asset inventory · STT models [installed]
2 / 2 locales
ko_KR 198 MB · en_US 176 MB · install single
src: Preflight.swift (PR #33) · macOS 26 AssetInventory
Wondering log · on disk [local-only]
14.3 MB · 412 entries
oldest 2026-02-11 · dedup-fp SHA-256 · egress 0 B
src: Core Data · Phase 4 surface stub wired