Pulled once via OS-mediated MLX cache from HuggingFace mirror (mlx-community/gemma-4-e4b-it-4bit). Zero monthly hosting fee; weights live on user's disk.
src: LLMRegistry.gemma4_e4b_it_4bit · sanctioned egress per CLAUDE.md inv#1
$0 · user GPU
03 · Inference
MLX runtime
mlx-swift-lm on Apple Silicon. TTFT 192 ms median (n=10, M1 Max, PR-Λ verify-2). User pays the electricity, not the API.
Pinecone vector DB (wondering log embeddings, Standard)
$70 base + usage
~$285.00
E
Vercel Pro (preview + admin)
$20 / seat × 6
~$120.00
Cloud-equivalent recurring total
—
~$5,403.00
∂cost/∂user ≈ $1.08 / student / month · linear with usage · indefinite recurrence
Delta · why this stack wins on the ledger
−$5,403
Δ Monthly
Recurring savings
Cloud build − this build = $5,403 − $0 = $5,403 / mo avoided. Over a 9-month school year: ~$48,627 per district per year not spent on SaaS.
$0 vendor lock-in
Δ Risk
Zero vendor exposure
No API keys to rotate. No price-hike letters. No deprecation notices. The bill cannot go up because there is no bill.
$0 egress
Δ Privacy bill
No data-egress fees, no DPA
NO-CLOUD invariant means student utterances never leave the Mac, so the procurement officer also avoids the unbudgeted line item: legal review of a third-party data-processing agreement.
Sources · CLAUDE.md absolute invariant #1 (no network.client / network.server) · HeWasSocrates.entitlements · claudedocs/bench/2026-05-06-latency-bench.json (PR-Λ TTFT 192 ms n=10) · function_call_contract.yaml · idea.json §3 General (GitHub Pages publication target). Cloud unit prices are public list rates as of 2026-05-07; treat the ~$5,403 figure as a representative counterfactual, not a binding quote. All metrics on this page resolve to a commit, bench file, or vendor pricing page — zero unsourced numbers per spec success criterion #2.