# He Was Socrates
Submission: The Gemma 4 Good Hackathon (Kaggle × Google DeepMind, $200K, deadline 2026-05-19 08:59 KST). Categories: Impact-focused / Education (primary) · Technical (secondary). See § Hackathon submission.
## Quick start
Tested on macOS 26 Tahoe, M1 Max / M2 Pro / M3. Requires Xcode 15.2+ for the .app target; engine layer builds with Command Line Tools alone.
```shell
# clone
git clone https://github.com/Two-Weeks-Team/he-was-socrates.git
cd he-was-socrates

# 1. toolchain audit (Swift / Xcode / xcodegen / python3 / swift-format)
make doctor

# 2. engine layer — 65 swift-testing scenarios, ~1.5 s on M2 Pro
make engine-test

# 3. app target (regenerates xcodeproj, builds Debug, opens fullscreen bust)
make xcodeproj && make app
open apps/macos/HeWasSocrates/build/Debug/HeWasSocrates.app

# first launch downloads ~3.97 GB Gemma 4 E4B 4-bit weights via OS-mediated MLX cache
# (only sanctioned network egress — entitlements file has no network.client)
```
Press Space to talk. Release to think. The bust returns a question, not an answer.
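The press-to-talk / release-to-think loop can be sketched as a tiny state machine. All type and method names here are illustrative, not the app's actual API:

```swift
// Sketch of the push-to-talk interaction loop.
// States and transitions are illustrative; the app's real types differ.
enum BustPhase {
    case idle, listening, thinking, speaking
}

struct PushToTalk {
    private(set) var phase: BustPhase = .idle

    mutating func spacebarDown() {   // press Space: start on-device STT
        if phase == .idle { phase = .listening }
    }
    mutating func spacebarUp() {     // release: hand the transcript to Gemma
        if phase == .listening { phase = .thinking }
    }
    mutating func responseReady() {  // model returned a question: speak it
        if phase == .thinking { phase = .speaking }
    }
    mutating func speechFinished() { // TTS done: back to idle
        if phase == .speaking { phase = .idle }
    }
}
```

Guarding every transition on the current phase means a stray key repeat or late callback cannot skip a state.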
## Why on-device
- Zero bytes leave the device. No `network.client` / `network.server` entitlement. Verify: `apps/macos/HeWasSocrates/Resources/HeWasSocrates.entitlements`.
- STT is on-device. `SFSpeechRecognizer` with `requiresOnDeviceRecognition = true`. Pull the Wi-Fi mid-demo — nothing breaks.
- The abstention mechanic is the product. `defer_to_human` handles medical / legal / financial / emergency / welfare / insurance topics. An honest hand-off, not an answering machine.
- Korean tone is locked. 단정한 평어체 (a composed, plain register) — neither 존댓말 (formal-polite speech) nor friendly banmal (casual speech). The system prompt is verbatim user-authored at `Sources/SocraticEngine/Gemma/SystemPrompt.swift`.
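The zero-egress claim reduces to a mechanical check: the entitlements plist must contain no `com.apple.security.network.*` keys. A minimal sketch of that check, using an inline sample plist for illustration (the real file lives at the path above):

```swift
import Foundation

// Illustrative sandbox-only entitlements XML; not the repo's actual file.
let sample = """
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.app-sandbox</key><true/>
</dict>
</plist>
"""

// Parse the plist and look for any network entitlement keys.
let plist = try! PropertyListSerialization.propertyList(
    from: Data(sample.utf8), format: nil) as! [String: Any]
let networkKeys = plist.keys.filter { $0.hasPrefix("com.apple.security.network.") }
print(networkKeys.isEmpty ? "no network egress entitlements" : "FAIL: \(networkKeys)")
// prints "no network egress entitlements"
```

Pointing the same check at the checked-in entitlements file (instead of the inline sample) makes it a one-line CI guard.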
## Architecture
```
Spacebar press
 → AudioInputManager         // SFSpeechRecognizer + AVAudioEngine, push-to-talk
 → FunctionCallOrchestrator  // Gemma 4 native function calling
     ├── mode_classify
     ├── surface_past_wonder // 256K context, multi-year recall (Phase 4 wiring)
     ├── ask_back            // returns a Socratic question
     └── defer_to_human      // ⊘ for med/legal/fin/emerg/welfare/insurance
 → GemmaService              // mlx-swift-lm, 4-bit MLX, 3.97 GB
 → TTSManager                // AVSpeechSynthesizer (Yuna ko / Samantha en)
 → VisemeDriver              // 30 fps, audio-clock synced, 16 PNG swap
 → WonderingLog              // Core Data, SHA-256 dedup, never leaves device
```
See `runs/2026-05-05-spec/spec/SPEC.md` (lock SHA `e5dfadf2c8…314c5`) and `function_call_contract.yaml` for the frozen contract.
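The four-function dispatch above can be illustrated as a decode-and-route step over the model's function-call JSON. The tool names follow the contract; the routing code itself is a hypothetical sketch, not the repo's `FunctionCallOrchestrator.swift`:

```swift
import Foundation

// The four tools from function_call_contract.yaml.
enum SocraticTool: String, Decodable {
    case mode_classify, surface_past_wonder, ask_back, defer_to_human
}

struct FunctionCall: Decodable {
    let name: SocraticTool
    let arguments: [String: String]
}

// Route a decoded call; real handlers would drive TTS and WonderingLog.
func dispatch(_ call: FunctionCall) -> String {
    switch call.name {
    case .mode_classify:       return "mode=\(call.arguments["mode"] ?? "?")"
    case .surface_past_wonder: return "recalling past wonder (Phase 4 stub)"
    case .ask_back:            return call.arguments["question"] ?? "…?"
    case .defer_to_human:      return "⊘ deferring: \(call.arguments["topic"] ?? "sensitive")"
    }
}

let raw = #"{"name": "defer_to_human", "arguments": {"topic": "medical"}}"#
let call = try! JSONDecoder().decode(FunctionCall.self, from: Data(raw.utf8))
print(dispatch(call))  // prints "⊘ deferring: medical"
```

Decoding the tool name into an enum means an unknown function name fails at the decode step rather than falling through silently.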
## Performance
| Metric | Value | Source |
|---|---|---|
| TTFT (time to first token; median, n=10) | 192 ms | `claudedocs/bench/2026-05-06-latency-bench.json` |
| TTFT improvement (PR-Λ) | 4.6 s → 192 ms (24×) | PR #32 |
| Per-turn user-facing latency | ~2 s | decode + STT endpoint + TTS prep |
| Network egress at runtime | 0 bytes | entitlements + Charles proxy log |
| Tests | 65 / 65 passing | `make engine-test` |
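The median-of-n TTFT figure is a standard median over the per-run samples. A sketch with made-up sample values (the real data lives under `claudedocs/bench/`):

```swift
// Median of latency samples, as used for the TTFT row (n = 10 in the bench).
func medianMs(_ samples: [Double]) -> Double {
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid]
}

// Hypothetical samples; not the bench's actual measurements.
let ttft: [Double] = [201, 188, 195, 190, 192, 187, 199, 192, 194, 189]
print(medianMs(ttft))  // 192.0
```

Reporting the median rather than the mean keeps a single slow cold-start run from distorting the headline number.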
## Hackathon submission
This repo is the canonical artifact for The Gemma 4 Good Hackathon. The hosted Kaggle preview at two-weeks-team.github.io/he-was-socrates is a reference implementation of the README — judges should treat the repo as source of truth.
| Rubric axis | Evidence in this repo |
|---|---|
| Impact / Education (primary) | `SystemPrompt.swift` · `function_call_contract.yaml` § `defer_to_human` |
| Technical execution (secondary) | PR #32 PR-Λ · `claudedocs/bench/` · PR #33 Preflight |
| Constrained environments | `HeWasSocrates.entitlements` · App Sandbox + 0-byte egress |
| Function calling | 4-function dispatch · `FunctionCallOrchestrator.swift` |
| Long context (256K) | `surface_past_wonder` over `WonderingLog` (Phase 4 wiring; current: stub for surfacing logic) |
| Configurable thinking mode | `Phase.thinking` soft pulse on bust · `EngineCoordinator.swift` |
## Contributing
See `CONTRIBUTING.md`. TL;DR: feature branches only; Conventional Commits with scopes from `{engine, viseme, audio, gemma, app, scripts, ci, docs, spec}`; run `make ci-local` before pushing; PRs merge with `gh pr merge --merge` (squash is forbidden — history is preserved).
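A commit-subject check matching that convention can be sketched as follows. The regex, the accepted type list, and the helper name are illustrative, not part of the repo's CI:

```swift
import Foundation

// Allowed scopes from CONTRIBUTING.md.
let scopes = ["engine", "viseme", "audio", "gemma", "app", "scripts", "ci", "docs", "spec"]

// Conventional Commits subject shape: type(scope): description
func isValidSubject(_ subject: String) -> Bool {
    let pattern = #"^(feat|fix|docs|refactor|test|chore)\((\#(scopes.joined(separator: "|")))\): .+"#
    return subject.range(of: pattern, options: .regularExpression) != nil
}

print(isValidSubject("feat(engine): add defer_to_human topic guard"))  // prints "true"
print(isValidSubject("update stuff"))                                  // prints "false"
```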
Open issues are labeled `good-first-issue` (asset-pipeline determinism) and `help-wanted` (additional viseme phoneme markers). View open issues →
## License
Code under Apache-2.0. Content (Korean system prompt, halftone aesthetic, 16 viseme PNGs, README copy) under CC-BY-4.0. Gemma 4 weights subject to Google DeepMind's Gemma Terms of Use.
Attribution: Gemma 4 — Google DeepMind. MLX / mlx-swift-lm — Apple ML team. Korean 단정한 평어체 (composed plain-register) system prompt — Two-Weeks-Team (verbatim, frozen invariant).