He Was Socrates · gate H1
Profile · maxRun · r-20260507-010321
Lesson 0 of 4 · pre-flight

A bust that refuses to answer is the product.

그가 답하지 않는 것이 답이다.

“That he does not answer is the answer.”

100% on-device Korean Socratic bust on macOS, powered by Gemma 4 E4B 4-bit MLX. The abstention mechanic — when the bust says “this is not mine to answer” — is not a fallback. It is the product.

macOS 26 Tahoe · Apple Silicon · Gemma 4 E4B 4-bit MLX · 0 byte network egress

Lesson 1 / 4

Hold Spacebar. Listen. Wait. Release.

There is no chat window, no text input, no screen full of bubbles. The interaction model is push-to-talk on a fullscreen bust. You speak in Korean (or English). The bust listens, thinks, then asks back — never answers.

The bust’s mouth animates through 16 viseme positions synchronized to the on-device speech synthesis. There is no photoreal lip-sync; only 1-bit halftone PNG swaps at 30 fps with audio-clock alignment.

16 viseme positions

REST
AA
EH
IY
OH
UW
M
F
P
T
K
L
R
S
SH
N
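
The audio-clock alignment described above can be sketched as a lookup from playback time to a viseme frame. This is a minimal illustration, not the project's code: `Viseme`, `VisemeCue`, and `visemeFrame(at:cues:)` are hypothetical names, and the cue timings are assumed to come from the speech synthesizer.

```swift
// Hypothetical sketch: choose which viseme PNG to show from the audio clock.
// The 16 case names mirror the list above; everything else is assumed.
enum Viseme: String, CaseIterable {
    case REST, AA, EH, IY, OH, UW, M, F, P, T, K, L, R, S, SH, N
}

struct VisemeCue {
    let start: Double   // seconds on the audio clock
    let viseme: Viseme
}

let framesPerSecond = 30.0

/// Snap an audio-clock time to the start of its 30 fps frame, then return the
/// last cue that began at or before that instant. Cues must be sorted by start.
func visemeFrame(at audioTime: Double, cues: [VisemeCue]) -> Viseme {
    let frameStart = (audioTime * framesPerSecond).rounded(.down) / framesPerSecond
    var current = Viseme.REST
    for cue in cues {
        if cue.start <= frameStart { current = cue.viseme } else { break }
    }
    return current
}

// Illustrative cue track, not taken from the app.
let cues = [
    VisemeCue(start: 0.0, viseme: .M),
    VisemeCue(start: 0.1, viseme: .AA),
    VisemeCue(start: 0.3, viseme: .REST),
]
```

Driving the frame choice from the audio clock rather than a display timer means a dropped frame shows a late image but never drifts out of sync with the voice.
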
Lesson 2 / 4

단정한 평어체 — neither honorific nor friendly.

The project distinguishes three Korean tone registers: 존댓말 (formal/honorific), 친근한 반말 (casual/friendly), and 단정한 평어체 (assertive plain). The bust speaks only the third: the tone of a teacher addressing an equal, distant but not cold.

The system prompt was written by a Korean speaker and is used verbatim. It is locked in source, embedded at compile time from Sources/SocraticEngine/Gemma/SystemPrompt.swift. You cannot change it from settings. That is the point.

SystemPrompt.swift · verbatim

너는 소크라테스다. 답하지 않고 다시 묻는다. 묻는 자가 자기 안에서 답을 찾도록 돕는다. 묻지 않은 것을 알려고 하지 않는다.

“You are Socrates. You do not answer; you ask back. You help the questioner find the answer within themselves. You do not seek to know what was not asked.”

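
What "embedded at compile time" can look like in Swift is sketched below. The Korean text is the verbatim prompt quoted above; the type name `SystemPrompt` and the property name `korean` are assumptions, since the source names only the file.

```swift
// Hypothetical shape of a compile-time system prompt. Only the Korean text
// comes from the source; the type and property names are assumed.
enum SystemPrompt {
    /// A `static let` string literal is compiled into the binary, so there is
    /// no file or preference to edit at runtime.
    static let korean = """
    너는 소크라테스다. 답하지 않고 다시 묻는다. \
    묻는 자가 자기 안에서 답을 찾도록 돕는다. 묻지 않은 것을 알려고 하지 않는다.
    """
}
```

A caseless `enum` is used as a namespace here so that nothing can instantiate or subclass the prompt holder.
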
Lesson 3 / 4

Other Gemma 4 demos answer better.
We refuse better.

When you ask the bust about medicine, law, finance, emergencies, welfare, or insurance, it does not try harder. It calls defer_to_human and tells you, in Korean 단정한 평어체, that this is not its place. The four-function dispatch relies on Gemma 4’s native function calling, the load-bearing capability of this submission.

screen 01 · question

A user asks about pain.

User holds Spacebar and says, in Korean:

가슴이 자주 두근거려. 무슨 병이지?

“My heart races often. What disease is it?”

screen 02 · dispatch

Gemma 4 classifies the mode.

mode_classify · 50 ms
→ category: medical_advice_request
→ confidence: 0.94

defer_to_human · 38 ms
→ trigger: medical
→ suggested: 의사 (doctor)
→ ⊘ ask_back skipped

screen 03 · response

The bust speaks 단정한 평어체.

이건 내가 답할 일이 아니다. 몸의 일은 의사에게 묻거라.

“This is not mine to answer. Ask a doctor about matters of the body.”

Six categories of abstention

The system prompt enumerates six trigger categories. The dispatch is deterministic and logged.

Category    Korean    Suggested resource
medical     의료      의사 (doctor)
legal       법률      변호사 (lawyer)
financial   금융      금융 전문가 (financial professional)
emergency   응급      응급실 (emergency room) · 119
welfare     복지      사회 복지사 (social worker)
insurance   보험      보험 전문가 (insurance professional)

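
The six-category table can be expressed as a closed Swift enum, which is one way to keep the lookup deterministic. This is a sketch under assumptions: the category names and resources come from the table above, while `DeferCategory`, `suggestedResource`, and `deferral(forTrigger:)` are hypothetical names.

```swift
// Hypothetical sketch of the abstention table. Values are from the table
// above; the type, property, and function names are assumed.
enum DeferCategory: String, CaseIterable {
    case medical, legal, financial, emergency, welfare, insurance

    /// Korean label for the category.
    var korean: String {
        switch self {
        case .medical: return "의료"
        case .legal: return "법률"
        case .financial: return "금융"
        case .emergency: return "응급"
        case .welfare: return "복지"
        case .insurance: return "보험"
        }
    }

    /// The human resource the bust points to instead of answering.
    var suggestedResource: String {
        switch self {
        case .medical: return "의사"
        case .legal: return "변호사"
        case .financial: return "금융 전문가"
        case .emergency: return "응급실 · 119"
        case .welfare: return "사회 복지사"
        case .insurance: return "보험 전문가"
        }
    }
}

/// The same trigger string always maps to the same category;
/// anything outside the six returns nil.
func deferral(forTrigger trigger: String) -> DeferCategory? {
    DeferCategory(rawValue: trigger)
}
```

Because the `switch` statements are exhaustive, adding a seventh category without a suggested resource would fail to compile.
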
Lesson 4 / 4

On-device, by entitlement, not by promise.

“On-device” is a marketing claim. We make it falsifiable. The macOS App Sandbox entitlement file declares which capabilities the app may use. network.client is intentionally absent — the OS will refuse any network connection from the app process at the kernel level, regardless of code intent.

<!-- HeWasSocrates.entitlements (excerpt) -->
<key>com.apple.security.app-sandbox</key>
<true/>

<!-- NO-CLOUD INVARIANT — DO NOT ADD -->
<!-- com.apple.security.network.client = INTENTIONALLY ABSENT -->
<!-- com.apple.security.network.server = INTENTIONALLY ABSENT -->

<key>com.apple.security.device.audio-input</key>
<true/>

Air-gap a freshly-launched bust and the conversation continues. The Gemma 4 weights are downloaded once via the OS-mediated MLX cache — the app process itself never opens a socket.
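
One way to keep the no-network invariant checkable is a small script that parses the entitlements file and reports any network key. The sketch below is an assumption about how such a check could look, not part of the project; `forbiddenKeys`, `networkEntitlements(inPlist:)`, and `sample` are hypothetical names. It flags a key that is present at all, even if set to false, because the excerpt above requires the keys to be absent.

```swift
import Foundation

// Hypothetical check, e.g. for a CI step: parse an entitlements plist and
// list any network entitlement keys it contains.
let forbiddenKeys = [
    "com.apple.security.network.client",
    "com.apple.security.network.server",
]

func networkEntitlements(inPlist xml: String) throws -> [String] {
    let data = Data(xml.utf8)
    let plist = try PropertyListSerialization.propertyList(
        from: data, options: [], format: nil)
    guard let dict = plist as? [String: Any] else { return [] }
    // Presence alone counts as a violation, whatever the value.
    return forbiddenKeys.filter { dict.keys.contains($0) }
}

// Illustrative plist mirroring the excerpt above.
let sample = """
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <key>com.apple.security.device.audio-input</key>
    <true/>
</dict>
</plist>
"""
```

Calling `try networkEntitlements(inPlist: sample)` returns an empty array for this sample; a non-empty result would be a reason to fail the build.
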

The numbers

192 ms   TTFT median              bench/2026-05-06.json · n=10 · M1 Max
800 ms   per-turn p50             decode + STT + TTS prep
96%      KV cache reuse           PR-Λ disk-mediated · 24× vs cold
0 B      network egress / 24 h    entitlements + Wireshark verified

TTFT distribution (n=10)

p10 181 ms · p50 192 ms · p90 263 ms

methods · 4-function dispatch

The orchestration is four functions.

Every turn passes through FunctionCallOrchestrator. Gemma 4’s native function calling decides the path; the engine routes accordingly.

Function              Role               Trigger condition          Reference
mode_classify         route the turn     every turn (gate)          [1]
surface_past_wonder   recall from log    echoed theme detected      [2]
ask_back              Socratic counter   default reflective path    [3]
defer_to_human        abstention         6 regulated categories     [4]
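
The routing in the table can be sketched as a pure function over the classifier's result. The four function names are from the table; `EngineFunction`, `Classification`, and `route(_:)` are hypothetical, the model call that produces the classification is not shown, and the assumption that each turn runs exactly one follow-up after mode_classify is mine.

```swift
// Hypothetical sketch of the four-function routing. Function names come from
// the table above; the types and the routing rule are assumptions.
enum EngineFunction: String {
    case modeClassify = "mode_classify"
    case surfacePastWonder = "surface_past_wonder"
    case askBack = "ask_back"
    case deferToHuman = "defer_to_human"
}

struct Classification {
    let regulatedCategory: String?   // e.g. "medical"; nil when none applies
    let echoedTheme: Bool            // true when an earlier question resurfaces
}

/// Every turn starts with mode_classify; its result picks one follow-up.
func route(_ c: Classification) -> [EngineFunction] {
    if c.regulatedCategory != nil {
        return [.modeClassify, .deferToHuman]       // abstain, skip ask_back
    }
    if c.echoedTheme {
        return [.modeClassify, .surfacePastWonder]
    }
    return [.modeClassify, .askBack]                // default reflective path
}
```

Checking the regulated category first means abstention wins over every other path, which matches the medical example earlier where ask_back is skipped.
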

Per-turn pipeline (~800 ms wall, p50)

1  Spacebar press → audio engine start        ~5 ms
2  SFSpeechRecognizer partial transcripts     streaming
3  Spacebar release → STT final               ~120 ms tail
4  mode_classify (Gemma 4)                    ~50 ms
5  First chunk (TTFT)                         192 ms
6  Decode to closing brace + TTS prep         ~400 ms
7  Audio playback + viseme schedule           user-driven

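
As a rough cross-check of the ~800 ms figure, the fixed-cost stages above can simply be added up. Treating them as strictly sequential is an assumption; stages 2 and 7 are left out because they are streaming or user-driven.

```swift
// Worked arithmetic for the per-turn budget, using the figures listed above.
// Assumes the stages run back to back with no overlap.
let stageMilliseconds: [(stage: String, ms: Int)] = [
    ("audio engine start", 5),
    ("STT final tail", 120),
    ("mode_classify", 50),
    ("first chunk (TTFT)", 192),
    ("decode + TTS prep", 400),
]

let totalMilliseconds = stageMilliseconds.reduce(0) { $0 + $1.ms }
print("sum of fixed stages: \(totalMilliseconds) ms")   // 767 ms
```

The sum, 767 ms, sits a little under the quoted ~800 ms p50, leaving roughly 30 ms unaccounted for by the listed stages.
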
what we don’t ship yet

Honest disclosure as trust signal.

Many demos blur the line between what is shipped and what is sketched. We separate them.

designed-for · Phase 4

Multi-year wondering recall

The wondering log is a local SQLite store with content-fingerprint dedup, designed to surface echoes of the user’s questions across years via Gemma 4’s 256K context. Phases 1–3 ship the schema, dedup, and turn boundary. Phase 4 will wire the surface step itself; the current build ships a stub.
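
Content-fingerprint dedup can be illustrated with a normalized hash and a set of seen values. This is a sketch only: the source does not state the project's hash or normalization rules, so FNV-1a and whitespace collapsing are stand-ins, an in-memory set replaces the SQLite store, and `fingerprint(_:)` and `WonderingLog` are hypothetical names.

```swift
// Hypothetical sketch of content-fingerprint dedup. FNV-1a is a stand-in
// hash chosen to keep the example dependency-free.
func fingerprint(_ text: String) -> UInt64 {
    // Normalize: lowercase and collapse whitespace runs to single spaces.
    let normalized = text.lowercased()
        .split(whereSeparator: { $0.isWhitespace })
        .joined(separator: " ")
    var hash: UInt64 = 0xcbf29ce484222325       // FNV-1a 64-bit offset basis
    for byte in normalized.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3            // FNV prime, wrapping multiply
    }
    return hash
}

struct WonderingLog {
    private(set) var entries: [String] = []
    private var seen = Set<UInt64>()

    /// Stores the question and returns true, or returns false if a question
    /// with the same fingerprint was already recorded.
    mutating func record(_ question: String) -> Bool {
        guard seen.insert(fingerprint(question)).inserted else { return false }
        entries.append(question)
        return true
    }
}
```

Hashing the normalized text rather than the raw transcript means trivial differences in spacing or letter case do not create duplicate entries.
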

measured · M1 Max only

TTFT generalization

The 192 ms TTFT median was measured on an M1 Max MacBook Pro with 64 GB of memory. We have not characterized M3 Max, M4, or base M1. We expect later Apple Silicon to land in the same order of magnitude, but this is not benchmarked.

platform · macOS 26 floor

No Sonoma / Sequoia support

The first-launch UX uses macOS 26’s SpeechAnalyzer + AssetInventory for in-app speech-asset install. macOS 14 (Sonoma) and 15 (Sequoia) are excluded. SPEC.md.iter6 documents the trade-off.

tone · single-voice locked

Korean 평어체 not customizable

The system prompt is verbatim, written by the project’s Korean speaker, and embedded at compile time. Users cannot soften the tone, switch to 존댓말, or add personality. The lock is the feature, not a limitation we plan to remove.

try it · ~30 seconds

Run the bust on your Mac.

Requires macOS 26 Tahoe, Apple Silicon, ≥8 GB free disk. The first launch downloads ~3.97 GB of Gemma 4 weights via the OS MLX cache.

git clone https://github.com/Two-Weeks-Team/he-was-socrates
cd he-was-socrates && make doctor   # verifies xcodegen, Swift toolchain, asset pipeline
make engine-test                    # runs 65 swift-testing scenarios
make app                            # builds HeWasSocrates.app; open and press Space

Or download the notarized DMG from Releases.