ShortFlix · perf
live · region asia-northeast3 · p95 last 5min

Cold-Start to First Card · target ≤ 1500 ms

412 ms · PWA shell paint · CDN edge
1.18 s · orchestrator → first card · p50
1.94 s · orchestrator → first card · p95

Multi-agent flame · single trace

orchestrator · ADK · Cloud Run · 1180 ms
curator-agent · gemini-2.0-flash · embed taste · 450 ms
unified-search-agent · MCP fanout · 730 ms
mcp/rapidapi-yt-shorts · 450 ms
mcp/rapidapi-ig-reels · 510 ms
mcp/rapidapi-tiktok · 430 ms
trend-safety-agent · vertex-search ground · 220 ms
re-rank · novelty × diversity · 160 ms
Legend: orchestrator (ADK) · curator (Gemini) · search (Gemini+MCP) · MCP tool call (RapidAPI) · trend-safety + grounding
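
The span durations above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the trace does not say which spans overlap, so the parallel/serial layout (curator and search start together, then trend-safety, then re-rank) is an assumption, as is the idea that all four agent spans nest inside the orchestrator span. The function name `critical_path_ms` is hypothetical.

```python
# Span durations (ms) copied from the single trace above.
SPANS = {
    "orchestrator": 1180,
    "curator-agent": 450,
    "unified-search-agent": 730,
    "trend-safety-agent": 220,
    "re-rank": 160,
}
MCP_CALLS = {"yt-shorts": 450, "ig-reels": 510, "tiktok": 430}


def critical_path_ms(parallel_stages, serial_stages):
    """Wall time if parallel_stages start together and serial_stages follow.

    ASSUMPTION: this layout is a guess; the trace does not state it.
    """
    return max(parallel_stages) + sum(serial_stages)


# If every child span ran back to back, the total would exceed the parent.
serial_total = sum(v for k, v in SPANS.items() if k != "orchestrator")

# Assumed layout: curator and search in parallel, then safety, then re-rank.
assumed_path = critical_path_ms(
    [SPANS["curator-agent"], SPANS["unified-search-agent"]],
    [SPANS["trend-safety-agent"], SPANS["re-rank"]],
)
unaccounted = SPANS["orchestrator"] - assumed_path

# MCP fanout: slowest call bounds the wall time when calls run concurrently.
fanout_parallel = max(MCP_CALLS.values())
fanout_serial = sum(MCP_CALLS.values())
search_overhead = SPANS["unified-search-agent"] - fanout_parallel

print(f"children back to back: {serial_total} ms")       # 1560 ms
print(f"assumed critical path: {assumed_path} ms")       # 1110 ms
print(f"left over in orchestrator: {unaccounted} ms")    # 70 ms
print(f"MCP fanout parallel vs serial: {fanout_parallel} / {fanout_serial} ms")  # 510 / 1390 ms
print(f"search-agent time outside MCP: {search_overhead} ms")  # 220 ms
```

Under these assumptions the children sum to 1560 ms if run back to back, more than the 1180 ms orchestrator span, so some overlap has to exist; the assumed layout accounts for 1110 ms and leaves about 70 ms of orchestrator overhead.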

Latency heat-map · last 60 min × 24 cols

−60m · −30m · now

p50/p95 · 24h sparkline

p50 · p95

Deploy / region

runtime · Cloud Run · min=1 max=4
region · asia-northeast3 · Seoul
cdn · Cloud CDN + Vercel edge
model · gemini-2.0-flash · streaming

Perf budget vs actual

metric · actual / budget
FCP · 412 ms / 600 ms
first card (p50) · 1180 ms / 1500 ms
first card (p95) · 1940 ms / 1500 ms
cold cache hit-rate · 71% / 60%
$ / 1k requests · $0.084 / $0.10
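
A budget table like this one can be checked mechanically. The following is a minimal sketch, not the dashboard's actual logic: the `BudgetRow` type is hypothetical, and treating cache hit-rate as "higher is better" while every other row is "lower is better" is an assumption, since the panel does not state the direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetRow:
    """One row of the budget panel (hypothetical type for illustration)."""
    name: str
    actual: float
    budget: float
    higher_is_better: bool = False  # ASSUMPTION: only hit-rate sets this

    @property
    def ok(self) -> bool:
        if self.higher_is_better:
            return self.actual >= self.budget
        return self.actual <= self.budget


# Values copied from the panel above (ms, ms, ms, percent, dollars).
ROWS = [
    BudgetRow("FCP", 412, 600),
    BudgetRow("first card (p50)", 1180, 1500),
    BudgetRow("first card (p95)", 1940, 1500),
    BudgetRow("cold cache hit-rate", 71, 60, higher_is_better=True),
    BudgetRow("$ / 1k requests", 0.084, 0.10),
]

failing = [row for row in ROWS if not row.ok]
for row in failing:
    over = row.actual - row.budget
    print(f"{row.name}: over budget by {over:g} ({over / row.budget:.1%})")
# first card (p95): over budget by 440 (29.3%)
```

With these directions, the only row that misses is first card (p95), which is 440 ms over the 1500 ms target.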

Why this matters for the demo · judges feel speed before they read code

The 1-2 minute video opens with a stopwatch overlay: tap → first curated card in 1.18 s (p50). Streaming Gemini Flash, parallel MCP fanout, and an edge-CDN PWA shell make a multi-agent system feel like a single fast app. A single-agent baseline measured in the same harness takes 3.1 s (no parallel fanout, no curator cache). That gap, 3.1 s against 1.18 s or roughly 2.6× faster, is the innovation proof on screen.
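
To make the parallel-fanout point concrete, here is a small simulation using Python's `asyncio.gather`. It is a sketch under stated assumptions, not the project's code: `call_tool`, `fan_out`, and `one_by_one` are hypothetical names, the tool calls are simulated with `asyncio.sleep` rather than a real MCP client, and the latencies are the three MCP figures from the trace, scaled down 100× so the demo finishes quickly.

```python
import asyncio
import time

# Latencies (ms) taken from the trace; the calls themselves are SIMULATED.
SIMULATED_MS = {
    "mcp/rapidapi-yt-shorts": 450,
    "mcp/rapidapi-ig-reels": 510,
    "mcp/rapidapi-tiktok": 430,
}
SCALE = 0.01  # run 100x faster than real time


async def call_tool(name: str, latency_ms: int) -> tuple[str, int]:
    """Hypothetical stand-in for one MCP tool call; not a real client API."""
    await asyncio.sleep(latency_ms / 1000 * SCALE)
    return name, latency_ms


async def fan_out(tools: dict[str, int]) -> list[tuple[str, int]]:
    # gather() runs all calls concurrently and returns results in input order.
    return list(await asyncio.gather(*(call_tool(n, ms) for n, ms in tools.items())))


async def one_by_one(tools: dict[str, int]) -> list[tuple[str, int]]:
    # Baseline: each call waits for the previous one to finish.
    return [await call_tool(n, ms) for n, ms in tools.items()]


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


parallel_result, parallel_s = timed(fan_out(SIMULATED_MS))
serial_result, serial_s = timed(one_by_one(SIMULATED_MS))

print(f"parallel: {parallel_s / SCALE * 1000:.0f} ms (scaled back to real time)")
print(f"serial:   {serial_s / SCALE * 1000:.0f} ms (scaled back to real time)")
```

Both paths return the same results; the parallel one should land near the slowest call (510 ms) and the serial one near the sum (1390 ms), with some scheduling noise on top.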