📋 Proposal · Awaiting Verification

WhyC v2 — Team Brief

While they hire, we ship — and the agent panel adjudicates the build. v1 ships in 31 days; v2 is the runtime-level redesign that turns WhyC from "another vibe-coding tool" into a 13-sub-agent panel that converges on a build via structured adjudication.

Hackathon · Google Cloud Rapid Agent Track · Arize
Deadline · 2026-06-11 14:00 PT · D-30
Credit · ✅ requested · redeem by 2026-06-04
Sub-agents in v2          13        vs 0 in v1
Score projection          89–94     +17–22 over v1
Cost / 12-run demo        ~$37      of $100 credit (37%)
GCP + Phoenix features    9 + 5     vs 4 + 1 in v1


One-page summary of the v2 architecture. Full detail lives in architecture-v2-pdd-on-runtime.md. This file is for sharing with teammates in chat / DM / PR before the verification meeting.


The Hero

While they hire, we ship — and the agent panel adjudicates the build.

WhyC is a satirical counter-product for VC-backed YC teams that take six months to ship what an agent can produce in a day, and a practical fast-POC accelerator for any founder. v1 ships in 31 days; v2 is the runtime-level redesign that turns WhyC from "another vibe-coding tool" into a 13-sub-agent panel that converges on a build via structured adjudication.

                                         v1 (current)          v2 (proposed)
Per-stage perspectives                   1 (single LLM call)   3 / 5 / 5 (analyze / develop / judge)
Total sub-agents                         0                     13
Diversity validation                     None                  I2 Jaccard + structural hash
Learning across runs                     None                  BigQuery queries past outcomes
GCP features used                        4                     9
Phoenix features used                    1                     5
Differentiation vs Bolt / Lovable / v0   Weak                  Structurally unprecedented
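The brief doesn't define the I2 check beyond "Jaccard + structural hash." A minimal sketch of how Stage 3 could screen near-duplicate candidate builds — the function names and the 0.85 threshold are illustrative assumptions, not the actual implementation:

```python
import hashlib


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-set Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structural_hash(file_paths: list[str]) -> str:
    """Hash of the sorted file tree, so two candidates with the same
    layout collide even when file contents differ."""
    return hashlib.sha256("\n".join(sorted(file_paths)).encode()).hexdigest()


def is_duplicate(cand_tokens, cand_paths, accepted, threshold=0.85):
    """Reject a candidate whose token overlap or file layout matches an
    already-accepted candidate (accepted = list of (tokens, paths))."""
    for tokens, paths in accepted:
        if structural_hash(cand_paths) == structural_hash(paths):
            return True
        if jaccard(cand_tokens, tokens) >= threshold:
            return True
    return False
```

Validating diversity this way is cheap relative to a Pro call, so it can run before the cross-pick rather than after.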

The Pipeline (stages 0–7)

┌─────────────────────────────────────────────────────────────────────────┐
│  Stage 0  pre-flight                                                    │
│            URL → sanitize → content_sha256 cache                        │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 1  analyze              3 advocate analyzers (Flash)             │
│                                  → synthesis (Pro)                      │
│                                  → 1 ProductSpec with provenance        │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 2  go / no-go           6 rules + Vertex AI Eval IP-safety       │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 3  develop              5 advocate developers (Pro)              │
│                                  → I2 dedup                             │
│                                  → cross-pick winner                    │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 4  deploy               Cloud Build → Cloud Run (real)           │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 5  judge                5 specialist critics (Pro)               │
│                                  → meta-tally spec_fit                  │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 6  introspect           Phoenix MCP self-query                   │
│                                  → trace summary + experiment compare   │
├─────────────────────────────────────────────────────────────────────────┤
│  Stage 7  self-improve         judge + trace + BigQuery learning        │
│                                  → converge | regen | ceiling           │
└─────────────────────────────────────────────────────────────────────────┘
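Stage 0's cache step is only named in the diagram. A sketch of one way the sanitize → content_sha256 key could work, stdlib only — the exact normalization rules are an assumption:

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(raw: str) -> str:
    """Normalize scheme/host casing and drop fragments so equivalent
    URLs map to the same cache entry."""
    parts = urlsplit(raw.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


def cache_key(page_content: bytes) -> str:
    """content_sha256: key on what the page says, not where it lives,
    so a re-crawl of unchanged content is a cache hit."""
    return hashlib.sha256(page_content).hexdigest()
```

Keying on content rather than URL means a company page that hasn't changed between runs never re-triggers the Stage 1 analyzers.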

Three loops, not one:

  1. Within-iteration loop — 5 developers compete, 5 critics judge, winner picked
  2. Across-iteration loop — judge spec_fit + trace introspection decide regen
  3. Across-run loop — BigQuery accumulates outcomes, future runs query history
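The across-iteration decision (Stage 7's converge | regen | ceiling) can be sketched as a pure function. The 0.9 bar, 5-iteration cap, and plateau margin below are illustrative guesses, not tuned values:

```python
def next_action(spec_fit: float, history: list[float],
                converge_at: float = 0.9, max_iters: int = 5,
                min_gain: float = 0.02) -> str:
    """Decide what Stage 7 does after a judged iteration.

    spec_fit -- meta-tallied score for the current iteration
    history  -- spec_fit of all earlier iterations in this run
    """
    if spec_fit >= converge_at:
        return "converge"
    if len(history) + 1 >= max_iters:
        return "ceiling"                 # iteration budget exhausted
    if history and spec_fit - history[-1] < min_gain:
        return "ceiling"                 # plateau: regen unlikely to help
    return "regen"
```

Making the rule a pure function keeps it trivially unit-testable and lets the across-run loop tune the thresholds from BigQuery history later.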

Why This Wins (4 scoring axes, 25 pts each)

Axis                  v1 estimate   v2 estimate   What changed
Tech Implementation   17            23–24         Agent Builder + Vertex Eval + BigQuery learning + 5 Phoenix features
Design                18            21–23         5 design lenses → the adjudicated winner is the consensus by construction
Potential Impact      18            21–22         Learning loop demonstrates "agent gets smarter run by run"
Quality of Idea       19            24–25         PDD-on-Runtime is structurally unprecedented in the gallery
TOTAL / 100           ~72           ~89–94        +17–22 points

The Quality of Idea axis is the biggest swing. v1 in the gallery reads as "another AI builds an app." v2 reads as "an agent panel structurally adjudicates the build" — judges have not seen this pattern.


What It Costs

Per converged run (3-iteration average):  ~$3.12
12 demo dataset runs:                     ~$37
Buffer for retries + experiments:         ~$25
─────────────────────────────────────────────────
TOTAL projected:                          ~$62 of $100 credit (62%)
Margin remaining:                         ~$38 (38%)
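As a quick check, the table's figures reproduce from the per-run cost (values copied from the brief; rounding explains the ~$37 / ~$62):

```python
per_run = 3.12       # converged run, 3-iteration average
demo_runs = 12
buffer = 25.0        # retries + experiments
credit = 100.0

demo_cost = per_run * demo_runs   # $37.44, shown as ~$37
total = demo_cost + buffer        # $62.44, shown as ~$62
margin = credit - total

print(f"demo ${demo_cost:.2f} | total ${total:.2f} | "
      f"margin ${margin:.2f} ({100 * margin / credit:.0f}%)")
```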

We are well inside the $100 credit. The retry budget is generous: even with worst-case retries at every stage, the projection stays under $80.


How Long (D-30 → D-0)

Week  Window        Work
WK1   D-30 → D-23   Stage 1 multi-analyzer · Stage 3 multi-developer · Stage 5 5-critic · BigQuery schema · retry framework. Credit redeems this week (deadline 2026-06-04).
WK2   D-22 → D-16   Stage 4 real Cloud Build + deploy · Stage 2 Vertex Eval · context-preservation tests · DRY_RUN E2E
WK3   D-15 → D-9    YC scraper · 12 verified companies · learning loop runs 10× into BigQuery · video script
WK4   D-8 → D-3     Agent Builder console screenshots · video recorded · README badges · Devpost description
WK5   D-2 → D-0     Final rehearsal · submit D-1 (2026-06-10) with 1h buffer

Three Things to Verify Before We Build

Per architecture-v2-pdd-on-runtime.md §11 — we walk through these together before any v2 code lands:

  1. Agent Builder console actually supports the sub-agent registration pattern we describe. If it doesn't, the 13-sub-agent structure has to be implemented via direct Vertex AI SDK calls (which works, but loses one of the GCP feature signals).
  2. Current Gemini pricing matches our $3.12/run projection. Flash and Pro rates may have changed since the project started — re-check against the console.
  3. BigQuery free tier covers the per-run insert volume. Conservative estimate is ~50 rows per run × 100 runs = 5K rows/month, well within the free tier — but confirm before wiring.
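What "future runs query history" could look like, sketched over in-memory rows shaped like the BigQuery table — the `advocate` personas and the `spec_fit` column are assumed names, not the real schema:

```python
from collections import defaultdict

# Hypothetical outcome rows, standing in for the ~50 rows/run the
# learning loop would insert into BigQuery.
rows = [
    {"stage": "develop", "advocate": "minimalist", "spec_fit": 0.91},
    {"stage": "develop", "advocate": "minimalist", "spec_fit": 0.87},
    {"stage": "develop", "advocate": "maximalist", "spec_fit": 0.74},
]


def advocate_averages(rows):
    """Aggregate past spec_fit per developer persona -- the signal a
    future run could use to bias which advocates it leans on."""
    scores = defaultdict(list)
    for r in rows:
        scores[r["advocate"]].append(r["spec_fit"])
    return {k: round(sum(v) / len(v), 2) for k, v in scores.items()}


print(advocate_averages(rows))  # {'minimalist': 0.89, 'maximalist': 0.74}
```

The real loop would express this as a GROUP BY over the BigQuery table; the aggregation logic is the same.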

If all three pass → architecture-v2-locked.md is created and implementation begins. If any fail → the degraded path is documented and locked.


What's NOT in v2

These were considered and explicitly held back because they don't move scoring within the hackathon window:


Operational Notes



Status

📋 Proposal — awaiting verification. No v2 code has been written. The v1 pipeline (analyze · go-no-go · develop · deploy · judge · introspect · self-improve) is live, typechecked, and builds clean across apps/api · apps/web · apps/jobs. v1 deferred items become v2's expansion points.

When the team has read this brief and the verification points clear, the implementation order is:

  1. BigQuery schema + retry framework (foundation)
  2. Stage 1 multi-analyzer (lowest-risk multi-advocate stage to validate the pattern)
  3. Stage 3 multi-developer (highest-impact)
  4. Stage 5 5-critic (highest-cost; validate against budget before committing)
  5. Stage 6 Phoenix MCP extensions
  6. Stage 7 BigQuery learning
  7. Stage 4 real Cloud Build + Cloud Run deploy
  8. End-to-end DRY_RUN test
  9. Real dataset (WK3 scrape) + final tuning