Hello.

A sandbox for agent skills that produce reviewable artifacts.

When an agent finishes a coding task, human review usually falls back to one of two things: reading the raw diff, or rubber-stamping it because the diff is too large to read. Neither preserves real human judgment in the loop.

The skills iterated on in this repo replace that surface with per-change interactive artifacts — narrative walkthroughs of what changed and why, with live before/after data, written to be legible to PMs and QA, not just engineers reading code.

View types an artifact can compose

A single feature diff can be broken into pieces shaped for different readers and different questions. Below: the view types these skills currently produce, each with a tiny example.

TL;DR header

A one-sentence summary up top, before any code. The reader who only opens the page for ten seconds still leaves with the gist.

Feature Review · 17

Group the inbox by sender

Conversations from the same sender now collapse into one row. Click a row to expand. Default state matches old behavior.

Inline highlight

Mark the words a reviewer should not skim past. Used sparingly — highlighting everything is highlighting nothing.

The change adds a single field to the request payload. Existing clients keep working because the new field is optional and defaulted to null. The only behavior shift happens when the field is present and non-empty.
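
As a minimal sketch of the rule described above, assuming a hypothetical payload whose new field is called groupBy (the sample does not name the field, and query is likewise a placeholder), the optional-and-defaulted behavior might look like this:

```typescript
// Hypothetical payload shape: `groupBy` stands in for "the new field".
interface RequestPayload {
  query: string;
  groupBy?: string | null; // optional, so existing clients can omit it
}

interface NormalizedPayload {
  query: string;
  groupBy: string | null; // always present after normalization
}

// Fill in the default so downstream code never sees `undefined`.
function normalize(payload: RequestPayload): NormalizedPayload {
  return { query: payload.query, groupBy: payload.groupBy ?? null };
}

// The behavior shift applies only when the field is present and non-empty.
function usesNewBehavior(payload: RequestPayload): boolean {
  const { groupBy } = normalize(payload);
  return groupBy !== null && groupBy.length > 0;
}
```

An omitted field, an explicit null, and an empty string all take the old path; only a non-empty value opts in.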

Numbered step

Progressive disclosure. The reviewer opens only the steps that apply to them: a PM might read step 1, while an engineer skips to step 3.

Annotated code diff

A small diff with the prose annotation right next to it — not above, not in a separate file.

src/server/handler.ts
 function handle(req, res) {
-  const id = req.params.id;
+  const id = parseId(req.params.id);
   return load(id);
 }

parseId throws on malformed input. The old code would have 500’d on the database call instead, with a less useful stack.
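
The sample does not show parseId itself. One possible shape, assuming ids are non-empty lowercase alphanumeric strings like "abc123" (the pattern and the error class are illustrative assumptions, not taken from the sample), is:

```typescript
// Assumed id format: lowercase letters and digits only, at least one character.
const ID_PATTERN = /^[a-z0-9]+$/;

// Hypothetical error type, so the failure names the bad input directly.
class MalformedIdError extends Error {
  constructor(raw: unknown) {
    super(`Malformed id: ${JSON.stringify(raw)}`);
    this.name = "MalformedIdError";
  }
}

// Reject bad input at the boundary, before any database call is attempted.
function parseId(raw: unknown): string {
  if (typeof raw !== "string" || !ID_PATTERN.test(raw)) {
    throw new MalformedIdError(raw);
  }
  return raw;
}
```

Failing here means the stack trace points at the request parameter rather than at a query that happened to receive it.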

Before / after

Side-by-side data shape. A schema change becomes visible at a glance without reading the migration.

Before
{
  "id": "abc123",
  "title": "Demo task"
}

After
{
  "id": "abc123",
  "title": "Demo task",
  "status": "open"
}
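
The same before/after pair can be expressed as types. This is a sketch only: the names TaskBefore, TaskAfter and migrateTask are hypothetical, the "closed" status is an assumed second value, and backfilling existing records with "open" is an assumption suggested by the example rather than stated by it.

```typescript
// Shape before the change.
interface TaskBefore {
  id: string;
  title: string;
}

// Only "open" appears in the example; "closed" is an assumed second value.
type TaskStatus = "open" | "closed";

// Shape after the change: same fields plus a required status.
interface TaskAfter extends TaskBefore {
  status: TaskStatus;
}

// Assumed backfill rule: existing records become "open".
function migrateTask(task: TaskBefore): TaskAfter {
  return { ...task, status: "open" };
}
```

Calling migrateTask on the Before object yields the After object, which gives the reviewer a single place to check the backfill rule.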

Audience tiers

Same change, two layers of explanation. PM and engineer read the same artifact and find their own depth.

PM tier

The list now loads about three times faster on big accounts. No visual change — same rows, same order.

Engineer tier

Replaced N+1 fetch with a join + group-by; added covering index on (account_id, created_at). p95 600ms → 180ms.

Conversational reveal

Chunks reveal one at a time, paced by the reviewer. A side rail shows where you are without spoiling what’s next.

2. Where the new check lives

The validator now runs once at the request boundary instead of inside each handler. Handlers stay shorter and a forgotten call site can’t bypass it.
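
One way to get that guarantee is to run the check inside the dispatcher, so handlers never call it themselves. The sketch below is an assumption about how such a boundary could look: Router, Validator and requireObjectBody are illustrative names, not code from the sample.

```typescript
interface Req { body: unknown; }
interface Res { status: number; body: string; }
type Handler = (req: Req) => Res;
// Returns an error message, or null when the request is acceptable.
type Validator = (req: Req) => string | null;

// Hypothetical check: the body must be a non-null object.
const requireObjectBody: Validator = (req) =>
  typeof req.body === "object" && req.body !== null ? null : "body must be an object";

class Router {
  private routes = new Map<string, Handler>();
  private validate: Validator;

  constructor(validate: Validator) {
    this.validate = validate;
  }

  register(path: string, handler: Handler): void {
    this.routes.set(path, handler);
  }

  // Validation happens here, once, for every registered route.
  dispatch(path: string, req: Req): Res {
    const handler = this.routes.get(path);
    if (!handler) return { status: 404, body: "not found" };
    const error = this.validate(req);
    if (error !== null) return { status: 400, body: error }; // handler never runs
    return handler(req);
  }
}
```

Because handlers are only reachable through dispatch, a newly registered route gets the check without its author remembering to add it.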

Show me everything · Continue on to 3. Rollout

What to double-check

A short, specific punch list. Reviewer-facing, not engineer-facing — phrased as things to do, not things to know.

  • Send a request that omits the new field — confirm it still works.
  • Look for callers in older modules that might still hit the old path.
  • If load looks the same after deploy, check the cache hit rate didn’t drop.

Files touched

The engineer-tier ‘what was actually edited’ table. Optional reading, last in the artifact, for the curious.

File                 What happened
src/api/router.ts    New route registered
src/lib/auth.ts      Validator factored out of two callers
tests/auth.test.ts   Three new cases for the boundary check

Action footer

The reviewer’s exit point. Same affordance as a code-review tool, attached to a narrative artifact instead of a diff.

Review artifact · sample

Planning views

The other skill is forward-looking — a planning walk before the code is written. Different palette: paper, pen, mentor voice.

Before you reach for state, ask: what’s the smallest piece of this that has to be remembered between requests?

— Mentor prompt

Sketch this on paper before continuing:

Draw the boxes that hold state. Draw an arrow into each box showing what writes to it. If a box has more than one arrow in, circle it.

  request ──▶ [ validator ] ──▶ [ handler ] ──▶ store
                  │
                  └── on fail ──▶ 4xx (no store write)