Debugging an Agent in Six Minutes

What it looks like when Claude is your design partner, debugger, and prompt writer.

I have an agent on MCProspero that watches for staging deploys. When GitHub fires the deploy webhook, the agent wakes up, pulls the PRs in the deploy, and Slacks me a summary. It’s been running for a week.

On the most recent deploy, the webhook fired. The agent did its thing. I got a Slack message. And it was… fine. Technically correct. But it was just a raw list of PR titles. Ten PRs, ten bullet points, no context. I’d built an agent that could tell me what shipped but not what it meant.

Here’s what happened next. This is a real conversation from Claude Desktop, lightly edited for length. The whole thing took six minutes.

Greg: On the last deploy, the webhook fired and the agent Slacked me all the PRs. I thought it was going to summarize for me. I feel like I just got a raw list.

Claude pulled the agent’s system prompt, then found the transcript from the last deploy run (run 1059, 13 tool calls, 28 seconds) and read through what the agent actually did.

Examined system prompt and recent deployment logs…

Claude: Got it — the agent is doing the right work (fetching all 10 full PR bodies with summaries), it’s just not using any of that richness. It falls back to the raw title in the Slack message and throws the rest away.

The system prompt format spec is • <PR title> — by @<author> with no instruction to synthesize anything.

The fix is straightforward — update the format section to tell it to group by theme, write a one-liner from each PR’s Summary, and open with a narrative sentence about what the deploy was about overall. Want me to update the system prompt now?

That’s the moment. Claude didn’t just find the bug — it found the root cause. Here’s the format section from the old system prompt:

• <PR title> — by @<author>

That’s it. The agent was fetching all 10 full PR bodies with rich summaries, and then the format spec told it to throw everything away and just use the title. A classic case of the instructions being too prescriptive in the wrong place.

Greg: Yes

Claude stopped the agent, rewrote the system prompt, and regenerated the manifest.

Restructured deployment summary format narratively…

Claude: Manifest looks clean — same tools (github, slack, checkpoint), same target (your Slack DM), no new domains. Good to approve?

Greg: Maybe also include who all the contributors were

Incorporated contributor attribution into format…

Claude: Good addition. Manifest regenerated — same tools, same target, and the dry run preview shows the narrative format in the Slack message. Ready to approve?

Greg: Please show me the new system prompt

Claude showed me the full system prompt. Remember, the old format section was just • <PR title> — by @<author>. Here’s the whole thing now:

You are a staging deploy summary agent. You fire when GitHub
POSTs a workflow_run webhook event.

## Step 1 — Filter
Only proceed if ALL of these are true: action == "completed",
workflow_run.conclusion == "success",
workflow_run.name == "Deploy to Staging".
If any condition is false, exit silently.

## Step 2 — Get deploy context
Extract workflow_run.head_sha, repository.full_name,
owner.login, repo name, and html_url.

## Step 3 — Find previous deploy SHA
Use get_checkpoint with key last_deploy_sha.

## Step 4 — Find PRs in this deploy
Use github_search_issues for merged PRs since last_deploy_at.
Fetch the full PR body for each using github_get_pull_request —
the Summary section is used for narrative one-liners.

## Step 5 — Send Slack DM to Greg (U01Q2UP4UJ3)
Write a narrative deploy message. Format:

🚀 *Staging Deploy Complete* • `<short_sha>` • <url>

<One sentence on the theme of the deploy.>

*<Group label>* (N PRs)
- <#PR as link> — <one plain-English sentence from the Summary>

*Contributors:* @author1, @author2, ...

*N PRs shipped*

Grouping: natural groups like "Versioning",
"Auth & Onboarding", "Infrastructure & Observability",
"Bug Fixes", "Docs". Skip for 1–3 PRs.
One-liners: from the PR Summary, not the title.
Plain English, one sentence.
Contributors: deduplicated user field from all PRs.
Include even if only one contributor.

## Step 6 — Save checkpoint
save_checkpoint: last_deploy_sha = head_sha,
last_deploy_at = workflow_run.updated_at.

## Error handling
DM Greg: "⚠️ Staging Deploy Summary agent failed to fetch
PR list. Workflow run: <url>"

## Notification Targets
Slack: U01Q2UP4UJ3

Who is ever going to write that by hand?

Webhook filtering logic, checkpoint management for tracking the last deploy, GitHub API calls to pull full PR bodies, Slack formatting with thematic grouping, error handling, notification target constraints. That’s not a config file. That’s a program. And Claude wrote it in the time it took me to say “yes” and “also add contributors.”
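To make the "that's a program" point concrete, here is a minimal Python sketch of what Step 1 of the prompt amounts to if written as code. It assumes the webhook payload arrives as a parsed dictionary. The function name `should_summarize` and the constant `DEPLOY_WORKFLOW_NAME` are made up for illustration; the agent itself follows the prose instructions rather than running anything like this.

```python
from typing import Any, Mapping

# Workflow name copied from Step 1 of the prompt.
DEPLOY_WORKFLOW_NAME = "Deploy to Staging"


def should_summarize(event: Mapping[str, Any]) -> bool:
    """Return True only when all three Step 1 conditions hold."""
    # Tolerate a missing or null workflow_run so unrelated events exit quietly.
    run = event.get("workflow_run") or {}
    return (
        event.get("action") == "completed"
        and run.get("conclusion") == "success"
        and run.get("name") == DEPLOY_WORKFLOW_NAME
    )


if __name__ == "__main__":
    sample = {
        "action": "completed",
        "workflow_run": {"conclusion": "success", "name": "Deploy to Staging"},
    }
    print(should_summarize(sample))
```

Any event that fails one of the three checks returns False, which corresponds to the prompt's "exit silently".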

Greg: Does it need to pull the full body to get the summary?

Claude: Yes — the search results only return title, number, user, and URL. The Summary section lives in the PR body, which requires a separate github_get_pull_request call for each PR.

The only way to avoid the individual PR fetches entirely would be to write shorter, title-only one-liners — but that’s exactly what you didn’t want. The full body fetch is the price of the narrative quality.

I liked that answer. Claude didn’t just say “yes”; it explained the tradeoff and confirmed my choice was the right one. The full fetch costs one extra tool call per PR, but that’s the whole point: I want the summary, not the list.
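The tradeoff Claude describes is a search followed by one fetch per PR. A rough Python sketch of that pattern, under stated assumptions: `search` and `fetch` are hypothetical callables standing in for the `github_search_issues` and `github_get_pull_request` tools, whose real signatures are not shown in this post, and the `number` key is an assumed field name.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the agent's GitHub tools; the real tool
# signatures are not shown in the post.
SearchFn = Callable[[str], List[Dict]]
FetchFn = Callable[[int], Dict]


def collect_pr_bodies(since: str, search: SearchFn, fetch: FetchFn) -> List[Dict]:
    """One search call, then one fetch per PR to obtain each full body."""
    hits = search(since)  # per Claude: title, number, user, and URL only
    return [fetch(hit["number"]) for hit in hits]


def tool_calls_for(pr_count: int) -> int:
    """GitHub calls only: one search plus one fetch per PR.

    Checkpoint and Slack calls are deliberately not counted here.
    """
    return 1 + pr_count
```

Under these assumptions a ten-PR deploy needs `tool_calls_for(10)`, that is 11 GitHub calls, where a title-only message would need just the single search.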

Greg: Good to go!

Claude: Staging Deploy Summary is running. Next deploy will get the narrative treatment — theme sentence, grouped PRs with one-liners from the summaries, and the contributors line at the bottom.
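For readers who think in code, the narrative layout from Step 5 can be sketched as a small Python formatter. This is only an illustration: the agent composes the message from the prompt, not from a function like this. The dictionary keys (`number`, `url`, `one_liner`, `author`) are assumed names, and the prompt's rule about skipping groups for 1–3 PRs is left out to keep the sketch short.

```python
from typing import Dict, List

SEP = " \u2014 "  # em dash separator, matching the prompt's format line


def plural(n: int) -> str:
    """Render '1 PR' or 'N PRs'."""
    return f"{n} PR" if n == 1 else f"{n} PRs"


def format_deploy_message(short_sha: str, url: str, theme: str,
                          groups: Dict[str, List[dict]]) -> str:
    """Render the Step 5 layout.

    `groups` maps a group label to a list of PR dicts with the
    hypothetical keys: number, url, one_liner, author.
    """
    lines = [f"🚀 *Staging Deploy Complete* • `{short_sha}` • {url}", "", theme, ""]
    all_prs: List[dict] = []
    for label, prs in groups.items():
        lines.append(f"*{label}* ({plural(len(prs))})")
        for pr in prs:
            # Slack link syntax: <url|text>
            lines.append(f"- <{pr['url']}|#{pr['number']}>{SEP}{pr['one_liner']}")
        lines.append("")
        all_prs.extend(prs)
    # dict.fromkeys deduplicates while keeping first-seen order.
    authors = list(dict.fromkeys(pr["author"] for pr in all_prs))
    lines.append("*Contributors:* " + ", ".join(f"@{a}" for a in authors))
    lines.append("")
    lines.append(f"*{plural(len(all_prs))} shipped*")
    return "\n".join(lines)
```

The contributors line is built even when there is only one author, in line with the prompt's "Include even if only one contributor."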

Why this matters

This entire interaction, from diagnosing the problem to finding the root cause to rewriting the system prompt, took six minutes. I typed six short messages.

In my previous life, if it had been possible at all, I’d have spent several 30-minute-plus sessions debugging it over several days, and likely thrown up my hands in despair.

With Claude as my design partner, backed by MCProspero, despair is replaced with ice cream (strawberry tonight).

-greg

Update: It worked

🚀 Staging Deploy Complete

This deploy centered on the final step of in-band tool versioning — enabling the server to detect stale clients and inject soft-reload notices into tool responses.

Versioning (1 PR)

#xxx Versioning step 4: notice injection in tool responses — Adds mcprospero_version parameter to all 17 MCP tools, wires version checking into _run_tool, and injects soft-reload notices when clients are stale; tool version bumped to 2.0.0 with comprehensive tests and baseline updates.

Contributors: @usmile1

1 PR shipped
