Claude Code / Voice of Customer Report V0
Cover · take-home submission

Daniel,

You said I'd have fun with this. You were right. It clicked the moment I read it as a product problem. Two questions framed the build:

  1. How can agents increase the actionability and routing of customer feedback?
  2. How does building on Anthropic's agent infrastructure change the answer?

So I approached it as a bet: could I build the agent infrastructure to deliver the assignment while clearing the bar other candidates are setting? The v0 you're reading is the result. Eager to hear your read.

Here's what you'll find on this page:

Assignment | Section | What's there
D1 · Categorized issue tracker | §03 Tracker | 1,000 issues prioritized and categorized by actionability and impact
D2 · Themes synthesis | §02 Themes | 12 stack-ranked themes, framed as unmet needs
D3 · Prioritization method | §01 Primer | Dataset, classification criteria, and the prioritization framework
D4 · Comms strategy | §04 Comms | Responder Agent MVP, built as a Claude skill
D5 · Internal validation plan | §05 Validation | Stakeholder discovery plan
D6 · Evergreen program proposal | §06 Evergreen | Antenna, proposal and landing page for the internal agentic product

The report is written for the people who'd actually use it: the Claude Code team and the researchers downstream. §06 is the only place the assignment shows up again, because the system that built this report is the proposal. Agents did the work. Point the same fleet at additional feedback sources, customer context, and product telemetry to widen the aperture and sharpen the signal on what's most actionable. That's the product.

Michael Nguyen / May 24, 2026

Claude Code Voice of Customer Report V0

The Claude Code team receives thousands of GitHub issues. The bottleneck isn't the signal. It's getting the signal to the right owner fast enough that it changes what ships.

This report is a stack-ranked read of 1,000 open issues, organized into 12 unmet user needs, 8 of them critical or high priority. The themes are the executive read. The tracker beneath is the per-issue operating layer, filterable by routing team and priority, with a JSON dump for downstream automation. Every theme cites its source. Every issue number links to GitHub.

PMs and EMs: §02 Themes is the brief. Engineering leads: §03 Tracker filtered to your team is what's yours. Researchers: Theme 9 ("Model drifts from instructions") is the frontier-trajectory candidate. Sixty seconds: the top-8 cards in §02.

01 · Primer

The primer.

Dataset

We analyzed 1,000 issues from Claude Code's official GitHub repository: the highest-engagement open issues from the last 60 days, ranked on reactions, comments, and recency.

Classification criteria

The goal was to understand accountability, actionability, and impact: three orthogonal axes. We scored every issue against these classification criteria to inform prioritization.

Accountable

Which team owns it?

Identifies which team and decision maker is best placed to address the customer need.

Surface: TUI · Cowork-or-Sandbox · Agent-View · Tool-Extension-Contract · Skills-or-Plugins · IDE · Auth · Model-Behavior
Routing team: Harness · Cowork · TUI/UX · Connectors-MCP · IDE-Integration · Auth & Billing · Cost & Quota · Async-Delegation · Model-Behavior · Docs

Actionable

How clear is the work?

The ability to reproduce the issue, or understand the user's unmet need, is critical for taking action.

Score: High · Medium · Low
Repro quality: clear-repro · partial-repro · no-repro · feature-request · signal-too-thin
Signal quality: coherent-narrative · partial-context · fragmentary · confused-ask

Impact

How big and how durable?

Impact combines the user reach of the reported issue with how likely a solution is to stay durable as frontier models improve. Note: reach is estimated from GitHub data and engagement only.

Score: High · Medium · Low
User reach: dominant · widespread · segment · isolated
Frontier disposition: fix-now · frontier-candidate · insufficient-data
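
One way to picture a classified issue is as a record with one field per sub-axis. The sketch below is illustrative rather than the tracker's actual schema: the allowed values are copied from the lists above, while the field names, the `validate` helper, and the example record are assumptions made for the example.

```python
# Allowed values are copied from the classification criteria above.
# The field names are hypothetical; the real tracker schema may differ.
ALLOWED = {
    "surface": {
        "TUI", "Cowork-or-Sandbox", "Agent-View", "Tool-Extension-Contract",
        "Skills-or-Plugins", "IDE", "Auth", "Model-Behavior",
    },
    "routing_team": {
        "Harness", "Cowork", "TUI/UX", "Connectors-MCP", "IDE-Integration",
        "Auth & Billing", "Cost & Quota", "Async-Delegation", "Model-Behavior", "Docs",
    },
    "actionable": {"High", "Medium", "Low"},
    "repro_quality": {
        "clear-repro", "partial-repro", "no-repro", "feature-request", "signal-too-thin",
    },
    "signal_quality": {
        "coherent-narrative", "partial-context", "fragmentary", "confused-ask",
    },
    "impact": {"High", "Medium", "Low"},
    "user_reach": {"dominant", "widespread", "segment", "isolated"},
    "frontier_disposition": {"fix-now", "frontier-candidate", "insufficient-data"},
}


def validate(record: dict) -> list:
    """Return a list of problems; an empty list means every field is allowed."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not an allowed value")
    return problems


# A made-up record for illustration, not a real issue from the tracker.
example = {
    "surface": "TUI",
    "routing_team": "TUI/UX",
    "actionable": "High",
    "repro_quality": "clear-repro",
    "signal_quality": "coherent-narrative",
    "impact": "High",
    "user_reach": "widespread",
    "frontier_disposition": "fix-now",
}

print(validate(example))  # prints []
```

Keeping the vocabulary closed like this is what would let a downstream agent reject a malformed classification instead of silently routing it.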

Priority framework

The classification criteria inform the Accountable, Actionable, and Impact scores that build the matrix below. High impact combined with high actionability is the bar for the top tier; priority is derived deterministically from the cells.

                    | Impact = High | Impact = Medium | Impact = Low
Actionable = High   | Critical | High | Medium
Actionable = Medium | High | Medium | Low
Actionable = Low    | Defer-low-actionability: clarification reply, no human triage (all impact levels)
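
Because priority is a pure lookup on two scores, the matrix can be written as a few lines of code. This is a minimal sketch under the assumption that the Actionable = Low row applies at every impact level; the function and constant names are hypothetical.

```python
# Cells copied from the priority matrix: (actionable, impact) -> priority.
PRIORITY_MATRIX = {
    ("High", "High"): "Critical",
    ("High", "Medium"): "High",
    ("High", "Low"): "Medium",
    ("Medium", "High"): "High",
    ("Medium", "Medium"): "Medium",
    ("Medium", "Low"): "Low",
}


def priority(actionable: str, impact: str) -> str:
    """Derive the priority tier from the Actionable and Impact scores."""
    if impact not in ("High", "Medium", "Low"):
        raise ValueError(f"unknown impact score: {impact!r}")
    if actionable == "Low":
        # Assumption: low actionability defers regardless of impact.
        return "Defer-low-actionability"
    if (actionable, impact) not in PRIORITY_MATRIX:
        raise ValueError(f"unknown actionable score: {actionable!r}")
    return PRIORITY_MATRIX[(actionable, impact)]


print(priority("High", "High"))    # prints Critical
print(priority("Medium", "High"))  # prints High
print(priority("Low", "High"))     # prints Defer-low-actionability
```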

Thematic analysis

A bottom-up thematic analysis across the full set of 1,000 issues produced 12 themes. Each theme is framed as an unmet user need, whether the underlying issue is a bug, a feature request, or miscellaneous feedback.

1,000 issues · 12 themes · ranked by share
Theme | Unmet need | Representative quote | Share
Small things grind | Users want the small things to work: font sizing in Code tab, scrollback in TUI, diff review in VS Code, Scala LSP, missing Create-PR button. | "In the Claude Code CLI (TUI mode), older messages become invisible and cannot be scrolled back to, even before context compression kicks in." #28077 | 15.0%
Quota observability gap | Users want to see what's burning their Max quota before it dies. | "The 5-hour session window on Claude Max plan is being exhausted abnormally fast. With the exact same workload as previous days, the limit is now hit within 1-2 hours." #38335 | 13.3%
Sandbox + install fragility | Users want the workspace to start, the install to complete, and the sandbox to respect their allowlist. | "When opening Cowork in Claude Desktop, the workspace fails to start: 'VM service not running.' The error persists even after rebooting the computer." #27801 | 12.1%
Connectors lie about tools | Users want the tools their connector says it provides to actually be usable. | "When the GitHub Connector is enabled in Claude Desktop, Claude does not recognize or have access to any GitHub-related tools. Even after restarting the app." #32479 | 10.8%
Harness breaks the contract | Users want the mode they selected to behave like the mode they selected. | "The per-turn tool call limit has been significantly reduced without announcement. Sessions that ran 60-80+ tool calls now hit the limit after ~20." #33969 | 9.0%
Permissions don't match approval | Users want their allow-rules and bypass settings to actually skip prompts. | "When Claude Code generates a compound bash command using &&, the permission approval prompt incorrectly identifies cd as the action requiring approval." #28240 | 7.2%
Daily-driver parity | Users want their sessions, history, windows and settings to follow them across CLI, desktop and mobile. | "Claude Code Desktop should support opening multiple windows within a single app instance. The only workaround launches a second instance, doubling memory usage." #30154 | 6.5%
Delegation drops work | Users want to delegate work and walk away, then pick it up on phone or desktop. | "Remote Control silently drops the connection to the Claude iOS app during long sessions. The mobile client becomes completely unresponsive." #34255 | 6.4%
Auth front door broken | Users want to sign in, upgrade, or stay subscribed without their account silently breaking. | "Many users have problems signing up to a new account, they get stuck on phone verification: 'Unable to send verification code to this number.'" #34229 | 5.8%
Model drifts from instructions | Users want CLAUDE.md hard rules and persistent memory to actually constrain behavior. | "Since Opus 4.6, I've experienced a severe and consistent quality regression. Claude gets stuck in circular exploration patterns, reading files it already read." #28469 | 5.4%
Cross-surface command parity | Users want slash commands, scrollback, queue-instead-of-interrupt, and modern text-input to work the same in VS Code, web and mobile as they do in the CLI. | "The /btw command works in the terminal CLI but is not available in the VS Code extension. Would be great to have parity so it works the same way." #37323 | 2.7%
Bedrock beyond CLI | Users want Desktop, Cowork, remote-control and Auto-mode to ride their AWS Bedrock or Azure provider. | "Corporate network security tools block claude.ai while allowing AWS service endpoints. CLI works perfectly via Bedrock, but Desktop/Cowork has no equivalent." #32668 | 0.8%
Long-tail (no theme assignment) | Issues that didn't fit any of the 12 themes, kept in the count for completeness. | n/a | 5.0%
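
The shares in this list are simple arithmetic on the 1,000-issue sample, which makes them easy to cross-check. The sketch below uses the per-theme issue counts reported in §02; the variable names are illustrative, and treating the long tail as the remainder of the sample is an assumption.

```python
SAMPLE_SIZE = 1000

# Per-theme issue counts as reported in §02 of this report.
THEME_COUNTS = {
    "Small things grind": 150,
    "Quota observability gap": 133,
    "Sandbox + install fragility": 121,
    "Connectors lie about tools": 108,
    "Harness breaks the contract": 90,
    "Permissions don't match approval": 72,
    "Daily-driver parity": 65,
    "Delegation drops work": 64,
    "Auth front door broken": 58,
    "Model drifts from instructions": 54,
    "Cross-surface command parity": 27,
    "Bedrock beyond CLI": 8,
}

themed = sum(THEME_COUNTS.values())
long_tail = SAMPLE_SIZE - themed  # issues with no theme assignment

# Share of sample, as a percentage rounded to one decimal place.
shares = {
    name: round(100 * count / SAMPLE_SIZE, 1)
    for name, count in THEME_COUNTS.items()
}

print(themed, long_tail)                  # prints 950 50
print(shares["Quota observability gap"])  # prints 13.3
```

The 12 themes cover 950 issues, and the remaining 50 match the 5.0% long-tail row.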
02 · Themes

The findings.

12 unmet user needs drawn from the 1,000-issue sample, stack-ranked by priority, with reach shown as share of the sample. The top 8 are Critical or High priority on actionability and impact, and earn detailed cards below. Themes 9–12 are in the honorable-mentions table at the end of the section.

Each card carries the unmet need, the representative quote, reach (% of sample), accountable team, and actionability grade. The numbered ranking is the recommended order of attention; the priority tag is the work tier.

If you have ten minutes, read the top 3. If you have two, read the headlines.

1
Critical
Small things grind
The broken misc: fonts, IDE parity, scrollback, accessibility, the small things that grind
Users want the small things to work: font sizing in Code tab, scrollback in TUI, diff review in VS Code, Scala LSP, missing Create-PR button. None of them is dramatic. All of them shape daily use.
"In the Claude Code CLI (TUI mode), older messages become invisible and cannot be scrolled back to, even before context compression kicks in. The terminal's own scrollback buffer has no effect."#28077
Reach: 15.0% · Accountable: TUI/UX team (primary); Cowork + IDE-Integration (secondary) · Actionability: High
150 issues
2
Critical
Sandbox + install fragility
Cowork, sandbox and installation don't survive my environment
Users want the workspace to start, the install to complete, and the sandbox to respect their allowlist. Right now, a first-five-minutes failure blocks them before they can even file a clean bug.
"When opening Cowork in Claude Desktop, the workspace fails to start with: 'Failed to start Claude's workspace, VM service not running.' The error persists even after rebooting."#27801
Reach: 12.1% · Accountable: Cowork team · Actionability: High
121 issues
3
High
Quota observability gap
Quota dies fast on paid tiers, no way to tell what spent it
Users want to see what's burning their Max quota before it dies. Right now sessions end in 1.5 hours with no breakdown of what spent it.
"Since March 23, 2026, the 5-hour session window on Claude Max plan is being exhausted abnormally fast. With the exact same workload and prompts as previous days, the usage limit is now hit within 1-2 hours instead of the usual full 5-hour window."#38335
Reach: 13.3% · Accountable: Cost & Quota team · Actionability: Medium
133 issues
4
High
Connectors lie about tools
Connectors say connected but the tools aren't actually there
Users want the tools their connector says it provides to actually be usable. Right now GitHub, Drive, Gmail, Slack and MCP servers show green and expose nothing.
"When the GitHub Connector is enabled in Claude Desktop, Claude does not recognize or have access to any GitHub-related tools. Even after restarting the app, Claude responds as if no connector is connected at all."#32479
Reach: 10.8% · Accountable: Connectors-MCP team · Actionability: High
108 issues
5
High
Harness breaks the contract
The harness silently breaks the model's advertised contract
Users want the mode they selected to behave like the mode they selected. Right now plan mode edits files, tool-call limits regress mid-version, and subagents don't inherit the project context the parent already loaded.
"The per-turn tool call limit in Claude Desktop has been significantly reduced without announcement. Sessions that previously ran 60-80+ tool calls autonomously over 45+ minutes now hit 'Claude reached its tool-use limit for this turn' after approximately 20 tool calls."#33969
Reach: 9.0% · Accountable: Harness team · Actionability: High
90 issues
6
High
Permissions don't match approval
The permission system doesn't match what I actually approved
Users want their allow-rules and bypass settings to actually skip prompts. Right now compound bash, plan mode, and dangerously-skip-permissions all leak prompts they already approved.
"When Claude Code generates a compound bash command using && (e.g.: cd /some/path && git add file && git commit -m 'msg'), the permission approval prompt incorrectly identifies cd as the action requiring approval, showing a cd:* permission request."#28240
Reach: 7.2% · Accountable: Harness team · Actionability: High
72 issues
7
High
Daily-driver parity
I want one tool to be my daily driver, let me drive it like one
Users want their sessions, history, windows and settings to follow them across CLI, desktop and mobile. Right now they're three siloed apps that don't share state.
"Claude Code Desktop should support opening multiple windows within a single app instance. The only way to view two sessions side by side is to launch a second app instance via 'open -n -a Claude', which doubles memory usage."#30154
Reach: 6.5% · Accountable: Harness (primary); Cowork (secondary) · Actionability: High
65 issues
8
High
Delegation drops work
Async delegation surfaces drop work while I'm away, with no way to recover
Users want to delegate work and walk away, then pick it up on phone or desktop. Right now remote-control silently drops, sessions don't reconnect, and accounts don't switch.
"Remote Control silently drops the connection to the Claude iOS app during long sessions. Once dropped, the mobile client becomes completely unresponsive. The built-in reconnection doesn't work. The only fix is to physically walk to the terminal."#34255
Reach: 6.4% · Accountable: Async-Delegation team · Actionability: High
64 issues

Honorable mentions (themes 9–12)

Real signal, lower priority on actionability and impact than the top 8. Listed here in stack-rank order with the same metadata.

Rank | Priority | Theme | User need | Issues | Reach | Accountable
9 | Frontier-escalate | Model drifts from instructions | Users want CLAUDE.md hard rules and persistent memory to actually constrain behavior. | 54 | 5.4% | Model-Behavior (primary); Harness (secondary)
10 | Medium | Auth front door broken | Users want to sign in, upgrade, or stay subscribed without their account silently breaking. | 58 | 5.8% | Auth & Billing team
11 | Medium | Cross-surface command parity | Users want slash commands, scrollback, queue-instead-of-interrupt, and modern text-input to work the same in VS Code, web and mobile as they do in the CLI. | 27 | 2.7% | TUI/UX team
12 | Low | Bedrock beyond CLI | Users want Desktop, Cowork, remote-control and Auto-mode to ride their AWS Bedrock or Azure provider. | 8 | 0.8% | Auth & Billing team
03 · Tracker

The operating layer.

1,000 issues, scored on accountability, actionability, and impact, and sorted by recommended priority. Filter by priority, routing team, or theme. Issue numbers link to GitHub. The full tracker exports as JSON at the bottom of the grid, for downstream agents to consume.
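
For a downstream agent, consuming the export could look roughly like the sketch below. The export's real schema is not shown in this report, so the field names (`number`, `routing_team`, `priority`) and the rows themselves are placeholders, not real issues.

```python
import json

# Stand-in for the tracker's JSON export. Field names and rows are
# hypothetical placeholders; the real export may be shaped differently.
SAMPLE_EXPORT = """
[
  {"number": 1, "routing_team": "Harness", "priority": "High"},
  {"number": 2, "routing_team": "Cowork",  "priority": "Critical"},
  {"number": 3, "routing_team": "Harness", "priority": "Medium"},
  {"number": 4, "routing_team": "Harness", "priority": "Critical"},
  {"number": 5, "routing_team": "TUI/UX",  "priority": "Low"}
]
"""

# Tiers from the priority framework, most urgent first.
PRIORITY_ORDER = ["Critical", "High", "Medium", "Low", "Defer-low-actionability"]


def issues_for_team(rows, team, min_priority="High"):
    """Return a team's issues at or above min_priority, most urgent first."""
    cutoff = PRIORITY_ORDER.index(min_priority)
    kept = [
        row for row in rows
        if row["routing_team"] == team
        and PRIORITY_ORDER.index(row["priority"]) <= cutoff
    ]
    return sorted(kept, key=lambda row: PRIORITY_ORDER.index(row["priority"]))


rows = json.loads(SAMPLE_EXPORT)
mine = issues_for_team(rows, "Harness")
print([row["number"] for row in mine])  # prints [4, 1]
```

This mirrors the filter an engineering lead would apply in the grid: one routing team, Critical and High only, sorted by tier.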

04 · Comms

How we'd close the loop.

A responder agent built as a Claude skill, with a hard gate: no message sends without human review. Below: the operating principles, plus three example replies drafted against real GitHub issues.

The approach

A responder agent that handles each issue-disposition type from the tracker. Trusted Claude developers review and override the drafts; the agent improves on every override. Inspired by Warp.dev's community response agent (h/t Petra Donka, Head of DX).

V1 is built as a Claude skill, on three principles:

  1. Every reply is a signal-acquisition move. The reply asks for a missing diagnostic, a version number, a region; confirms the user was heard; or builds trust over time. If it doesn't do one of those, it's noise.
  2. Match response shape to issue shape. A bug with a clean repro and a regression bisection gets a status update. A feature request gets a thank-and-track. A frontier-trajectory complaint gets escalated, not closed. Five voice registers, each tied to a state transition.
  3. Protect the channel, not the throughput. Silent transitions, the invalid label closing a real bug, the long stall after a high-quality report: that's how channels die. Every state change fires an explicit comms event. Nothing sends without human approval.

Channels, cadence, tone, expectations

Channels
GitHub holds the durable feedback, so the responder agent ships into the issue thread directly. X is the amplifier: the @claudedev account points users back to GitHub when an issue is the right shape to file, keeping signal density on the durable channel.
Cadence
State-driven, not calendar-driven. Comms fire when an issue changes state, not on a weekly digest. Four trigger events:
  • Acknowledgment when a new issue lands in a recognized theme.
  • Status update on a state change (queued, escalated, deduplicated to a parent).
  • Action-taken when a fix ships, or when the team decides not to fix.
  • Already-shipped pre-check that points the user to the changelog when the requested feature already exists.
Tone
Warm about the engineering work users put in. Humble about what slipped through. Honest about what the team can and can't commit to. Empty thanks like "we really appreciate you being part of the Claude Code journey" are banned. Specific gratitude tied to what the user actually did is encouraged. No cheerleading. The discipline is user satisfaction, not Anthropic looking good.
Expectations
Every issue is read. Not every issue gets a reply, and not every issue gets actioned. The bar for an in-thread reply: a state transition (acknowledged, queued, escalated, shipped, declined), or membership in the top-10 leverage bucket. No timeline promises. State names like "in our up-next queue" or "behind the auth work currently shipping" instead of dates. When the team decides not to fix, the user hears that out loud, with the reasoning, and an invitation to keep filing.
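
The state-driven cadence and the approval gate can be sketched as a small event-to-draft mapping. This is an illustration of the rules described above, not the responder skill's actual implementation: the event names, the `Draft` type, and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# The four trigger events from the cadence, mapped to the reply type each
# one fires. The event keys are hypothetical names for illustration.
TRIGGERS = {
    "new_issue_in_theme": "Acknowledgment",
    "state_change": "Status update",
    "fix_shipped_or_declined": "Action taken",
    "feature_already_exists": "Already shipped",
}


@dataclass
class Draft:
    issue: int
    kind: str
    body: str
    approved: bool = False  # flipped only by a human reviewer


def draft_for(event: str, issue: int, body: str) -> Optional[Draft]:
    """Create a draft for a recognized trigger; other events get no reply."""
    kind = TRIGGERS.get(event)
    if kind is None:
        return None  # the issue is still read, but nothing is drafted
    return Draft(issue=issue, kind=kind, body=body)


def send(draft: Draft) -> str:
    """Post a draft, refusing anything a human has not approved."""
    if not draft.approved:
        raise PermissionError("draft needs human approval before posting")
    return f"posted {draft.kind} on #{draft.issue}"


draft = draft_for("state_change", 1, "Now in our up-next queue.")
draft.approved = True  # stands in for the human review step
print(send(draft))     # prints posted Status update on #1
```

Putting the gate inside `send` rather than in the caller means no code path can post a reply by skipping the review step.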

Three example comms, mapped to real issues

Drafted against real backlog issues, across different states and labels.

GITHUB · anthropics/claude-code · #34229 · Acknowledgment
close-the-loop · Responder agent · Draft · awaiting approval

Thanks for filing this, and for the reactions and comments here, which made it clear this isn't a one-off.

This sits in the auth-and-account cluster (we're tracking 58 open issues in this shape over the last 60 days). The team that owns it isn't the Claude Code repo maintainers, which is why the in-thread reply has been thin; we're routing this to the auth on-call and the support team, and removing the invalid label because that label was wrong here.

A real status update will land on this issue when the team has a concrete read. If you've hit this and have additional context (region, carrier, browser), a comment with that detail makes the routing tighter.

Drafted → Human approval → Posted
Trigger: Fires when a new issue lands in a recognized theme. Replaces the silent invalid-label pattern with a useful one-paragraph state read.

GITHUB · anthropics/claude-code · #38335 · Status update
close-the-loop · Responder agent · Draft · awaiting approval

Status update. This issue is one of 133 we're tracking under the quota-observability theme. It's now in our up-next queue with two adjacent issues: #46917 (cache_creation regression, with the byte-counted proxy evidence) and #41788 (Max 20 reset behavior changed since v2.1.89).

What's happening on our side: the harness team is auditing the token-accounting path end-to-end, and the product team is scoping a per-request usage breakdown in /usage so the meter is legible without an external proxy.

We're not promising a date because the audit is what determines the fix size. We'll comment again when the audit reports back. If you have additional bisection evidence (CLI versions, plan tier, time of day), please add it here. That's the input that shortens the audit.

Drafted → Human approval → Posted
Trigger: Fires when an issue transitions state (queued, declined, deduplicated, escalated). Uses state names, not dates.

GITHUB · anthropics/claude-code · #46917 · Action taken
close-the-loop · Responder agent · Draft · awaiting approval

Update. The cache_creation regression you bisected to v2.1.100 is fixed in v2.1.118, released today. The root cause was a prompt-builder change that re-serialized a tool definition once per turn instead of once per session; the byte delta you measured (978 fewer bytes, 20,196 more tokens) was exactly the cache-miss signature.

Your proxy-instrumented evidence is what made this fast. The team had a vague "cost feels off" signal from internal telemetry but couldn't localize it; your bisection narrowed the search to a single deploy window.

Two things changed downstream of this issue: (1) we added per-tool cache-creation accounting to the harness debug log so the next regression of this shape is detectable without a proxy, and (2) we're folding cache_creation deltas into the pre-release regression test suite. Thanks for the engineering work.

Drafted → Human approval → Posted
Trigger: Fires when a fix ships or when the team decides not to fix and wants to say so out loud. Closes the loop, credits the evidence that made it possible.

05 · Validation

How we'd pressure-test this.

Before this analysis goes wider, four internal cohorts pressure-test the three things the framework rests on: accountability, actionability, and impact. The dataset is public GitHub, so the validation stays inside the building.

Team | What I'd want their input on | What they'd see that I'd miss
Embedded Product Ops | The accountability map: which teams own which surfaces, where the seams are. How routing happens today, what lands where, what dies in transit. Which teams to pressure-test the analysis with first. | The politics, the inputs already in flight, and prior attempts at programs like this.
Claude Code product-decision pair (PM + EM) | Signal shape: theme memo, evidence cards, dashboard? The impact rubric they actually use when deciding what to fix. Whether the routing calls match how they own the surface. | Roadmap inertia, fix paths that don't appear in issue text, and the gap between what a public issue says and what the team knows is actually broken.
Harness engineering lead | Evidence density and altitude (issue, theme, or surface?). How many cited verbatims earn trust without becoming noise. Whether the autonomous-fixability grades reflect what's actually buildable. | Engineering trade-offs that re-route work between teams, and harness-side fix paths that look like one cluster from outside but split into three from inside.
Model researcher | Whether the frontier-trajectory cuts hold up against current eval pipelines. Which themes are eval-candidate-shaped vs. constitution-shaped. | Eval-pipeline shape, model-trajectory expectations, and which behavioral patterns are training-loop targets vs. capability gaps.

Sequence top to bottom across weeks 1–3, starting with Product Ops. They have the cross-surface visibility to name who else should be in the room, so every conversation after theirs runs tighter.

06 · Evergreen

How this becomes evergreen.

This submission was built by a small Claude agent team. The team is also the answer: it's the v0 of the program, addressable on demand, and the Antenna page below is what it looks like as an internal product.

The work product is the proof of concept. I built a small multi-agent customer intelligence team to complete this assignment, and that team is the lightweight process. The submission you're reading is what one on-demand run produces. Run it on a planning cadence, layer continuous ingestion behind it, and the same machinery becomes the evergreen program.

Cadence, ownership, tooling, integration

Cadence

Continuous, with on-demand synthesis on top. Continuous: every new GitHub issue, plus signal from Reddit, X, and the App Store, gets ingested, classified, and routed in near-real-time. On-demand: synthesis runs (themes, prioritization, comms drafts) trigger on a planning cycle or a PM/EM ask. The pilot you're reading is one such on-demand run.

Ownership

Built and maintained by Product Operations. Open-sourced internally to all Anthropic builders to contribute, extend, or fork for their own surface. PMOM is the steward; the agents are shared infrastructure that any team can pull from instead of standing up its own.

Tooling

Built on the Anthropic stack: Claude Managed Agents for orchestration, Plugins for cross-surface distribution, and Skills for composable methodology. Enabled in Claude.ai, Cowork, and Claude Code, so any PM, EM, or researcher can invoke the right agent from whatever surface they already live in.

Integration with planning

Outputs land where decisions are already made: a weekly themes brief in the PM/EM operating cadence, prioritization sliders inside the planning doc, draft comms in the close-the-loop queue. The agents don't ask the team to adopt new rituals; they slot into the ones that already exist.

What it looks like as an internal product

The Antenna landing page below is the program's front door, the surface where any Anthropic PM, EM, or researcher pulls a synthesis on demand. The agents behind it are the same ones that produced this report; the difference is that here they're addressable as a product, not a one-off engagement.

Antenna

Find the signal. Build the right thing next.

Meet the team