Unconditional agent-output artifact download causes ENOENT noise on pre-agent failures #19474

@samuelkahessay

What happens

The safe-outputs job and conclusion job unconditionally add agent-output artifact download steps via buildAgentOutputDownloadSteps(). When the agent step fails before producing output (e.g., network failure, sandbox crash, permission error), the artifact doesn't exist. The download step is marked continue-on-error: true, so the job continues — but GH_AW_AGENT_OUTPUT is set to a path that doesn't exist. Downstream scripts that read that path emit ENOENT errors in the logs.

The failure issue title is always generic ([aw] <workflow> failed) with no pre-agent diagnostic context, making it hard to distinguish "agent never started" from "agent started and failed."

One improvement landed recently: commit 31dc15f added inference-access-specific context in handle_agent_failure.cjs:461-468. But the broader pre-agent failure case (sandbox crash, network timeout, MCP server failure, etc.) still produces generic errors.

What should happen

  1. The download step's continue-on-error: true should be paired with a conditional check — downstream steps should skip when the artifact wasn't downloaded successfully
  2. GH_AW_AGENT_OUTPUT should only be set when the artifact actually exists
  3. Pre-agent failures should include the failure stage in the issue title (e.g., [aw] <workflow> failed (pre-agent) or [aw] <workflow> failed (sandbox setup))

Where in the code

All references are to main at 99b2107.

Unconditional download steps:

  • compiler_safe_outputs_job.go:53: steps = append(steps, buildAgentOutputDownloadSteps()...) (safe-outputs job)
  • notify_comment.go:57: steps = append(steps, buildAgentOutputDownloadSteps()...) (conclusion job)

Download step with silent failure:

  • artifacts.go:44: continue-on-error: true on the download step
  • artifacts.go:37-62: buildAgentOutputDownloadSteps() sets the env var regardless of download outcome

Downstream ENOENT paths:

  • load_agent_output.cjs:55-63 — reads process.env.GH_AW_AGENT_OUTPUT, attempts JSON.parse(fs.readFileSync(...)), catches ENOENT
  • noop.cjs:15-17 — calls loadAgentOutput(), returns silently on failure (no diagnostic)
  • notify_comment_error.cjs:79-88 — calls loadAgentOutput() in the error notification path
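
For illustration, a loader along these lines could report why the output is unavailable instead of logging a bare ENOENT. This is a minimal sketch, not the code in load_agent_output.cjs: the function name loadAgentOutputWithReason and the reason strings are hypothetical, and only the GH_AW_AGENT_OUTPUT variable comes from the code referenced above.

```javascript
const fs = require("fs");

// Hypothetical helper (not the repository's loadAgentOutput()):
// returns a reason code so callers can tell "artifact missing" apart
// from "artifact present but unreadable".
function loadAgentOutputWithReason(env = process.env) {
  const outputPath = env.GH_AW_AGENT_OUTPUT;
  if (!outputPath) {
    return { ok: false, reason: "env-unset" };
  }

  let raw;
  try {
    raw = fs.readFileSync(outputPath, "utf8");
  } catch (err) {
    if (err.code === "ENOENT") {
      // The file was never downloaded, most likely because the agent
      // failed before producing any output.
      return { ok: false, reason: "artifact-missing", path: outputPath };
    }
    throw err; // unexpected I/O error: surface it unchanged
  }

  try {
    return { ok: true, data: JSON.parse(raw) };
  } catch (err) {
    return { ok: false, reason: "invalid-json", path: outputPath };
  }
}
```

A caller such as noop.cjs could then log the reason once and return, rather than emitting an ENOENT stack and exiting silently.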

Generic failure title:

  • handle_agent_failure.cjs:593-596 — always uses [aw] <workflow> failed regardless of failure stage

Evidence

Source-level verification (2026-03-03):

  • Confirmed buildAgentOutputDownloadSteps() is called unconditionally at both call sites
  • Confirmed continue-on-error: true at artifacts.go:44
  • Confirmed no conditional guard on downstream env var usage

Local reproduction:

  • Ran noop.cjs with GH_AW_AGENT_OUTPUT pointing at a nonexistent file
  • Output: ENOENT error logged, then silent return — no indication of why the file is missing
  • Same behavior from notify_comment_error.cjs

Proposed fix

  1. In artifacts.go, add an id: to the download step and use a step outcome check (if: steps.<id>.outcome == 'success') on the env-setting step, so GH_AW_AGENT_OUTPUT is only set when the artifact was actually downloaded
  2. In handle_agent_failure.cjs, detect the failure stage (pre-agent vs. during-agent) by checking whether the agent-output artifact exists, and include this context in the failure issue title
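
Item 2 could look roughly like the sketch below. It assumes that a missing agent-output file is a good enough signal for "agent never started". The function names detectFailureStage and buildFailureTitle and the stage labels are hypothetical and do not exist in handle_agent_failure.cjs today.

```javascript
const fs = require("fs");

// Hypothetical helper: classify the failure by whether the
// agent-output file exists on disk.
function detectFailureStage(agentOutputPath) {
  // No path, or no file at that path: the agent produced no output.
  if (!agentOutputPath || !fs.existsSync(agentOutputPath)) {
    return "pre-agent";
  }
  return "agent";
}

// Hypothetical helper: append the stage to the existing generic title
// only when the failure happened before the agent produced output.
function buildFailureTitle(workflowName, stage) {
  const base = `[aw] ${workflowName} failed`;
  return stage === "pre-agent" ? `${base} (pre-agent)` : base;
}
```

Keeping the unchanged title for the during-agent case means existing issue searches and deduplication on "[aw] <workflow> failed" would keep working for that case, while pre-agent failures get a distinct title.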

Impact

Frequency: Every pre-agent failure. In our pipeline this occurs ~2-3 times per run batch when there are infrastructure issues (network, permissions, sandbox initialization).
Cost: Moderate — the ENOENT errors are noise in already-failing runs, but they obscure the real failure cause. The generic issue title means operators must dig into the full run log to determine whether the agent even started. Fixing this would significantly reduce triage time for pre-agent failures.
