feature: Errors page by matt-aitken · Pull Request #3172 · triggerdotdev/trigger.dev

matt-aitken · 2026-03-04T15:20:46Z

A top-level Errors page that aggregates errors from failed runs with occurrences metrics.

feature-errors-mvp-compressed-louder.mp4

Errors are “fingerprinted” so similar errors are grouped together (e.g. has an ID in the error message).

You can view an individual error to view a timeline of when it fired, the runs, and bulk replay them.

changeset-bot · 2026-03-04T15:21:01Z

⚠️ No Changeset found

Latest commit: 1b339ed

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-03-04T15:21:13Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds end-to-end error analytics and optional error fingerprinting: a fingerprinting utility and normalization; optional fingerprint computation in the runs replication service; schema migrations and new ClickHouse tables/materialized views (error aggregations and occurrences) plus query/utility APIs; server presenters for errors list and error-group detail; new Remix routes and UI for Errors index and detail pages; RunFilters, TaskRunsTable, and search input updates to support errorId filtering; path helpers, semver and time-granularity utilities; ClickHouse query builder HAVING support; tests, benchmarks, and related type/interface updates.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description explains what was built (a top-level Errors page with fingerprinting and individual error views) but does not follow the required template structure (missing checklist, Testing, and Changelog sections).	Complete the PR description using the repository template: add the checklist, Testing section describing how the feature was tested, and a Changelog section summarizing the changes.
Docstring Coverage	⚠️ Warning	Docstring coverage is 29.23% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feature: Errors page' clearly describes the main addition to the codebase—a new Errors page feature for aggregating and managing errors from failed runs.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature-errors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

apps/webapp/app/components/primitives/DateTime.tsx (1)

377-390: Avoid initial blank text and duplicate update path.

At Line 377 the state starts as "", then Line 388-390 immediately sets it. Consider initializing state from getRelativeText(realDate) and collapsing to a single effect to remove the empty first paint and extra update cycle.

♻️ Proposed simplification

-  const [relativeText, setRelativeText] = useState("");
+  const [relativeText, setRelativeText] = useState(() => getRelativeText(realDate));

-  // Every 60s refresh
-  useEffect(() => {
-    const interval = setInterval(() => {
-      setRelativeText(getRelativeText(realDate));
-    }, 60_000);
-    return () => clearInterval(interval);
-  }, [realDate]);
-
-  // On first render
-  useEffect(() => {
-    setRelativeText(getRelativeText(realDate));
-  }, [realDate]);
+  useEffect(() => {
+    setRelativeText(getRelativeText(realDate));
+    const interval = setInterval(() => {
+      setRelativeText(getRelativeText(realDate));
+    }, 60_000);
+    return () => clearInterval(interval);
+  }, [realDate]);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/components/primitives/DateTime.tsx` around lines 377 - 390,
Initialize the relativeText state using getRelativeText(realDate) instead of an
empty string and replace the two effects with a single useEffect that (1) sets
setRelativeText(getRelativeText(realDate)) immediately when realDate changes and
(2) starts the 60s interval that updates
setRelativeText(getRelativeText(realDate)) and is cleaned up on unmount; update
references to relativeText, setRelativeText, getRelativeText and the interval
logic accordingly so there is no initial blank render or duplicated update path.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@apps/webapp/app/components/primitives/DateTime.tsx`:
- Around line 377-390: Initialize the relativeText state using
getRelativeText(realDate) instead of an empty string and replace the two effects
with a single useEffect that (1) sets setRelativeText(getRelativeText(realDate))
immediately when realDate changes and (2) starts the 60s interval that updates
setRelativeText(getRelativeText(realDate)) and is cleaned up on unmount; update
references to relativeText, setRelativeText, getRelativeText and the interval
logic accordingly so there is no initial blank render or duplicated update path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ee2d7330-486f-49ed-96f8-1dcc497045c6

📥 Commits

Reviewing files that changed from the base of the PR and between 0818772 and 6a92df3.

📒 Files selected for processing (1)

apps/webapp/app/components/primitives/DateTime.tsx

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)

GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
GitHub Check: sdk-compat / Cloudflare Workers
GitHub Check: sdk-compat / Bun Runtime
GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
GitHub Check: sdk-compat / Deno Runtime
GitHub Check: typecheck / typecheck
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
GitHub Check: Analyze (javascript-typescript)

🧰 Additional context used

📓 Path-based instructions (7)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

**/*.{ts,tsx}: In TypeScript SDK usage, always import from @trigger.dev/sdk, never from @trigger.dev/sdk/v3 or use deprecated client.defineJob
Import from @trigger.dev/core subpaths only, never from the root
Use the Run Engine 2.0 (@internal/run-engine) and redis-worker for all new work, not legacy V1 MarQS queue or deprecated V1 functions

Files:

apps/webapp/app/components/primitives/DateTime.tsx

{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

apps/webapp/app/components/primitives/DateTime.tsx

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

apps/webapp/app/components/primitives/DateTime.tsx

apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

apps/webapp/app/components/primitives/DateTime.tsx

apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Files:

apps/webapp/app/components/primitives/DateTime.tsx

**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

apps/webapp/app/components/primitives/DateTime.tsx

apps/**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (CLAUDE.md)

When modifying only server components (apps/webapp/, apps/supervisor/, etc.) with no package changes, add a .server-changes/ file instead of a changeset

Files:

apps/webapp/app/components/primitives/DateTime.tsx

🧠 Learnings (2)

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Use `useRun`, `useRealtimeRun` and other SWR/realtime hooks from `trigger.dev/react-hooks` for data fetching

Applied to files:

apps/webapp/app/components/primitives/DateTime.tsx

📚 Learning: 2026-02-11T16:37:32.429Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3019
File: apps/webapp/app/components/primitives/charts/Card.tsx:26-30
Timestamp: 2026-02-11T16:37:32.429Z
Learning: In projects using react-grid-layout, avoid relying on drag-handle class to imply draggability. Ensure drag-handle elements only affect dragging when the parent grid item is configured draggable in the layout; conditionally apply cursor styles based on the draggable prop. This improves correctness and accessibility.

Applied to files:

apps/webapp/app/components/primitives/DateTime.tsx

🔇 Additional comments (1)

apps/webapp/app/components/primitives/DateTime.tsx (1)

379-385: Nice fix for live relative-time updates.

The interval refresh with proper cleanup keeps RelativeDateTime current and avoids the previously frozen output issue.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

apps/webapp/test/runsReplicationBenchmark.README.md (2)

126-201: ⚠️ Potential issue | 🟡 Minor

Fenced code block missing language identifier (duplicate)

The output example code block lacks a language identifier, triggering MD040. Adding text will satisfy markdownlint and improve rendering.

📝 Proposed fix

-```
+```text
 ================================================================================
 BENCHMARK: baseline-no-fingerprinting

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/test/runsReplicationBenchmark.README.md` around lines 126 - 201,
The fenced code block in the README sample lacks a language identifier, causing
MD040; update the opening fence for the large benchmark example (the block that
begins with ``` and the line
"================================================================================"
/ "BENCHMARK: baseline-no-fingerprinting") to include a language token such as
text (e.g., replace ``` with ```text) so markdownlint passes and rendering is
explicit.

15-31: ⚠️ Potential issue | 🟡 Minor

Fenced code block missing language identifier (duplicate)

The ASCII diagram code block lacks a language identifier, triggering MD040. Adding text will satisfy markdownlint and improve rendering.

📝 Proposed fix

-```
+```text
 ┌─────────────────┐         ┌──────────────────────┐
 │  Producer       │         │  Benchmark Test      │

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/test/runsReplicationBenchmark.README.md` around lines 15 - 31,
The fenced ASCII diagram block in
apps/webapp/test/runsReplicationBenchmark.README.md is missing a language
identifier which triggers MD040; update the opening fence(s) for the diagram
(the triple-backtick block containing the ASCII art) to include a language
identifier such as "text" so the block becomes ```text (and do this for any
duplicate diagram blocks), leaving the diagram content unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal-packages/clickhouse/schema/022_create_errors_v1_table.sql`:
- Around line 65-68: The WHERE clause that builds the errors aggregation
currently filters status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED',
'COMPLETED_WITH_ERRORS') and omits two error-related final statuses; update that
filter to also include 'EXPIRED' and 'TIMED_OUT' so all failed TaskRunStatus
values are captured (locate the WHERE block referencing error_fingerprint and
status in the errors table creation SQL and add 'EXPIRED' and 'TIMED_OUT' to the
IN list).

---

Duplicate comments:
In `@apps/webapp/test/runsReplicationBenchmark.README.md`:
- Around line 126-201: The fenced code block in the README sample lacks a
language identifier, causing MD040; update the opening fence for the large
benchmark example (the block that begins with ``` and the line
"================================================================================"
/ "BENCHMARK: baseline-no-fingerprinting") to include a language token such as
text (e.g., replace ``` with ```text) so markdownlint passes and rendering is
explicit.
- Around line 15-31: The fenced ASCII diagram block in
apps/webapp/test/runsReplicationBenchmark.README.md is missing a language
identifier which triggers MD040; update the opening fence(s) for the diagram
(the triple-backtick block containing the ASCII art) to include a language
identifier such as "text" so the block becomes ```text (and do this for any
duplicate diagram blocks), leaving the diagram content unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 910e5089-06e8-4f64-b451-a6ae55d622d3

📥 Commits

Reviewing files that changed from the base of the PR and between 8023370 and 50052a9.

📒 Files selected for processing (4)

apps/webapp/test/runsReplicationBenchmark.README.md
apps/webapp/test/runsReplicationBenchmark.test.ts
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql
internal-packages/clickhouse/schema/023_create_error_occurrences_v1.sql

🚧 Files skipped from review as they are similar to previous changes (2)

apps/webapp/test/runsReplicationBenchmark.test.ts
internal-packages/clickhouse/schema/023_create_error_occurrences_v1.sql

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)

GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
GitHub Check: sdk-compat / Cloudflare Workers
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
GitHub Check: sdk-compat / Bun Runtime
GitHub Check: typecheck / typecheck
GitHub Check: sdk-compat / Deno Runtime
GitHub Check: Analyze (javascript-typescript)

🧰 Additional context used

📓 Path-based instructions (2)

internal-packages/clickhouse/schema/*.sql

📄 CodeRabbit inference engine (internal-packages/clickhouse/CLAUDE.md)

ClickHouse SQL migrations should follow Goose format with Up and Down sections, stored as numbered files in the schema/ directory

Files:

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

apps/webapp/test/runsReplicationBenchmark.README.md

🧠 Learnings (9)

📚 Learning: 2026-03-02T12:43:08.166Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Applies to internal-packages/clickhouse/schema/*.sql : ClickHouse SQL migrations should follow Goose format with Up and Down sections, stored as numbered files in the `schema/` directory

Applied to files:

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

📚 Learning: 2026-03-02T12:43:08.166Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Use `_mv_v1` suffix for ClickHouse materialized views

Applied to files:

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

📚 Learning: 2026-03-02T12:43:08.166Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Use `_v1`, `_v2` suffixes for ClickHouse table versioning

Applied to files:

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

📚 Learning: 2026-03-02T12:43:02.539Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: docs/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:02.539Z
Learning: Applies to docs/**/*.mdx : Use language tags in code fences: `typescript`, `bash`, `json`

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

📚 Learning: 2026-03-02T12:43:25.254Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Applies to internal-packages/run-engine/src/engine/tests/**/*.test.ts : Implement tests for RunEngine in `src/engine/tests/` using testcontainers for Redis and PostgreSQL containerization

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

📚 Learning: 2026-03-03T13:07:33.177Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:33.177Z
Learning: In `internal-packages/run-engine/src/batch-queue/tests/index.test.ts`, test assertions for rate limiter stubs can use `toBeGreaterThanOrEqual` rather than exact equality (`toBe`) because the consumer loop may call the rate limiter during empty pops in addition to actual item processing, and this over-calling is acceptable in integration tests.

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

📚 Learning: 2026-03-02T12:42:41.110Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:41.110Z
Learning: Applies to **/*.test.{ts,tsx,js} : Test files using Redis or PostgreSQL should use testcontainers helpers (redisTest, postgresTest, containerTest) instead of mocks

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

📚 Learning: 2026-03-02T12:42:56.114Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/app/v3/services/**/*.server.ts : When editing services that branch on `RunEngineVersion` to support both V1 and V2 (e.g., `cancelTaskRun.server.ts`, `batchTriggerV3.server.ts`), only modify V2 code paths

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

📚 Learning: 2026-01-15T10:48:02.687Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Tests should avoid mocks or stubs and use the helpers from `internal/testcontainers` when Redis or Postgres are needed

Applied to files:

apps/webapp/test/runsReplicationBenchmark.README.md

🪛 LanguageTool

apps/webapp/test/runsReplicationBenchmark.README.md

[uncategorized] ~80-~80: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...icationBenchmark.test.ts --run ### High Volume Benchmark bash cd apps/webapp BENCH...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🪛 markdownlint-cli2 (0.21.0)

apps/webapp/test/runsReplicationBenchmark.README.md

[warning] 15-15: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

[warning] 126-126: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql (2)

1-36: LGTM! Table structure follows conventions and is well-designed.

The table uses appropriate patterns:

AggregatingMergeTree with proper ORDER BY matching the GROUP BY key

SimpleAggregateFunction for timestamp bounds (avoids binary state encoding)

AggregateFunction with proper state functions for counts and samples

90-day TTL on last_seen_date for data lifecycle management

The String columns (error_type, error_message, sample_stack_trace) will have non-deterministic values after part merges, but this is acceptable since they represent sample/representative data from the error group.

76-78: LGTM! Down migration is correct.

The migration properly drops the materialized view before the table (correct dependency order) and uses IF EXISTS for idempotency.

coderabbitai · 2026-03-04T17:38:50Z

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

+WHERE
+  error_fingerprint != ''
+  AND status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS')
+  AND _is_deleted = 0


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find status enum/type definitions echo "=== Searching for status type definitions ===" rg -n 'enum.*Status|type.*Status|const.*Status' --type=ts --type=js -g '!*test*' -g '!*spec*' | head -100 echo "" echo "=== Searching for status string literals in task context ===" rg -n "(SYSTEM_FAILURE|CRASHED|INTERRUPTED|COMPLETED_WITH_ERRORS|FAILURE|FAILED|TIMED_OUT)" --type=ts --type=js | grep -i 'status\|enum' | head -50 echo "" echo "=== Check the SQL file structure and format ===" cat -n internal-packages/clickhouse/schema/022_create_errors_v1_table.sql | head -30 echo "" echo "=== Check for Up/Down sections (Goose format) ===" grep -n "-- +goose Up\|-- +goose Down" internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

Repository: triggerdotdev/trigger.dev

Length of output: 19301

🏁 Script executed:

#!/bin/bash # Find and display the TaskRunStatus enum definition echo "=== TaskRunStatus enum definition ===" sed -n '22,42p' packages/core/src/v3/schemas/runEngine.ts echo "" echo "=== finalStatuses definition ===" sed -n '44,52p' internal-packages/run-engine/src/engine/statuses.ts

Repository: triggerdotdev/trigger.dev

Length of output: 1034

Add EXPIRED and TIMED_OUT to the status filter in the errors table.

The current filter includes SYSTEM_FAILURE, CRASHED, INTERRUPTED, and COMPLETED_WITH_ERRORS, but it's missing EXPIRED and TIMED_OUT, which are defined as error-related final statuses in the TaskRunStatus enum. These should be included to ensure all failed runs are aggregated into the errors table.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal-packages/clickhouse/schema/022_create_errors_v1_table.sql` around lines 65 - 68, The WHERE clause that builds the errors aggregation currently filters status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS') and omits two error-related final statuses; update that filter to also include 'EXPIRED' and 'TIMED_OUT' so all failed TaskRunStatus values are captured (locate the WHERE block referencing error_fingerprint and status in the errors table creation SQL and add 'EXPIRED' and 'TIMED_OUT' to the IN list).

devin-ai-integration

Devin Review found 4 potential issues.

View 5 additional findings in Devin Review.

devin-ai-integration · 2026-03-04T23:05:17Z

apps/webapp/app/services/runsReplicationService.server.ts

🚩 Missing .server-changes file for server-only changes

Per CONTRIBUTING.md and CLAUDE.md, PRs that change server components (apps/webapp/) without package changes should include a .server-changes/ file for release notes. This PR adds significant new server functionality (error fingerprinting, new routes, ClickHouse schema) but I don't see a corresponding .server-changes/ markdown file. There are also published package changes (new ErrorId in packages/core), so a changeset may be more appropriate — but neither appears to be present.

Was this helpful? React with 👍 or 👎 to provide feedback.

...pp.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx

apps/webapp/app/components/primitives/DateTime.tsx

devin-ai-integration · 2026-03-04T23:05:25Z

apps/webapp/app/services/runsReplicationService.server.ts

+    const errorFingerprint = (
+      !this._disableErrorFingerprinting &&
+      ['SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS'].includes(run.status)
+    )
+      ? calculateErrorFingerprint(run.error)
+      : '';


🚩 Error fingerprinting only covers 4 of the failure statuses

The fingerprinting logic at apps/webapp/app/services/runsReplicationService.server.ts:864 and the ClickHouse materialized views (022_create_errors_v1_table.sql:67, 023_create_error_occurrences_v1.sql:69-73) both use the same 4 statuses: SYSTEM_FAILURE, CRASHED, INTERRUPTED, COMPLETED_WITH_ERRORS. Other potentially error-carrying statuses like TIMED_OUT or EXPIRED are excluded. This appears intentional but worth confirming — if those statuses can also carry error data, they'd silently produce no fingerprint and be invisible in the errors page.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration

Devin Review found 2 new potential issues.

View 8 additional findings in Devin Review.

devin-ai-integration · 2026-03-05T00:01:33Z

apps/webapp/app/presenters/v3/NextRunListPresenter.server.ts

🟡 Missing errorId in hasFilters computation in NextRunListPresenter

When errorId is provided as a filter, the hasFilters flag in NextRunListPresenter.server.ts is not set to true because errorId is not included in the condition. Every other filter parameter (tasks, versions, statuses, tags, bulkId, scheduleId, batchId, runId, queues, machines) is checked, but errorId is omitted. This means when navigating to the runs page with an errorId filter (e.g., via the "View all runs" link from the error detail page at apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx:346), the server reports hasFilters: false. This causes the empty-state UI to incorrectly show "No runs yet" instead of "No runs matching filters" if the query returns zero results.

(Refers to lines 89-103)

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-03-05T00:01:35Z

internal-packages/clickhouse/schema/022_create_errors_v1_table.sql

+CREATE MATERIALIZED VIEW trigger_dev.errors_mv_v1
+TO trigger_dev.errors_v1
+AS
+SELECT
+  organization_id,
+  project_id,
+  environment_id,
+  task_identifier,
+  error_fingerprint,
+
+  any(coalesce(nullIf(toString(error.data.type), ''), nullIf(toString(error.data.name), ''), 'Error')) as error_type,
+  any(coalesce(nullIf(substring(toString(error.data.message), 1, 500), ''), 'Unknown error')) as error_message,
+  any(coalesce(substring(toString(error.data.stack), 1, 2000), '')) as sample_stack_trace,
+
+  toDateTime(max(created_at)) as last_seen_date,
+
+  min(created_at) as first_seen,
+  max(created_at) as last_seen,
+  sumState(toUInt64(1)) as occurrence_count,
+  uniqState(task_version) as affected_task_versions,
+
+  anyState(run_id) as sample_run_id,
+  anyState(friendly_id) as sample_friendly_id,
+
+  sumMapState([status], [toUInt64(1)]) as status_distribution
+FROM trigger_dev.task_runs_v2
+WHERE
+  error_fingerprint != ''
+  AND status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS')
+  AND _is_deleted = 0
+GROUP BY
+  organization_id,
+  project_id,
+  environment_id,
+  task_identifier,
+  error_fingerprint;


🚩 Materialized views use INSERT-time filtering which won't backfill historical data

The materialized views errors_mv_v1 (schema/022_create_errors_v1_table.sql:39) and error_occurrences_mv_v1 (schema/023_create_error_occurrences_v1.sql:33) only trigger on new inserts to task_runs_v2. Existing runs that were inserted before the migration will not have error_fingerprint populated (it defaults to ''), and the MV WHERE clause error_fingerprint != '' ensures they're excluded. This means the errors page will only show errors from runs replicated after the migration. This is likely intentional for a new feature, but worth noting there's no backfill mechanism.

Was this helpful? React with 👍 or 👎 to provide feedback.

matt-aitken added 16 commits March 1, 2026 20:19

First commit for Errors feature

6bfb56b

Improved layout

ab6731c

Fingerprint benchmarks

4e40640

Errors are now by task

6268d51

Not being used

abfceaa

Added chart to the error page

c5b966d

Improved error page layout

cae22ee

Set the default time period to 1d

f8c82fa

Removed unused resource route

7c390a8

Switch to using error_occurrences_v1 so we can do proper time filtering

14e42de

Better time bucketing and nicer presenter logic

9b463db

Time filtering on the error detail page

c97fe88

Errors hidden in the sidebar unless you're an admin

a6f1b8d

Style improvements and run pagination

cad27fb

Runs list filtering by error id

1f58e42

Link to bulk replay. Better styling

0818772

This comment was marked as resolved.

Sign in to view

Refresh the RelativeDateTime each min

6a92df3

coderabbitai bot reviewed Mar 4, 2026

View reviewed changes

matt-aitken added 7 commits March 4, 2026 16:42

Fingerprint ordering fix

25605f1

Delete unused resource route

afda194

Pagination fix

004e703

Made the search params code more robust

550154a

Prevent type collision

29981cd

Time granularity made more robust

2174d5b

Pagination implemeneted

8023370

This comment was marked as resolved.

Sign in to view

Replication benchmark cleanup and readme

67d9057

Fixed materialize view names to match convention

50052a9

coderabbitai bot reviewed Mar 4, 2026

View reviewed changes

matt-aitken added 3 commits March 4, 2026 18:01

Fix for zero durations

d95f65e

Typecheck fix for the benchmark script

90d585c

Fix for taskRuns.test typecheck

c6436e6

devin-ai-integration bot reviewed Mar 4, 2026

View reviewed changes

matt-aitken added 2 commits March 4, 2026 23:21

Removed duplicate PageContainer

8623382

RelativeDateTime first render

1b339ed

devin-ai-integration bot reviewed Mar 5, 2026

View reviewed changes

Uh oh!

Conversation

matt-aitken commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

This comment was marked as resolved.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

devin-ai-integration bot Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

matt-aitken commented Mar 4, 2026 •

edited

Loading

changeset-bot bot commented Mar 4, 2026 •

edited

Loading

coderabbitai bot commented Mar 4, 2026 •

edited

Loading