Conversation
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds end-to-end error analytics and optional error fingerprinting: a fingerprinting utility and normalization; optional fingerprint computation in the runs replication service; schema migrations and new ClickHouse tables/materialized views (error aggregations and occurrences) plus query/utility APIs; server presenters for errors list and error-group detail; new Remix routes and UI for Errors index and detail pages; RunFilters, TaskRunsTable, and search input updates to support errorId filtering; path helpers, semver and time-granularity utilities; ClickHouse query builder HAVING support; tests, benchmarks, and related type/interface updates. Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/webapp/app/components/primitives/DateTime.tsx (1)
377-390: Avoid initial blank text and duplicate update path.At Line 377 the state starts as
"", then Line 388-390 immediately sets it. Consider initializing state fromgetRelativeText(realDate)and collapsing to a single effect to remove the empty first paint and extra update cycle.♻️ Proposed simplification
- const [relativeText, setRelativeText] = useState(""); + const [relativeText, setRelativeText] = useState(() => getRelativeText(realDate)); - // Every 60s refresh - useEffect(() => { - const interval = setInterval(() => { - setRelativeText(getRelativeText(realDate)); - }, 60_000); - return () => clearInterval(interval); - }, [realDate]); - - // On first render - useEffect(() => { - setRelativeText(getRelativeText(realDate)); - }, [realDate]); + useEffect(() => { + setRelativeText(getRelativeText(realDate)); + const interval = setInterval(() => { + setRelativeText(getRelativeText(realDate)); + }, 60_000); + return () => clearInterval(interval); + }, [realDate]);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/components/primitives/DateTime.tsx` around lines 377 - 390, Initialize the relativeText state using getRelativeText(realDate) instead of an empty string and replace the two effects with a single useEffect that (1) sets setRelativeText(getRelativeText(realDate)) immediately when realDate changes and (2) starts the 60s interval that updates setRelativeText(getRelativeText(realDate)) and is cleaned up on unmount; update references to relativeText, setRelativeText, getRelativeText and the interval logic accordingly so there is no initial blank render or duplicated update path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/webapp/app/components/primitives/DateTime.tsx`:
- Around line 377-390: Initialize the relativeText state using
getRelativeText(realDate) instead of an empty string and replace the two effects
with a single useEffect that (1) sets setRelativeText(getRelativeText(realDate))
immediately when realDate changes and (2) starts the 60s interval that updates
setRelativeText(getRelativeText(realDate)) and is cleaned up on unmount; update
references to relativeText, setRelativeText, getRelativeText and the interval
logic accordingly so there is no initial blank render or duplicated update path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ee2d7330-486f-49ed-96f8-1dcc497045c6
📒 Files selected for processing (1)
apps/webapp/app/components/primitives/DateTime.tsx
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
**/*.{ts,tsx}: In TypeScript SDK usage, always import from@trigger.dev/sdk, never from@trigger.dev/sdk/v3or use deprecated client.defineJob
Import from@trigger.dev/coresubpaths only, never from the root
Use the Run Engine 2.0 (@internal/run-engine) and redis-worker for all new work, not legacy V1 MarQS queue or deprecated V1 functions
Files:
apps/webapp/app/components/primitives/DateTime.tsx
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/components/primitives/DateTime.tsx
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
apps/webapp/app/components/primitives/DateTime.tsx
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/components/primitives/DateTime.tsx
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/components/primitives/DateTime.tsx
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/app/components/primitives/DateTime.tsx
apps/**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
When modifying only server components (apps/webapp/, apps/supervisor/, etc.) with no package changes, add a .server-changes/ file instead of a changeset
Files:
apps/webapp/app/components/primitives/DateTime.tsx
🧠 Learnings (2)
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Use `useRun`, `useRealtimeRun` and other SWR/realtime hooks from `trigger.dev/react-hooks` for data fetching
Applied to files:
apps/webapp/app/components/primitives/DateTime.tsx
📚 Learning: 2026-02-11T16:37:32.429Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3019
File: apps/webapp/app/components/primitives/charts/Card.tsx:26-30
Timestamp: 2026-02-11T16:37:32.429Z
Learning: In projects using react-grid-layout, avoid relying on drag-handle class to imply draggability. Ensure drag-handle elements only affect dragging when the parent grid item is configured draggable in the layout; conditionally apply cursor styles based on the draggable prop. This improves correctness and accessibility.
Applied to files:
apps/webapp/app/components/primitives/DateTime.tsx
🔇 Additional comments (1)
apps/webapp/app/components/primitives/DateTime.tsx (1)
379-385: Nice fix for live relative-time updates.The interval refresh with proper cleanup keeps
RelativeDateTimecurrent and avoids the previously frozen output issue.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
apps/webapp/test/runsReplicationBenchmark.README.md (2)
126-201:⚠️ Potential issue | 🟡 MinorFenced code block missing language identifier (duplicate)
The output example code block lacks a language identifier, triggering
MD040. Addingtextwill satisfy markdownlint and improve rendering.📝 Proposed fix
-``` +```text ================================================================================ BENCHMARK: baseline-no-fingerprinting🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/test/runsReplicationBenchmark.README.md` around lines 126 - 201, The fenced code block in the README sample lacks a language identifier, causing MD040; update the opening fence for the large benchmark example (the block that begins with ``` and the line "================================================================================" / "BENCHMARK: baseline-no-fingerprinting") to include a language token such as text (e.g., replace ``` with ```text) so markdownlint passes and rendering is explicit.
15-31:⚠️ Potential issue | 🟡 MinorFenced code block missing language identifier (duplicate)
The ASCII diagram code block lacks a language identifier, triggering
MD040. Addingtextwill satisfy markdownlint and improve rendering.📝 Proposed fix
-``` +```text ┌─────────────────┐ ┌──────────────────────┐ │ Producer │ │ Benchmark Test │🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/test/runsReplicationBenchmark.README.md` around lines 15 - 31, The fenced ASCII diagram block in apps/webapp/test/runsReplicationBenchmark.README.md is missing a language identifier which triggers MD040; update the opening fence(s) for the diagram (the triple-backtick block containing the ASCII art) to include a language identifier such as "text" so the block becomes ```text (and do this for any duplicate diagram blocks), leaving the diagram content unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal-packages/clickhouse/schema/022_create_errors_v1_table.sql`:
- Around line 65-68: The WHERE clause that builds the errors aggregation
currently filters status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED',
'COMPLETED_WITH_ERRORS') and omits two error-related final statuses; update that
filter to also include 'EXPIRED' and 'TIMED_OUT' so all failed TaskRunStatus
values are captured (locate the WHERE block referencing error_fingerprint and
status in the errors table creation SQL and add 'EXPIRED' and 'TIMED_OUT' to the
IN list).
---
Duplicate comments:
In `@apps/webapp/test/runsReplicationBenchmark.README.md`:
- Around line 126-201: The fenced code block in the README sample lacks a
language identifier, causing MD040; update the opening fence for the large
benchmark example (the block that begins with ``` and the line
"================================================================================"
/ "BENCHMARK: baseline-no-fingerprinting") to include a language token such as
text (e.g., replace ``` with ```text) so markdownlint passes and rendering is
explicit.
- Around line 15-31: The fenced ASCII diagram block in
apps/webapp/test/runsReplicationBenchmark.README.md is missing a language
identifier which triggers MD040; update the opening fence(s) for the diagram
(the triple-backtick block containing the ASCII art) to include a language
identifier such as "text" so the block becomes ```text (and do this for any
duplicate diagram blocks), leaving the diagram content unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 910e5089-06e8-4f64-b451-a6ae55d622d3
📒 Files selected for processing (4)
apps/webapp/test/runsReplicationBenchmark.README.mdapps/webapp/test/runsReplicationBenchmark.test.tsinternal-packages/clickhouse/schema/022_create_errors_v1_table.sqlinternal-packages/clickhouse/schema/023_create_error_occurrences_v1.sql
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/webapp/test/runsReplicationBenchmark.test.ts
- internal-packages/clickhouse/schema/023_create_error_occurrences_v1.sql
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: typecheck / typecheck
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (2)
internal-packages/clickhouse/schema/*.sql
📄 CodeRabbit inference engine (internal-packages/clickhouse/CLAUDE.md)
ClickHouse SQL migrations should follow Goose format with Up and Down sections, stored as numbered files in the
schema/directory
Files:
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/test/runsReplicationBenchmark.README.md
🧠 Learnings (9)
📚 Learning: 2026-03-02T12:43:08.166Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Applies to internal-packages/clickhouse/schema/*.sql : ClickHouse SQL migrations should follow Goose format with Up and Down sections, stored as numbered files in the `schema/` directory
Applied to files:
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql
📚 Learning: 2026-03-02T12:43:08.166Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Use `_mv_v1` suffix for ClickHouse materialized views
Applied to files:
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql
📚 Learning: 2026-03-02T12:43:08.166Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/clickhouse/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:08.166Z
Learning: Use `_v1`, `_v2` suffixes for ClickHouse table versioning
Applied to files:
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql
📚 Learning: 2026-03-02T12:43:02.539Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: docs/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:02.539Z
Learning: Applies to docs/**/*.mdx : Use language tags in code fences: `typescript`, `bash`, `json`
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Applies to internal-packages/run-engine/src/engine/tests/**/*.test.ts : Implement tests for RunEngine in `src/engine/tests/` using testcontainers for Redis and PostgreSQL containerization
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
📚 Learning: 2026-03-03T13:07:33.177Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:33.177Z
Learning: In `internal-packages/run-engine/src/batch-queue/tests/index.test.ts`, test assertions for rate limiter stubs can use `toBeGreaterThanOrEqual` rather than exact equality (`toBe`) because the consumer loop may call the rate limiter during empty pops in addition to actual item processing, and this over-calling is acceptable in integration tests.
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
📚 Learning: 2026-03-02T12:42:41.110Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:41.110Z
Learning: Applies to **/*.test.{ts,tsx,js} : Test files using Redis or PostgreSQL should use testcontainers helpers (redisTest, postgresTest, containerTest) instead of mocks
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
📚 Learning: 2026-03-02T12:42:56.114Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/app/v3/services/**/*.server.ts : When editing services that branch on `RunEngineVersion` to support both V1 and V2 (e.g., `cancelTaskRun.server.ts`, `batchTriggerV3.server.ts`), only modify V2 code paths
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
📚 Learning: 2026-01-15T10:48:02.687Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Tests should avoid mocks or stubs and use the helpers from `internal/testcontainers` when Redis or Postgres are needed
Applied to files:
apps/webapp/test/runsReplicationBenchmark.README.md
🪛 LanguageTool
apps/webapp/test/runsReplicationBenchmark.README.md
[uncategorized] ~80-~80: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...icationBenchmark.test.ts --run ### High Volume Benchmark bash cd apps/webapp BENCH...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
🪛 markdownlint-cli2 (0.21.0)
apps/webapp/test/runsReplicationBenchmark.README.md
[warning] 15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 126-126: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (2)
internal-packages/clickhouse/schema/022_create_errors_v1_table.sql (2)
1-36: LGTM! Table structure follows conventions and is well-designed.The table uses appropriate patterns:
- AggregatingMergeTree with proper ORDER BY matching the GROUP BY key
- SimpleAggregateFunction for timestamp bounds (avoids binary state encoding)
- AggregateFunction with proper state functions for counts and samples
- 90-day TTL on
last_seen_datefor data lifecycle managementThe String columns (
error_type,error_message,sample_stack_trace) will have non-deterministic values after part merges, but this is acceptable since they represent sample/representative data from the error group.
76-78: LGTM! Down migration is correct.The migration properly drops the materialized view before the table (correct dependency order) and uses
IF EXISTSfor idempotency.
| WHERE | ||
| error_fingerprint != '' | ||
| AND status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS') | ||
| AND _is_deleted = 0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find status enum/type definitions
echo "=== Searching for status type definitions ==="
rg -n 'enum.*Status|type.*Status|const.*Status' --type=ts --type=js -g '!*test*' -g '!*spec*' | head -100
echo ""
echo "=== Searching for status string literals in task context ==="
rg -n "(SYSTEM_FAILURE|CRASHED|INTERRUPTED|COMPLETED_WITH_ERRORS|FAILURE|FAILED|TIMED_OUT)" --type=ts --type=js | grep -i 'status\|enum' | head -50
echo ""
echo "=== Check the SQL file structure and format ==="
cat -n internal-packages/clickhouse/schema/022_create_errors_v1_table.sql | head -30
echo ""
echo "=== Check for Up/Down sections (Goose format) ==="
grep -n "-- +goose Up\|-- +goose Down" internal-packages/clickhouse/schema/022_create_errors_v1_table.sqlRepository: triggerdotdev/trigger.dev
Length of output: 19301
🏁 Script executed:
#!/bin/bash
# Find and display the TaskRunStatus enum definition
echo "=== TaskRunStatus enum definition ==="
sed -n '22,42p' packages/core/src/v3/schemas/runEngine.ts
echo ""
echo "=== finalStatuses definition ==="
sed -n '44,52p' internal-packages/run-engine/src/engine/statuses.tsRepository: triggerdotdev/trigger.dev
Length of output: 1034
Add EXPIRED and TIMED_OUT to the status filter in the errors table.
The current filter includes SYSTEM_FAILURE, CRASHED, INTERRUPTED, and COMPLETED_WITH_ERRORS, but it's missing EXPIRED and TIMED_OUT, which are defined as error-related final statuses in the TaskRunStatus enum. These should be included to ensure all failed runs are aggregated into the errors table.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal-packages/clickhouse/schema/022_create_errors_v1_table.sql` around
lines 65 - 68, The WHERE clause that builds the errors aggregation currently
filters status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED',
'COMPLETED_WITH_ERRORS') and omits two error-related final statuses; update that
filter to also include 'EXPIRED' and 'TIMED_OUT' so all failed TaskRunStatus
values are captured (locate the WHERE block referencing error_fingerprint and
status in the errors table creation SQL and add 'EXPIRED' and 'TIMED_OUT' to the
IN list).
There was a problem hiding this comment.
🚩 Missing .server-changes file for server-only changes
Per CONTRIBUTING.md and CLAUDE.md, PRs that change server components (apps/webapp/) without package changes should include a .server-changes/ file for release notes. This PR adds significant new server functionality (error fingerprinting, new routes, ClickHouse schema) but I don't see a corresponding .server-changes/ markdown file. There are also published package changes (new ErrorId in packages/core), so a changeset may be more appropriate — but neither appears to be present.
Was this helpful? React with 👍 or 👎 to provide feedback.
...pp.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
Outdated
Show resolved
Hide resolved
| const errorFingerprint = ( | ||
| !this._disableErrorFingerprinting && | ||
| ['SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS'].includes(run.status) | ||
| ) | ||
| ? calculateErrorFingerprint(run.error) | ||
| : ''; |
There was a problem hiding this comment.
🚩 Error fingerprinting only covers 4 of the failure statuses
The fingerprinting logic at apps/webapp/app/services/runsReplicationService.server.ts:864 and the ClickHouse materialized views (022_create_errors_v1_table.sql:67, 023_create_error_occurrences_v1.sql:69-73) both use the same 4 statuses: SYSTEM_FAILURE, CRASHED, INTERRUPTED, COMPLETED_WITH_ERRORS. Other potentially error-carrying statuses like TIMED_OUT or EXPIRED are excluded. This appears intentional but worth confirming — if those statuses can also carry error data, they'd silently produce no fingerprint and be invisible in the errors page.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
🟡 Missing errorId in hasFilters computation in NextRunListPresenter
When errorId is provided as a filter, the hasFilters flag in NextRunListPresenter.server.ts is not set to true because errorId is not included in the condition. Every other filter parameter (tasks, versions, statuses, tags, bulkId, scheduleId, batchId, runId, queues, machines) is checked, but errorId is omitted. This means when navigating to the runs page with an errorId filter (e.g., via the "View all runs" link from the error detail page at apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx:346), the server reports hasFilters: false. This causes the empty-state UI to incorrectly show "No runs yet" instead of "No runs matching filters" if the query returns zero results.
(Refers to lines 89-103)
Was this helpful? React with 👍 or 👎 to provide feedback.
| CREATE MATERIALIZED VIEW trigger_dev.errors_mv_v1 | ||
| TO trigger_dev.errors_v1 | ||
| AS | ||
| SELECT | ||
| organization_id, | ||
| project_id, | ||
| environment_id, | ||
| task_identifier, | ||
| error_fingerprint, | ||
|
|
||
| any(coalesce(nullIf(toString(error.data.type), ''), nullIf(toString(error.data.name), ''), 'Error')) as error_type, | ||
| any(coalesce(nullIf(substring(toString(error.data.message), 1, 500), ''), 'Unknown error')) as error_message, | ||
| any(coalesce(substring(toString(error.data.stack), 1, 2000), '')) as sample_stack_trace, | ||
|
|
||
| toDateTime(max(created_at)) as last_seen_date, | ||
|
|
||
| min(created_at) as first_seen, | ||
| max(created_at) as last_seen, | ||
| sumState(toUInt64(1)) as occurrence_count, | ||
| uniqState(task_version) as affected_task_versions, | ||
|
|
||
| anyState(run_id) as sample_run_id, | ||
| anyState(friendly_id) as sample_friendly_id, | ||
|
|
||
| sumMapState([status], [toUInt64(1)]) as status_distribution | ||
| FROM trigger_dev.task_runs_v2 | ||
| WHERE | ||
| error_fingerprint != '' | ||
| AND status IN ('SYSTEM_FAILURE', 'CRASHED', 'INTERRUPTED', 'COMPLETED_WITH_ERRORS') | ||
| AND _is_deleted = 0 | ||
| GROUP BY | ||
| organization_id, | ||
| project_id, | ||
| environment_id, | ||
| task_identifier, | ||
| error_fingerprint; |
There was a problem hiding this comment.
🚩 Materialized views use INSERT-time filtering which won't backfill historical data
The materialized views errors_mv_v1 (schema/022_create_errors_v1_table.sql:39) and error_occurrences_mv_v1 (schema/023_create_error_occurrences_v1.sql:33) only trigger on new inserts to task_runs_v2. Existing runs that were inserted before the migration will not have error_fingerprint populated (it defaults to ''), and the MV WHERE clause error_fingerprint != '' ensures they're excluded. This means the errors page will only show errors from runs replicated after the migration. This is likely intentional for a new feature, but worth noting there's no backfill mechanism.
Was this helpful? React with 👍 or 👎 to provide feedback.
A top-level Errors page that aggregates errors from failed runs with occurrences metrics.
feature-errors-mvp-compressed-louder.mp4
Errors are “fingerprinted” so similar errors are grouped together (e.g. has an ID in the error message).
You can view an individual error to view a timeline of when it fired, the runs, and bulk replay them.