Skip to content

refactor(message-parser): simplify Any fallback to reduce URL/phone parse attempts#39294

Open
Shreyas2004wagh wants to merge 1 commit intoRocketChat:developfrom
Shreyas2004wagh:perf/message-parser-any-fallback-url-phone
Open

refactor(message-parser): simplify Any fallback to reduce URL/phone parse attempts#39294
Shreyas2004wagh wants to merge 1 commit intoRocketChat:developfrom
Shreyas2004wagh:perf/message-parser-any-fallback-url-phone

Conversation

@Shreyas2004wagh
Copy link
Contributor

@Shreyas2004wagh Shreyas2004wagh commented Mar 3, 2026

Summary

  • Optimize Any fallback in grammar by gating URL/phone attempts with cheap prefix checks.
  • Only try fallback phone parsing when the next character is +.
  • Only try fallback URL parsing when scheme/host prefixes are present.

Problem

Any fallback attempted optional phone/URL parsing on fallback characters, which can add unnecessary parsing work on plain-text-heavy inputs.

Changes

  • Replaced Any optional calls to AutolinkedPhone/URL with guarded AnyPhone/AnyURL.
  • Added AnyURLPrefix rules for scheme/host hints before invoking full URL.

Verification

  • Parser build succeeded locally.
  • Spot-checked parser outputs on representative URL/plain-text inputs to confirm behavior is unchanged.
  • Local micro-benchmark on plain-text-heavy input showed reduced parse time.

Fix #39295

Summary by CodeRabbit

Release Notes

  • New Features
    • Enhanced URL and phone number detection in messages, now supporting a broader range of formats and patterns for improved automatic linking.

@dionisio-bot
Copy link
Contributor

dionisio-bot bot commented Mar 3, 2026

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label
  • This PR is missing the required milestone or project

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

@changeset-bot
Copy link

changeset-bot bot commented Mar 3, 2026

⚠️ No Changeset found

Latest commit: f940e8d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 3, 2026

Walkthrough

The grammar file is refactored to replace specific URL and phone detection rules with more generic composite rules (AnyPhone, AnyURL, and related prefixes), broadening the pattern matching for autolinked content while preserving surrounding text handling logic.

Changes

Cohort / File(s) Summary
Grammar Rule Refactoring
packages/message-parser/src/grammar.pegjs
Replaces specific URL/phone extraction path with generic compositions. Introduces new composite rules: AnyPhone, AnyURL, AnyURLPrefix, AnyURLSchemePrefix, and AnyURLHostPrefix. Expands URL and phone number detection surface without altering surrounding text logic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

type: chore

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main optimization: simplifying the Any fallback rule by adding prefix checks to reduce unnecessary URL/phone parsing attempts in the message parser.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Shreyas2004wagh Shreyas2004wagh changed the title perf(message-parser): simplify Any fallback to reduce URL/phone parse attempts refactor(message-parser): simplify Any fallback to reduce URL/phone parse attempts Mar 3, 2026
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/message-parser/src/grammar.pegjs (1)

809-809: Use Digit |1..3| instead of Digits |1..3| for clearer intent and better performance

On line 809, Digits |1..3| is less precise because Digits is defined as $Digit+ (a greedy match of one or more digits). When applied with the quantifier, the first invocation consumes all consecutive digits, making the repetition bound ambiguous. Replacing with Digit |1..3| explicitly matches 1–3 individual digit characters, eliminating backtracking ambiguity and improving parser selectivity. This pattern is consistent with other uses in the grammar (e.g., Digit |2|, Digit |4|).

Proposed change
-  / Digits |1..3| "." Digits |1..3| "."
+  / Digit |1..3| "." Digit |1..3| "."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/src/grammar.pegjs` at line 809, Replace the use of
the `Digits |1..3|` quantified symbol with `Digit |1..3|` to avoid greedy
consumption by the `Digits` rule; locate the rule containing the sequence `/
Digits |1..3| "." Digits |1..3| "."` in grammar.pegjs and change both
occurrences to `Digit |1..3|` so each quantifier applies to individual digit
characters (referencing the `Digit` and `Digits` rules) to improve selectivity
and eliminate ambiguous backtracking.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/message-parser/src/grammar.pegjs`:
- Line 809: Replace the use of the `Digits |1..3|` quantified symbol with `Digit
|1..3|` to avoid greedy consumption by the `Digits` rule; locate the rule
containing the sequence `/ Digits |1..3| "." Digits |1..3| "."` in grammar.pegjs
and change both occurrences to `Digit |1..3|` so each quantifier applies to
individual digit characters (referencing the `Digit` and `Digits` rules) to
improve selectivity and eliminate ambiguous backtracking.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7ff7b2 and f940e8d.

📒 Files selected for processing (1)
  • packages/message-parser/src/grammar.pegjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: cubic · AI code reviewer
🔇 Additional comments (1)
packages/message-parser/src/grammar.pegjs (1)

797-805: Good perf-oriented guardrails in fallback parsing

The new AnyPhone/AnyURL prefix-gated fallback on Lines 797-805 is a solid optimization and keeps expensive URL/phone attempts off plain-text-heavy paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

chore(message-parser): tracking issue for parser correctness, performance, and test hardening

2 participants