
fix: parse_inline drops links at position 0 and marks() loses href #36

Open
ASRagab wants to merge 4 commits into ma2za:main from ASRagab:fix/parse-inline-link-bugs


@ASRagab ASRagab commented Mar 3, 2026

Summary

This PR fixes two bugs in inline link handling: every link created via from_markdown() ends up with a null href, and links at the start of a line are silently dropped.

Bug 1: Links at position 0 are dropped

parse_inline() skips links that start at position 0 in the text. The image-exclusion check:

if match.start() > 0 and text[match.start()-1:match.start()+1] != "![":

This short-circuits to False when match.start() == 0, so the link is skipped entirely.

Fix: Check match.start() == 0 or text[match.start() - 1] != "!" instead.

Repro:

parse_inline('[GPT](https://openai.com/)')
# Before: [{'content': '[GPT](https://openai.com/)'}]  # raw text, not a link
# After:  [{'content': 'GPT', 'marks': [{'type': 'link', 'attrs': {'href': 'https://openai.com/'}}]}]
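
The difference between the two conditions can be sketched in isolation. Only the two boolean expressions come from this PR; the link regex and the helper names (`LINK_RE`, `keep_old`, `keep_new`, `kept_links`) are assumptions made for illustration and may not match the actual parse_inline() implementation.

```python
import re

# Hypothetical link pattern for illustration; the real parse_inline() regex may differ.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def keep_old(text, match):
    """Original check: always False when the match starts at index 0."""
    start = match.start()
    return start > 0 and text[start - 1 : start + 1] != "!["


def keep_new(text, match):
    """Fixed check: keep position-0 matches, still reject image syntax '![...]'."""
    start = match.start()
    return start == 0 or text[start - 1] != "!"


def kept_links(text, keep):
    """Return (label, href) pairs for the matches that pass the given check."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(text) if keep(text, m)]


if __name__ == "__main__":
    sample = "[GPT](https://openai.com/)"
    print(kept_links(sample, keep_old))
    print(kept_links(sample, keep_new))
```

For any match that starts after index 0, the two checks agree, because the character at match.start() is always "[". They differ only at index 0, which is the case the fix targets.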

Bug 2: All link hrefs are null

parse_inline() returns marks with the structure {'type': 'link', 'attrs': {'href': url}}, but marks() reads from mark.get('href') at the top level — missing the nested attrs entirely.

Fix: Fall back to mark.get('attrs', {}).get('href') when top-level href is not present. This preserves backward compatibility with the top-level format shown in the README examples.

Repro:

post = Post('Test', '', user_id=1)
post.from_markdown('Visit [Example](https://example.com)')
body = json.loads(post.get_draft()['draft_body'])
# Before: href is null for all links
# After:  href is 'https://example.com'
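
As a minimal sketch of the fallback described above, the lookup can be written as a standalone helper. The name `resolve_href` is hypothetical and is not part of the library; the actual change lives inside marks(), whose surrounding code is not shown here.

```python
def resolve_href(mark):
    """Return the link target from either mark layout, or None if absent.

    Hypothetical helper for illustration only.
    """
    # Top-level format, as used in the README examples.
    href = mark.get("href")
    if href is None:
        # Nested format produced by parse_inline(): {'attrs': {'href': ...}}.
        href = mark.get("attrs", {}).get("href")
    return href


if __name__ == "__main__":
    nested = {"type": "link", "attrs": {"href": "https://example.com"}}
    top_level = {"type": "link", "href": "https://example.com"}
    print(resolve_href(nested))
    print(resolve_href(top_level))
```

Checking the top-level key first keeps existing callers that pass href directly working unchanged.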

Tests

Added tests/substack/test_post.py with 6 tests covering:

  • Links at position 0
  • Multiple links on same line
  • Image exclusion (not parsed as link)
  • Links in middle of text
  • href from attrs (from_markdown path)
  • href from top-level (manual add path)

All tests pass. Pre-commit hooks (black, isort, trailing-whitespace, end-of-file-fixer) all pass.

cellarius added 2 commits March 3, 2026 02:59
Two bugs in inline link handling:

1. parse_inline(): Links at the start of a line (position 0) were silently
   dropped. The image-exclusion check `match.start() > 0 and ...` would
   short-circuit to False when the link started at position 0, causing it
   to be skipped entirely. Fixed by checking `match.start() == 0 or
   text[match.start() - 1] != '!'` instead.

2. marks(): All link hrefs were set to null. parse_inline() returns marks
   with the structure `{'type': 'link', 'attrs': {'href': url}}`, but
   marks() only checked `mark.get('href')` at the top level. Added
   fallback to `mark.get('attrs', {}).get('href')` to handle both
   the attrs-nested format from parse_inline and the top-level format
   used in the README examples.

Added tests covering both bugs plus regression tests for image exclusion.
Owner

ma2za commented Mar 3, 2026

hi, thank you for the contribution... can you remove all the formatting changes please? they make it hard to review

Author

ASRagab commented Mar 3, 2026

Done. I reverted the formatting-only changes, so the diff is now just the two functional fixes (position-0 links and link href in marks()) plus the new regression tests. Sorry about the noise earlier (the diff still shows some hidden whitespace being removed).


2 participants