
fix(tracing): nest LangChain spans under Langflow component spans in Phoenix#12008

Open
0xNekr wants to merge 1 commit into langflow-ai:main from 0xNekr:fix/phoenix-span-nesting

Conversation


@0xNekr 0xNekr commented Mar 3, 2026

Summary

LangChain auto-instrumented spans (created by LangChainInstrumentor) appear as separate top-level traces in Phoenix/Arize instead of being nested under the Langflow component span that triggered them.

This happens because add_trace creates a child span via the propagator context but doesn't attach it to the active OpenTelemetry context. When LangChain runs (e.g. an OpenAI call inside a component), the instrumentor sees no active parent span and creates a new root trace.
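OpenTelemetry's runtime context is built on Python's `contextvars`, which makes the failure mode easy to demonstrate with a stdlib-only sketch. `current_span` and `instrumented_call` below are illustrative stand-ins for OTel's runtime context and for an auto-instrumentor's parent lookup, not the real OTel or Langflow APIs:

```python
# Stdlib-only illustration of the bug and the fix. A span that is
# created but never attached to the runtime context is invisible to
# downstream instrumentation, which then starts a new root trace.
import contextvars

current_span = contextvars.ContextVar("current_span", default=None)

def instrumented_call():
    # Auto-instrumentation looks up the active parent here; if nothing
    # was attached, it sees None and creates a new root trace.
    return current_span.get()

# Without attaching: the instrumentor sees no parent (the bug).
assert instrumented_call() is None

# With attach/detach around component execution (the fix):
token = current_span.set("component-span")
try:
    assert instrumented_call() == "component-span"  # nested correctly
finally:
    current_span.reset(token)  # restore the previous context

assert instrumented_call() is None
```

OTel's `context.attach()`/`context.detach(token)` pair follows the same token discipline as `ContextVar.set()`/`ContextVar.reset()`.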

Before

The Traces tab in Phoenix shows N+1 entries per flow execution: one root "Langflow" span plus separate disconnected LangChain spans (OpenAI, RunnableSequence, etc.).

After

The Traces tab shows 1 entry per flow execution. All LangChain-instrumented spans are properly nested under the Langflow component span that triggered them.

Changes

In arize_phoenix.py:

  1. __init__: Added _context_tokens dict to track OTel context tokens per trace
  2. add_trace: After creating the child span, attach it to the OTel context so LangChain instrumentor picks it up as the parent
  3. end_trace: Detach from OTel context before ending the span to restore the previous context

Test plan

  • Deploy Langflow with Phoenix tracing enabled and LangChainInstrumentor active
  • Execute a flow that includes LLM calls (e.g. OpenAI via LangChain)
  • Open Phoenix Traces tab and verify:
    • Only one root trace per flow execution (not multiple disconnected traces)
    • LangChain spans (OpenAI, RunnableSequence, etc.) appear nested under the component span
    • Spans tab still shows all individual spans with correct timing

Summary by CodeRabbit

  • Chores
    • Enhanced internal tracing infrastructure to improve observability and reliability of trace management.

…Phoenix

LangChain auto-instrumented spans (from LangChainInstrumentor) appear as
separate top-level traces in Phoenix/Arize because they don't inherit the
OpenTelemetry parent context from the Langflow component span.

This attaches each child span to the OTel context during component
execution (add_trace) and detaches it when the component finishes
(end_trace). LangChain-instrumented spans (e.g. OpenAI calls,
RunnableSequence) automatically become children of the active Langflow
component span.

Before: Traces tab shows N+1 entries per flow execution (one root
Langflow span + separate LangChain spans).
After: Traces tab shows 1 entry per flow execution with all spans
properly nested under the root trace.
@github-actions github-actions bot added the community Pull Request from an external contributor label Mar 3, 2026

coderabbitai bot commented Mar 3, 2026

Walkthrough

OTEL context token management is added to the tracing service. The implementation introduces internal state tracking to store context tokens per trace ID and manages their attachment/detachment during trace lifecycle operations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| OTEL Context Token Management (`src/backend/base/langflow/services/tracing/arize_phoenix.py`) | Introduces `_context_tokens` dictionary for tracking OTEL context tokens per trace ID. Adds token storage during span creation and retrieval/detachment during trace end operations. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Implementations | ❌ Error | PR modifies `arize_phoenix.py` to add OpenTelemetry context tracking but includes no test coverage for the new `_context_tokens` functionality or context attach/detach operations. | Add `test_arize_phoenix.py` covering context token storage, OTel context attachment after `add_trace`, context detachment during `end_trace`, and span nesting verification. |
| Test Quality And Coverage | ⚠️ Warning | PR introduces critical OpenTelemetry context token management and span nesting in `ArizePhoenixTracer` but provides no dedicated unit tests; existing tests use mocks that don't exercise the new functionality. | Add comprehensive unit tests for `ArizePhoenixTracer` in `test_arize_phoenix.py` covering context token tracking, OTel attachment/detachment, exception handling, and concurrent trace scenarios. |
| Test File Naming And Structure | ⚠️ Warning | Pull request lacks test coverage for new OTEL context management functionality in `arize_phoenix.py`, violating established repository testing patterns. | Create test file `test_arize_phoenix.py` with tests for context attachment/detachment, error handling, edge cases, and implement try/finally pattern for detach calls. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately summarizes the main change: fixing span nesting in Phoenix tracing by ensuring LangChain spans appear under Langflow component spans, which directly matches the PR's core objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Excessive Mock Usage Warning | ✅ Passed | This custom check on excessive mock usage in tests is not applicable to this PR because no test files were modified or created in these changes. |

@github-actions github-actions bot added bug Something isn't working and removed bug Something isn't working labels Mar 3, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backend/base/langflow/services/tracing/arize_phoenix.py (1)

291-313: ⚠️ Potential issue | 🟠 Major

Move detach/end cleanup into a finally block to avoid context/span leaks on mid-method exceptions.

If serialization or set_attribute fails before Line 305, cleanup is skipped and the active OTel context can remain attached.

💡 Proposed hardening diff
@@
-        processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
-        if processed_outputs:
-            child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
-            child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
-
-        logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
-        processed_logs = (
-            self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
-        )
-        if processed_logs:
-            child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
-
-        self._set_span_status(child_span, error)
-
-        # Detach from OTel context before ending the span
-        token = self._context_tokens.pop(trace_id, None)
-        if token is not None:
-            from opentelemetry import context as otel_context
-
-            otel_context.detach(token)
-
-        child_span.end(end_time=self._get_current_timestamp())
-        self.child_spans.pop(trace_id)
+        from opentelemetry import context as otel_context
+        try:
+            processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
+            if processed_outputs:
+                child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
+                child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
+
+            logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
+            processed_logs = (
+                self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
+            )
+            if processed_logs:
+                child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
+
+            self._set_span_status(child_span, error)
+        finally:
+            token = self._context_tokens.pop(trace_id, None)
+            if token is not None:
+                otel_context.detach(token)
+            child_span.end(end_time=self._get_current_timestamp())
+            self.child_spans.pop(trace_id, None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/base/langflow/services/tracing/arize_phoenix.py` around lines 291
- 313, The cleanup that detaches OTel context and ends/removes child spans must
run even if attribute-setting or serialization fails: wrap the block that sets
attributes (calls to _convert_to_arize_phoenix_types, _safe_json_dumps,
child_span.set_attribute and _set_span_status) in a try/finally so the finally
always runs the detach (use self._context_tokens.pop(trace_id) and call
otel_context.detach(token) if token), child_span.end(end_time=...) and
self.child_spans.pop(trace_id). Ensure any imports (from opentelemetry import
context as otel_context) remain accessible in the finally and avoid swallowing
the original exception (re-raise after cleanup).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/base/langflow/services/tracing/arize_phoenix.py`:
- Around line 267-273: The context detach currently can be skipped or raise if
exceptions occur; modify end_trace() so that the span status/attribute setting
(calls to _set_span_status, any output/log/status operations) are executed
inside a try block and the context token detach and span cleanup happen in a
finally: pop the token from self._context_tokens using trace_id, call
otel_context.detach(token) only if token is not None (handle possible ValueError
by swallowing/ignoring it), then call
child_span.end(end_time=self._get_current_timestamp()) and remove the span from
self.child_spans; this ensures attach() made in add_trace() is always detached
even on errors.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fc6ca8 and aaf1ebb.

📒 Files selected for processing (1)
  • src/backend/base/langflow/services/tracing/arize_phoenix.py

Comment on lines +267 to +273
# Attach child span to OTel context so LangChain auto-instrumented
# spans (via LangChainInstrumentor) become children of this span
from opentelemetry import context as otel_context
from opentelemetry.trace import set_span_in_context

self._context_tokens[trace_id] = otel_context.attach(set_span_in_context(child_span))


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

head -n 320 src/backend/base/langflow/services/tracing/arize_phoenix.py | tail -n +100

Repository: langflow-ai/langflow

Length of output: 9260


🏁 Script executed:

sed -n '260,320p' src/backend/base/langflow/services/tracing/arize_phoenix.py

Repository: langflow-ai/langflow

Length of output: 2442


🏁 Script executed:

sed -n '100,120p' src/backend/base/langflow/services/tracing/arize_phoenix.py

Repository: langflow-ai/langflow

Length of output: 1254


🌐 Web query:

OpenTelemetry Python contextvars attach detach token execution context async

💡 Result:

In OpenTelemetry Python, context propagation (across sync code and async/await) is implemented via a runtime context (typically contextvars), and the core API is:

  • opentelemetry.context.attach(ctx) -> Token: makes ctx the current context for the caller’s current execution unit (thread / async task) and returns a token representing the previous context. [1]
  • opentelemetry.context.detach(token): restores the context that was current before the matching attach(). Always pair it with attach() (use try/finally). [1][2]

Typical safe pattern (including async):

from opentelemetry import context, baggage

async def handler():
    ctx = baggage.set_baggage("user.id", "123")
    token = context.attach(ctx)
    try:
        await do_work()          # context preserved across async/await
        await do_more_work()
    finally:
        context.detach(token)    # restore previous context

If you don’t detach() after modifying/attaching context, you can get errors due to unmatched attach/detach (the SDK can detect leaked contexts). [3]

Sources: OpenTelemetry Python context API docs [1], OpenTelemetry Python context attach/detach guidance [2], example + warning about failing to detach [3].

[1] (opentelemetry-python.readthedocs.io)
[2] (deepwiki.com)
[3] (docs.honeycomb.io)



🌐 Web query:

OpenTelemetry Python context.attach context.detach same thread async task

💡 Result:

In OpenTelemetry Python, context.attach(ctx)/context.detach(token) are implemented on top of Python contextvars, so the returned token is only valid in the same “current Context” (i.e., the same logical execution unit) where it was created. If you detach() in a different contextvars context, Python raises ValueError: Token was created in a different Context (or similar). [1][2]

What “same thread” means with asyncio

Even if everything runs on the same OS thread, different asyncio.Tasks can have different contextvars contexts, and the token must be detached in the same task / same context that did the attach. OpenTelemetry describes the scope as “same thread or async task.” [3]

Common ways this breaks

  • You attach() in one coroutine/task, but detach() runs:
    • in a different task (e.g., via create_task, callbacks, background tasks), or
    • during async-generator cleanup/GeneratorExit where finalization happens under a different context. This often shows up exactly as the “created in a different Context” error in streaming/generator scenarios. [4]

Safe pattern

Do detach(token) in a finally in the same coroutine call path that did attach():

from opentelemetry import context

token = context.attach(new_ctx)
try:
    await do_work()
finally:
    context.detach(token)

If you need the same context in a new task, propagate the context value into the task at creation time (rather than moving the token across tasks), and use task-local attach/detach there.

Sources
[1] OpenTelemetry Python Context API (attach/detach) (opentelemetry-python.readthedocs.io)
[2] PEP 567: Token.reset() raises ValueError if used in a different Context (peps.python.org)
[3] OpenTelemetry Python context behavior “same thread or async task” (deepwiki.com)
[4] Example async streaming/generator teardown causing “different Context” token detach error (github.com)



Wrap token detach in try/finally to prevent context leaks on exceptions.

OpenTelemetry's attach()/detach() pattern requires detach to happen even if attribute-setting operations fail. Currently, exceptions during the output/log/status attribute operations (lines 306–314) skip the detach, leaving the context attached and causing resource leaks. Additionally, if add_trace() and end_trace() execute in different async tasks or contexts, the detach will raise ValueError because the token is context-scoped. Wrap the detach in a try/finally block within end_trace():

Suggested fix
try:
    self._set_span_status(child_span, error)
finally:
    token = self._context_tokens.pop(trace_id, None)
    if token is not None:
        from opentelemetry import context as otel_context
        otel_context.detach(token)
    child_span.end(end_time=self._get_current_timestamp())
    self.child_spans.pop(trace_id)

Also applies to: 305–311

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/base/langflow/services/tracing/arize_phoenix.py` around lines 267
- 273, The context detach currently can be skipped or raise if exceptions occur;
modify end_trace() so that the span status/attribute setting (calls to
_set_span_status, any output/log/status operations) are executed inside a try
block and the context token detach and span cleanup happen in a finally: pop the
token from self._context_tokens using trace_id, call otel_context.detach(token)
only if token is not None (handle possible ValueError by swallowing/ignoring
it), then call child_span.end(end_time=self._get_current_timestamp()) and remove
the span from self.child_spans; this ensures attach() made in add_trace() is
always detached even on errors.


Labels

bug Something isn't working community Pull Request from an external contributor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant