
fix(tracing): nest LangChain spans under Langflow component spans in Phoenix#12008

Open
0xNekr wants to merge 1 commit into langflow-ai:main from 0xNekr:fix/phoenix-span-nesting

Conversation


@0xNekr 0xNekr commented Mar 3, 2026

Summary

LangChain auto-instrumented spans (created by LangChainInstrumentor) appear as separate top-level traces in Phoenix/Arize instead of being nested under the Langflow component span that triggered them.

This happens because add_trace creates a child span via the propagator context but doesn't attach it to the active OpenTelemetry context. When LangChain runs (e.g. an OpenAI call inside a component), the instrumentor sees no active parent span and creates a new root trace.
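OpenTelemetry's runtime context is built on Python's `contextvars`, which makes the failure mode easy to demonstrate with a stdlib-only sketch. `current_span` and `instrumented_call` below are illustrative stand-ins for OTel's runtime context and for an auto-instrumentor's parent lookup, not the real OTel or Langflow APIs:

```python
# Stdlib-only illustration of the bug and the fix. A span that is
# created but never attached to the runtime context is invisible to
# downstream instrumentation, which then starts a new root trace.
import contextvars

current_span = contextvars.ContextVar("current_span", default=None)

def instrumented_call():
    # Auto-instrumentation looks up the active parent here; if nothing
    # was attached, it sees None and creates a new root trace.
    return current_span.get()

# Without attaching: the instrumentor sees no parent (the bug).
assert instrumented_call() is None

# With attach/detach around component execution (the fix):
token = current_span.set("component-span")
try:
    assert instrumented_call() == "component-span"  # nested correctly
finally:
    current_span.reset(token)  # restore the previous context

assert instrumented_call() is None
```

OTel's `context.attach()`/`context.detach(token)` pair follows the same token discipline as `ContextVar.set()`/`ContextVar.reset()`.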

Before

The Traces tab in Phoenix shows N+1 entries per flow execution: one root "Langflow" span plus separate disconnected LangChain spans (OpenAI, RunnableSequence, etc.).

After

The Traces tab shows 1 entry per flow execution. All LangChain-instrumented spans are properly nested under the Langflow component span that triggered them.

Changes

In arize_phoenix.py:

  1. __init__: Added _context_tokens dict to track OTel context tokens per trace
  2. add_trace: After creating the child span, attach it to the OTel context so LangChain instrumentor picks it up as the parent
  3. end_trace: Detach from OTel context before ending the span to restore the previous context

Test plan

  • Deploy Langflow with Phoenix tracing enabled and LangChainInstrumentor active
  • Execute a flow that includes LLM calls (e.g. OpenAI via LangChain)
  • Open Phoenix Traces tab and verify:
    • Only one root trace per flow execution (not multiple disconnected traces)
    • LangChain spans (OpenAI, RunnableSequence, etc.) appear nested under the component span
    • Spans tab still shows all individual spans with correct timing

Summary by CodeRabbit

  • Chores
    • Enhanced internal tracing infrastructure to improve observability and reliability of trace management.

…Phoenix

LangChain auto-instrumented spans (from LangChainInstrumentor) appear as
separate top-level traces in Phoenix/Arize because they don't inherit the
OpenTelemetry parent context from the Langflow component span.

This attaches each child span to the OTel context during component
execution (add_trace) and detaches it when the component finishes
(end_trace). LangChain-instrumented spans (e.g. OpenAI calls,
RunnableSequence) automatically become children of the active Langflow
component span.

Before: Traces tab shows N+1 entries per flow execution (one root
Langflow span + separate LangChain spans).
After: Traces tab shows 1 entry per flow execution with all spans
properly nested under the root trace.
@github-actions github-actions bot added the community Pull Request from an external contributor label Mar 3, 2026

coderabbitai bot commented Mar 3, 2026

Walkthrough

OTEL context token management is added to the tracing service. The implementation introduces internal state tracking to store context tokens per trace ID and manages their attachment/detachment during trace lifecycle operations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| OTEL Context Token Management (`src/backend/base/langflow/services/tracing/arize_phoenix.py`) | Introduces `_context_tokens` dictionary for tracking OTEL context tokens per trace ID. Adds token storage during span creation and retrieval/detachment during trace end operations. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Implementations | ❌ Error | PR modifies `arize_phoenix.py` to add OpenTelemetry context tracking but includes no test coverage for the new `_context_tokens` functionality or context attach/detach operations. | Add `test_arize_phoenix.py` covering context token storage, OTel context attachment after `add_trace`, context detachment during `end_trace`, and span nesting verification. |
| Test Quality And Coverage | ⚠️ Warning | PR introduces critical OpenTelemetry context token management and span nesting in `ArizePhoenixTracer` but provides no dedicated unit tests; existing tests use mocks that don't exercise the new functionality. | Add comprehensive unit tests for `ArizePhoenixTracer` in `test_arize_phoenix.py` covering context token tracking, OTel attachment/detachment, exception handling, and concurrent trace scenarios. |
| Test File Naming And Structure | ⚠️ Warning | Pull request lacks test coverage for new OTEL context management functionality in `arize_phoenix.py`, violating established repository testing patterns. | Create test file `test_arize_phoenix.py` with tests for context attachment/detachment, error handling, edge cases, and implement try/finally pattern for detach calls. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately summarizes the main change: fixing span nesting in Phoenix tracing by ensuring LangChain spans appear under Langflow component spans, which directly matches the PR's core objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Excessive Mock Usage Warning | ✅ Passed | This custom check on excessive mock usage in tests is not applicable to this PR because no test files were modified or created in these changes. |

@github-actions github-actions bot added bug Something isn't working and removed bug Something isn't working labels Mar 3, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backend/base/langflow/services/tracing/arize_phoenix.py (1)

291-313: ⚠️ Potential issue | 🟠 Major

Move detach/end cleanup into a finally block to avoid context/span leaks on mid-method exceptions.

If serialization or set_attribute fails before Line 305, cleanup is skipped and the active OTel context can remain attached.

💡 Proposed hardening diff
@@
-        processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
-        if processed_outputs:
-            child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
-            child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
-
-        logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
-        processed_logs = (
-            self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
-        )
-        if processed_logs:
-            child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
-
-        self._set_span_status(child_span, error)
-
-        # Detach from OTel context before ending the span
-        token = self._context_tokens.pop(trace_id, None)
-        if token is not None:
-            from opentelemetry import context as otel_context
-
-            otel_context.detach(token)
-
-        child_span.end(end_time=self._get_current_timestamp())
-        self.child_spans.pop(trace_id)
+        from opentelemetry import context as otel_context
+        try:
+            processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
+            if processed_outputs:
+                child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
+                child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
+
+            logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
+            processed_logs = (
+                self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
+            )
+            if processed_logs:
+                child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
+
+            self._set_span_status(child_span, error)
+        finally:
+            token = self._context_tokens.pop(trace_id, None)
+            if token is not None:
+                otel_context.detach(token)
+            child_span.end(end_time=self._get_current_timestamp())
+            self.child_spans.pop(trace_id, None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/base/langflow/services/tracing/arize_phoenix.py` around lines 291
- 313, The cleanup that detaches OTel context and ends/removes child spans must
run even if attribute-setting or serialization fails: wrap the block that sets
attributes (calls to _convert_to_arize_phoenix_types, _safe_json_dumps,
child_span.set_attribute and _set_span_status) in a try/finally so the finally
always runs the detach (use self._context_tokens.pop(trace_id) and call
otel_context.detach(token) if token), child_span.end(end_time=...) and
self.child_spans.pop(trace_id). Ensure any imports (from opentelemetry import
context as otel_context) remain accessible in the finally and avoid swallowing
the original exception (re-raise after cleanup).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/base/langflow/services/tracing/arize_phoenix.py`:
- Around line 267-273: The context detach currently can be skipped or raise if
exceptions occur; modify end_trace() so that the span status/attribute setting
(calls to _set_span_status, any output/log/status operations) are executed
inside a try block and the context token detach and span cleanup happen in a
finally: pop the token from self._context_tokens using trace_id, call
otel_context.detach(token) only if token is not None (handle possible ValueError
by swallowing/ignoring it), then call
child_span.end(end_time=self._get_current_timestamp()) and remove the span from
self.child_spans; this ensures attach() made in add_trace() is always detached
even on errors.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fc6ca8 and aaf1ebb.

📒 Files selected for processing (1)
  • src/backend/base/langflow/services/tracing/arize_phoenix.py

Comment on lines +267 to +273
# Attach child span to OTel context so LangChain auto-instrumented
# spans (via LangChainInstrumentor) become children of this span
from opentelemetry import context as otel_context
from opentelemetry.trace import set_span_in_context

self._context_tokens[trace_id] = otel_context.attach(set_span_in_context(child_span))


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

head -n 320 src/backend/base/langflow/services/tracing/arize_phoenix.py | tail -n +100

Repository: langflow-ai/langflow

Length of output: 9260


🏁 Script executed:

sed -n '260,320p' src/backend/base/langflow/services/tracing/arize_phoenix.py

Repository: langflow-ai/langflow

Length of output: 2442


🏁 Script executed:

sed -n '100,120p' src/backend/base/langflow/services/tracing/arize_phoenix.py

Repository: langflow-ai/langflow

Length of output: 1254


🌐 Web query:

OpenTelemetry Python contextvars attach detach token execution context async

💡 Result:

In OpenTelemetry Python, context propagation (across sync code and async/await) is implemented via a runtime context (typically contextvars), and the core API is:

  • opentelemetry.context.attach(ctx) -> Token: makes ctx the current context for the caller’s current execution unit (thread / async task) and returns a token representing the previous context. [1]
  • opentelemetry.context.detach(token): restores the context that was current before the matching attach(). Always pair it with attach() (use try/finally). [1][2]

Typical safe pattern (including async):

from opentelemetry import context, baggage

async def handler():
    ctx = baggage.set_baggage("user.id", "123")
    token = context.attach(ctx)
    try:
        await do_work()          # context preserved across async/await
        await do_more_work()
    finally:
        context.detach(token)    # restore previous context

If you don’t detach() after modifying/attaching context, you can get errors due to unmatched attach/detach (the SDK can detect leaked contexts). [3]

Sources: OpenTelemetry Python context API docs [1], OpenTelemetry Python context attach/detach guidance [2], example + warning about failing to detach [3].

[1] (opentelemetry-python.readthedocs.io)
[2] (deepwiki.com)
[3] (docs.honeycomb.io)



🌐 Web query:

OpenTelemetry Python context.attach context.detach same thread async task

💡 Result:

In OpenTelemetry Python, context.attach(ctx)/context.detach(token) are implemented on top of Python contextvars, so the returned token is only valid in the same “current Context” (i.e., the same logical execution unit) where it was created. If you detach() in a different contextvars context, Python raises ValueError: Token was created in a different Context (or similar). [1][2]

What “same thread” means with asyncio

Even if everything runs on the same OS thread, different asyncio.Tasks can have different contextvars contexts, and the token must be detached in the same task / same context that did the attach. OpenTelemetry describes the scope as “same thread or async task.” [3]

Common ways this breaks

  • You attach() in one coroutine/task, but detach() runs:
    • in a different task (e.g., via create_task, callbacks, background tasks), or
    • during async-generator cleanup/GeneratorExit where finalization happens under a different context. This often shows up exactly as the “created in a different Context” error in streaming/generator scenarios. [4]

Safe pattern

Do detach(token) in a finally in the same coroutine call path that did attach():

from opentelemetry import context

token = context.attach(new_ctx)
try:
    await do_work()
finally:
    context.detach(token)

If you need the same context in a new task, propagate the context value into the task at creation time (rather than moving the token across tasks), and use task-local attach/detach there.

Sources
[1] OpenTelemetry Python Context API (attach/detach) (opentelemetry-python.readthedocs.io)
[2] PEP 567: Token.reset() raises ValueError if used in a different Context (peps.python.org)
[3] OpenTelemetry Python context behavior “same thread or async task” (deepwiki.com)
[4] Example async streaming/generator teardown causing “different Context” token detach error (github.com)



Wrap token detach in try/finally to prevent context leaks on exceptions.

OpenTelemetry's attach()/detach() pattern requires detach to happen even if attribute-setting operations fail. Currently, exceptions during the output/log/status attribute operations (lines 306–314) skip the detach, leaving the context attached and causing resource leaks. Additionally, if add_trace() and end_trace() execute in different async tasks or contexts, the detach will raise ValueError because the token is context-scoped. Wrap the detach in a try/finally block within end_trace():

Suggested fix
try:
    self._set_span_status(child_span, error)
finally:
    token = self._context_tokens.pop(trace_id, None)
    if token is not None:
        from opentelemetry import context as otel_context
        otel_context.detach(token)
    child_span.end(end_time=self._get_current_timestamp())
    self.child_spans.pop(trace_id)

Also applies to: 305–311

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/base/langflow/services/tracing/arize_phoenix.py` around lines 267
- 273, The context detach currently can be skipped or raise if exceptions occur;
modify end_trace() so that the span status/attribute setting (calls to
_set_span_status, any output/log/status operations) are executed inside a try
block and the context token detach and span cleanup happen in a finally: pop the
token from self._context_tokens using trace_id, call otel_context.detach(token)
only if token is not None (handle possible ValueError by swallowing/ignoring
it), then call child_span.end(end_time=self._get_current_timestamp()) and remove
the span from self.child_spans; this ensures attach() made in add_trace() is
always detached even on errors.


Labels

bug Something isn't working community Pull Request from an external contributor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant