fix(tracing): nest LangChain spans under Langflow component spans in Phoenix #12008

0xNekr wants to merge 1 commit into langflow-ai:main
Conversation
…Phoenix. LangChain auto-instrumented spans (from `LangChainInstrumentor`) appear as separate top-level traces in Phoenix/Arize because they don't inherit the OpenTelemetry parent context from the Langflow component span. This change attaches each child span to the OTel context during component execution (`add_trace`) and detaches it when the component finishes (`end_trace`), so LangChain-instrumented spans (e.g. OpenAI calls, RunnableSequence) automatically become children of the active Langflow component span.

Before: the Traces tab shows N+1 entries per flow execution (one root Langflow span plus separate LangChain spans). After: the Traces tab shows 1 entry per flow execution, with all spans properly nested under the root trace.
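The attach/detach bookkeeping described above can be illustrated with a small stand-alone sketch. It uses plain `contextvars` as a stand-in for OpenTelemetry's context slot (the real change calls `opentelemetry.context.attach`/`detach` with a span context); the class and names here are illustrative, not the actual Langflow code.

```python
import contextvars

# Stand-in for the OTel "current span" context slot; the real code
# attaches a span context via opentelemetry.context.attach().
active_span = contextvars.ContextVar("active_span", default="<root>")

class ComponentTracer:
    def __init__(self):
        self._context_tokens = {}  # trace_id -> context token, as in the PR

    def add_trace(self, trace_id):
        # Attach: make the component span the active parent.
        self._context_tokens[trace_id] = active_span.set(trace_id)

    def end_trace(self, trace_id):
        # Detach: restore whatever was active before this component ran.
        token = self._context_tokens.pop(trace_id, None)
        if token is not None:
            active_span.reset(token)

tracer = ComponentTracer()
tracer.add_trace("OpenAIModel-abc")
# An auto-instrumented call made here would see "OpenAIModel-abc" as the
# active parent instead of starting a new root trace.
parent_seen = active_span.get()
tracer.end_trace("OpenAIModel-abc")
restored = active_span.get()  # previous context is back
```

The per-`trace_id` token dict mirrors how the PR restores the previous context once a component finishes.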
Walkthrough

OTEL context token management is added to the tracing service. The implementation introduces internal state tracking to store context tokens per trace ID and manages their attachment/detachment during trace lifecycle operations.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (4 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/backend/base/langflow/services/tracing/arize_phoenix.py (1)
Lines 291-313: ⚠️ Potential issue | 🟠 Major

Move detach/end cleanup into a `finally` block to avoid context/span leaks on mid-method exceptions. If serialization or `set_attribute` fails before line 305, cleanup is skipped and the active OTel context can remain attached.

💡 Proposed hardening diff
```diff
@@
-        processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
-        if processed_outputs:
-            child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
-            child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
-
-        logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
-        processed_logs = (
-            self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
-        )
-        if processed_logs:
-            child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
-
-        self._set_span_status(child_span, error)
-
-        # Detach from OTel context before ending the span
-        token = self._context_tokens.pop(trace_id, None)
-        if token is not None:
-            from opentelemetry import context as otel_context
-
-            otel_context.detach(token)
-
-        child_span.end(end_time=self._get_current_timestamp())
-        self.child_spans.pop(trace_id)
+        from opentelemetry import context as otel_context
+        try:
+            processed_outputs = self._convert_to_arize_phoenix_types(outputs) if outputs else {}
+            if processed_outputs:
+                child_span.set_attribute(SpanAttributes.OUTPUT_VALUE, self._safe_json_dumps(processed_outputs))
+                child_span.set_attribute(SpanAttributes.OUTPUT_MIME_TYPE, OpenInferenceMimeTypeValues.JSON.value)
+
+            logs_dicts = [log if isinstance(log, dict) else log.model_dump() for log in logs]
+            processed_logs = (
+                self._convert_to_arize_phoenix_types({log.get("name"): log for log in logs_dicts}) if logs else {}
+            )
+            if processed_logs:
+                child_span.set_attribute("logs", self._safe_json_dumps(processed_logs))
+
+            self._set_span_status(child_span, error)
+        finally:
+            token = self._context_tokens.pop(trace_id, None)
+            if token is not None:
+                otel_context.detach(token)
+            child_span.end(end_time=self._get_current_timestamp())
+            self.child_spans.pop(trace_id, None)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/backend/base/langflow/services/tracing/arize_phoenix.py` around lines 291 - 313, The cleanup that detaches OTel context and ends/removes child spans must run even if attribute-setting or serialization fails: wrap the block that sets attributes (calls to _convert_to_arize_phoenix_types, _safe_json_dumps, child_span.set_attribute and _set_span_status) in a try/finally so the finally always runs the detach (use self._context_tokens.pop(trace_id) and call otel_context.detach(token) if token), child_span.end(end_time=...) and self.child_spans.pop(trace_id). Ensure any imports (from opentelemetry import context as otel_context) remain accessible in the finally and avoid swallowing the original exception (re-raise after cleanup).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/backend/base/langflow/services/tracing/arize_phoenix.py`:
- Around line 267-273: The context detach currently can be skipped or raise if
exceptions occur; modify end_trace() so that the span status/attribute setting
(calls to _set_span_status, any output/log/status operations) are executed
inside a try block and the context token detach and span cleanup happen in a
finally: pop the token from self._context_tokens using trace_id, call
otel_context.detach(token) only if token is not None (handle possible ValueError
by swallowing/ignoring it), then call
child_span.end(end_time=self._get_current_timestamp()) and remove the span from
self.child_spans; this ensures attach() made in add_trace() is always detached
even on errors.
```python
# Attach child span to OTel context so LangChain auto-instrumented
# spans (via LangChainInstrumentor) become children of this span
from opentelemetry import context as otel_context
from opentelemetry.trace import set_span_in_context

self._context_tokens[trace_id] = otel_context.attach(set_span_in_context(child_span))
```
🧩 Analysis chain
🏁 Script executed:
head -n 320 src/backend/base/langflow/services/tracing/arize_phoenix.py | tail -n +100Repository: langflow-ai/langflow
Length of output: 9260
🏁 Script executed:
sed -n '260,320p' src/backend/base/langflow/services/tracing/arize_phoenix.pyRepository: langflow-ai/langflow
Length of output: 2442
🏁 Script executed:
sed -n '100,120p' src/backend/base/langflow/services/tracing/arize_phoenix.pyRepository: langflow-ai/langflow
Length of output: 1254
🌐 Web query:
OpenTelemetry Python contextvars attach detach token execution context async
💡 Result:
In OpenTelemetry Python, context propagation (across sync code and async/await) is implemented via a runtime context (typically contextvars), and the core API is:

- `opentelemetry.context.attach(ctx) -> Token`: makes `ctx` the current context for the caller's current execution unit (thread / async task) and returns a token representing the previous context. [1]
- `opentelemetry.context.detach(token)`: restores the context that was current before the matching `attach()`. Always pair it with `attach()` (use `try/finally`). [1][2]
Typical safe pattern (including async):

```python
from opentelemetry import context, baggage

async def handler():
    ctx = baggage.set_baggage("user.id", "123")
    token = context.attach(ctx)
    try:
        await do_work()        # context preserved across async/await
        await do_more_work()
    finally:
        context.detach(token)  # restore previous context
```

If you don't `detach()` after modifying/attaching context, you can get errors due to unmatched attach/detach (the SDK can detect leaked contexts). [3]
Sources: OpenTelemetry Python context API docs [1], OpenTelemetry Python context attach/detach guidance [2], example + warning about failing to detach [3].
[1] (opentelemetry-python.readthedocs.io)
[2] (deepwiki.com)
[3] (docs.honeycomb.io)
Citations:
- 1: https://opentelemetry-python.readthedocs.io/en/latest/api/context.html?utm_source=openai
- 2: https://deepwiki.com/open-telemetry/opentelemetry-python/8.1-testing?utm_source=openai
- 3: https://docs.honeycomb.io/send-data/python/opentelemetry-sdk/?utm_source=openai
🌐 Web query:
OpenTelemetry Python context.attach context.detach same thread async task
💡 Result:
In OpenTelemetry Python, context.attach(ctx)/context.detach(token) are implemented on top of Python contextvars, so the returned token is only valid in the same “current Context” (i.e., the same logical execution unit) where it was created. If you detach() in a different contextvars context, Python raises ValueError: Token was created in a different Context (or similar). [1][2]
What “same thread” means with asyncio
Even if everything runs on the same OS thread, different asyncio.Tasks can have different contextvars contexts, and the token must be detached in the same task / same context that did the attach. OpenTelemetry describes the scope as “same thread or async task.” [3]
Common ways this breaks
- You
attach()in one coroutine/task, butdetach()runs:- in a different task (e.g., via
create_task, callbacks, background tasks), or - during async-generator cleanup/
GeneratorExitwhere finalization happens under a different context. This often shows up exactly as the “created in a different Context” error in streaming/generator scenarios. [4]
- in a different task (e.g., via
Safe pattern

Do `detach(token)` in a `finally` in the same coroutine call path that did `attach()`:

```python
from opentelemetry import context

token = context.attach(new_ctx)
try:
    await do_work()
finally:
    context.detach(token)
```

If you need the same context in a new task, propagate the context value into the task at creation time (rather than moving the token across tasks), and use task-local attach/detach there.
Sources
[1] OpenTelemetry Python Context API (attach/detach) (opentelemetry-python.readthedocs.io)
[2] PEP 567: Token.reset() raises ValueError if used in a different Context (peps.python.org)
[3] OpenTelemetry Python context behavior “same thread or async task” (deepwiki.com)
[4] Example async streaming/generator teardown causing “different Context” token detach error (github.com)
Citations:
- 1: https://opentelemetry-python.readthedocs.io/en/latest/api/context.html?utm_source=openai
- 2: https://peps.python.org/pep-0567/?utm_source=openai
- 3: https://deepwiki.com/open-telemetry/opentelemetry-python/8.1-testing?utm_source=openai
- 4: [Bug] Context detach error when using OpenTelemetry instrumentation with streaming in Team agno-agi/agno#5208
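The token-scoping pitfall described above can be reproduced with plain `contextvars`, on which the OTel context is built. This is an illustrative sketch, not Langflow code: resetting a token from a task that runs in a copied context raises `ValueError`, while resetting in the originating context succeeds.

```python
import asyncio
import contextvars

var = contextvars.ContextVar("otel_ctx", default=None)

async def main():
    token = var.set("attached")  # analogue of opentelemetry.context.attach()

    async def detach_in_other_task():
        try:
            var.reset(token)  # token was created in the parent's context
            return "reset ok"
        except ValueError:
            return "ValueError: created in a different Context"

    # create_task runs the coroutine in a *copy* of the current context,
    # so the token does not belong to the task's own context.
    cross_task_result = await asyncio.create_task(detach_in_other_task())

    var.reset(token)  # same context: succeeds
    return cross_task_result, var.get()

cross_task_result, restored = asyncio.run(main())
```

This is why the review stresses that `add_trace()` and `end_trace()` must run in the same logical execution unit (or swallow the `ValueError`) for the detach to be safe.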
Wrap token detach in try/finally to prevent context leaks on exceptions.
OpenTelemetry's attach()/detach() pattern requires detach to happen even if attribute-setting operations fail. Currently, exceptions during the output/log/status attribute operations (lines 306–314) skip the detach, leaving the context attached and causing resource leaks. Additionally, if add_trace() and end_trace() execute in different async tasks or contexts, the detach will raise ValueError because the token is context-scoped. Wrap the detach in a try/finally block within end_trace():
Suggested fix

```python
try:
    self._set_span_status(child_span, error)
finally:
    token = self._context_tokens.pop(trace_id, None)
    if token is not None:
        from opentelemetry import context as otel_context

        otel_context.detach(token)
    child_span.end(end_time=self._get_current_timestamp())
    self.child_spans.pop(trace_id)
```

Also applies to: 305–311
Summary

LangChain auto-instrumented spans (created by `LangChainInstrumentor`) appear as separate top-level traces in Phoenix/Arize instead of being nested under the Langflow component span that triggered them. This happens because `add_trace` creates a child span via the propagator context but doesn't attach it to the active OpenTelemetry context. When LangChain runs (e.g. an OpenAI call inside a component), the instrumentor sees no active parent span and creates a new root trace.

Before

The Traces tab in Phoenix shows N+1 entries per flow execution: one root "Langflow" span plus separate disconnected LangChain spans (OpenAI, RunnableSequence, etc.).

After

The Traces tab shows 1 entry per flow execution. All LangChain-instrumented spans are properly nested under the Langflow component span that triggered them.
Changes

In `arize_phoenix.py`:

- `__init__`: Added `_context_tokens` dict to track OTel context tokens per trace
- `add_trace`: After creating the child span, attach it to the OTel context so the LangChain instrumentor picks it up as the parent
- `end_trace`: Detach from the OTel context before ending the span to restore the previous context

Test plan

- `LangChainInstrumentor` active

Summary by CodeRabbit