Conversation

@chenghao-mou
Member

@chenghao-mou chenghao-mou commented Jan 12, 2026

Summary by CodeRabbit

  • New Features

    • Sessions can optionally record internal events; reports now include serialized internal pipeline events, timestamps and duration.
    • Added LLMOutputEvent and several explicit typed event variants (speech, playback, text input, synthesized audio, flush sentinel).
  • Refactor

    • Unified internal-event surface and added collection hooks across voice/LLM/TTS/STT pipelines.
    • Run/session APIs updated to wire event collection into reporting and run results.

✏️ Tip: You can customize this high-level summary in your review settings.

@chenghao-mou chenghao-mou force-pushed the feat/export-internal-events branch from 67280ea to 08ffd59 on January 22, 2026 13:57
@coderabbitai
Contributor

coderabbitai bot commented Jan 22, 2026

📝 Walkthrough

Adds internal-event types and instrumentation across voice pipeline components, records optional internal events per AgentSession (via include_internal_events), extends RunResult/AgentSession wiring, and expands SessionReport to include, timestamp, and serialize collected internal events.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Event/type additions & discriminants: livekit-agents/livekit/agents/llm/llm.py, livekit-agents/livekit/agents/llm/realtime.py, livekit-agents/livekit/agents/tts/tts.py, livekit-agents/livekit/agents/types.py, livekit-agents/livekit/agents/voice/io.py, livekit-agents/livekit/agents/voice/room_io/types.py | Added new dataclasses/types and explicit type: Literal[...] discriminants (e.g., LLMOutputEvent, FlushSentinel, SynthesizedAudio, playback/input event types). Review discriminators and serialization assumptions. |
| LLM public surface: livekit-agents/livekit/agents/llm/__init__.py | Export surface updated to include LLMOutputEvent (and adjust __all__). |
| Voice pipeline instrumentation: livekit-agents/livekit/agents/voice/agent.py, livekit-agents/livekit/agents/voice/agent_activity.py, livekit-agents/livekit/agents/voice/room_io/room_io.py | Inserted maybe_collect(...) calls to capture STT/LLM/TTS/IO outputs (frames, chunks, timed strings, text input). Verify these calls are non-blocking and payload shapes match serialization. |
| AgentSession & RunResult wiring: livekit-agents/livekit/agents/voice/agent_session.py, livekit-agents/livekit/agents/voice/run_result.py | Added internal-event buffers, _include_internal_events flag, _prev_audio_output listener management; AgentSession.start(...) gains include_internal_events; RunResult now requires agent_session and delegates event recording to call back into the session. Review listener lifecycle and public signature changes. |
| InternalEvent union & report serialization: livekit-agents/livekit/agents/voice/events.py, livekit-agents/livekit/agents/voice/report.py | Introduced InternalEvent discriminated union and extended SessionReport with include_internal_events, internal_events, timestamps, duration, and serialization helpers (AudioFrame → base64, TimedString → dict, LLM payload handling). Inspect serialization paths, filtering (e.g., VAD), and compatibility with previous report consumers. |
| Job reporting update: livekit-agents/livekit/agents/job.py | make_session_report now passes include_internal_events and internal_events from the session into the constructed SessionReport. |

Sequence Diagram(s)

sequenceDiagram
    participant Pipeline as Pipeline (STT / LLM / TTS)
    participant Activity as AgentActivity
    participant Session as AgentSession
    participant Run as RunResult
    participant Report as SessionReport

    Pipeline->>Activity: emit events (chunks, frames, transcripts)
    Activity->>Session: maybe_collect(event)
    alt include_internal_events == true
        Session->>Session: append to _recorded_internal_events
    end
    Pipeline->>Session: playback_started / playback_finished (listener)
    Session->>Session: attach/detach listeners, maybe_collect(playback_event)
    Run->>Session: record RunEvent (via RunResult._record_event)
    Session->>Report: make_session_report(include_internal_events, internal_events, timestamps)
    Report->>Report: serialize events (AudioFrame -> base64, TimedString -> dict, LLM payloads)
    Report-->>Caller: serialized SessionReport
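
The collection path in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: SketchSession is a hypothetical stand-in, and only the names maybe_collect, _include_internal_events, and _recorded_internal_events are taken from the walkthrough.

```python
from typing import Any


class SketchSession:
    """Hypothetical stand-in for AgentSession; only the collection path is modeled."""

    def __init__(self, include_internal_events: bool = False) -> None:
        self._include_internal_events = include_internal_events
        self._recorded_internal_events: list[Any] = []

    def maybe_collect(self, event: Any) -> None:
        # Record the event only when the session opted in.
        if self._include_internal_events:
            self._recorded_internal_events.append(event)


enabled = SketchSession(include_internal_events=True)
disabled = SketchSession()
for session in (enabled, disabled):
    session.maybe_collect({"type": "playback_started"})
```

With the flag off, maybe_collect is a cheap no-op, so pipeline call sites would not need their own guard.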

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • migrate to turn_handling #4502 — Modifies AgentSession public API/signatures and session-related behavior; closely related to the AgentSession/start and session state changes in this PR.

Poem

🐇 I hopped through frames, and chunks, and text,
Collected whispers, wired the rest—
Timed beats tucked in tidy rows,
Sessions now keep what the pipeline knows.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.19% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'Collect internal events for testing' is directly related to the main objective of this changeset, which introduces internal event collection, tracking, and serialization throughout the agents codebase via new instrumentation hooks and public API extensions. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧹 Recent nitpick comments
livekit-agents/livekit/agents/voice/report.py (2)

122-129: Add a Google‑style docstring for _serialize_audio_frame.

This keeps the new helper aligned with the project’s docstring guidance.

📝 Docstring addition
     @staticmethod
     def _serialize_audio_frame(frame: AudioFrame) -> dict:
+        """Serialize an AudioFrame for JSON transport.
+
+        Args:
+            frame: The audio frame to serialize.
+
+        Returns:
+            JSON-serializable dict representation of the frame.
+        """
         return {
             "sample_rate": frame.sample_rate,
             "num_channels": frame.num_channels,
             "samples_per_channel": frame.samples_per_channel,
             "data": base64.b64encode(frame.data).decode("utf-8"),
         }
As per coding guidelines, please use Google‑style docstrings.

51-101: Reorder specialized type checks before the BaseModel catch-all to prevent future issues.

Currently, BaseModel is checked first, which works because SynthesizedAudio, LLMOutputEvent, VADEvent, and GenerationCreatedEvent are all standard Python dataclasses, not BaseModel subclasses. However, if any of these become BaseModel subclasses in the future, the isinstance(e, BaseModel) check would catch them first and bypass custom serialization logic designed to handle non-serializable fields (like AudioFrame and AsyncIterable). Move the specialized handlers before the BaseModel branch for defensive coding.

♻️ Suggested reorder
-                if isinstance(e, BaseModel):
-                    internal_events_dict.append(e.model_dump())
-                elif isinstance(e, SynthesizedAudio):
+                if isinstance(e, SynthesizedAudio):
                     # coming from TTS
                     data = asdict(e)
                     data["frame"] = self._serialize_audio_frame(e.frame)
                     internal_events_dict.append(data)
                 elif isinstance(e, LLMOutputEvent):
                     data = asdict(e)
                     if isinstance(e.data, AudioFrame):
                         data["data"] = self._serialize_audio_frame(e.data)
                     elif isinstance(e.data, str):
                         data["data"] = e.data
                     elif isinstance(e.data, TimedString):
                         data["data"] = e.data.to_dict()
                     elif isinstance(e.data, ChatChunk):
                         data["data"] = e.data.model_dump(mode="json")
                     internal_events_dict.append(data)
                 elif isinstance(e, VADEvent):
                     # skip inference done events, they are too frequent and too noisy
                     if e.type == VADEventType.INFERENCE_DONE:
                         continue
                     # remove audio frames from VAD event since we can reproduce them cheaply
                     data = asdict(e)
                     data["frames"] = []
                     internal_events_dict.append(data)
                     continue
                 elif isinstance(e, GenerationCreatedEvent):
                     # skip message_stream and function_stream as they are not serializable
                     data = {
                         "message_stream": None,
                         "function_stream": None,
                         "user_initiated": e.user_initiated,
                         "response_id": e.response_id,
                         "type": e.type,
                     }
                     internal_events_dict.append(data)
                     continue
+                elif isinstance(e, BaseModel):
+                    internal_events_dict.append(e.model_dump())
                 elif is_dataclass(e):
                     internal_events_dict.append(asdict(e))
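
Why the branch order matters can be shown with a small, self-contained sketch. Base and Special are hypothetical plain classes standing in for a catch-all type (such as BaseModel) and a subclass that needs custom handling; nothing here is taken from report.py.

```python
class Base:
    """Hypothetical generic base type (stands in for a catch-all like BaseModel)."""


class Special(Base):
    """Hypothetical subclass that needs custom serialization."""


def serialize_generic_first(e: object) -> str:
    # The catch-all branch shadows the specialized one for subclasses.
    if isinstance(e, Base):
        return "generic"
    elif isinstance(e, Special):
        return "special"
    return "unknown"


def serialize_specific_first(e: object) -> str:
    # Checking the subclass first lets the custom handler run.
    if isinstance(e, Special):
        return "special"
    elif isinstance(e, Base):
        return "generic"
    return "unknown"


generic_first = serialize_generic_first(Special())
specific_first = serialize_specific_first(Special())
```

In the first function the Special branch is unreachable, which is the situation the reorder guards against.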
📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba90282 and 8df74be.

📒 Files selected for processing (1)
  • livekit-agents/livekit/agents/voice/report.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/voice/report.py
🧠 Learnings (1)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.

Applied to files:

  • livekit-agents/livekit/agents/voice/report.py
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/voice/report.py (5)
livekit-agents/livekit/agents/llm/chat_context.py (2)
  • ChatContext (218-656)
  • to_dict (402-441)
livekit-agents/livekit/agents/llm/realtime.py (1)
  • GenerationCreatedEvent (40-47)
livekit-agents/livekit/agents/tts/tts.py (1)
  • SynthesizedAudio (33-44)
livekit-agents/livekit/agents/types.py (2)
  • TimedString (95-128)
  • to_dict (119-128)
livekit-agents/livekit/agents/vad.py (2)
  • VADEvent (26-68)
  • VADEventType (19-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
🔇 Additional comments (2)
livekit-agents/livekit/agents/voice/report.py (2)

3-18: Imports align with the new internal-event serialization.

No issues spotted in this import update.


24-31: No API break: The only SessionReport instantiation uses keyword arguments.

The single instantiation in job.py:266 explicitly passes all arguments as keywords. Additionally, SessionReport is not exported in the module's __all__ lists, indicating it is an internal implementation detail rather than part of the public API. No other instantiation sites or external usage were found.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@chenghao-mou chenghao-mou marked this pull request as ready for review January 22, 2026 16:35
@chenghao-mou chenghao-mou requested a review from a team January 22, 2026 16:35
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/voice/agent_session.py (1)

539-541: Reset _recorded_internal_events on session start.

_recorded_events is cleared, but internal events aren’t, so a restarted session can leak prior events into the report.

🔧 Proposed fix
             self._recorded_events = []
+            self._recorded_internal_events = []
             self._room_io = None
             self._recorder_io = None
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/__init__.py`:
- Around line 100-103: Export list includes RealtimeSessionRestoredEvent in
__all__ but the class is not imported or defined; either remove
"RealtimeSessionRestoredEvent" from the __all__ list or add/implement and import
the RealtimeSessionRestoredEvent class into this module (ensure the symbol name
matches exactly) so that the exported symbol resolves when accessed from the
package; update the __all__ array accordingly and ensure any new class is
imported alongside the other events (e.g., with RealtimeSessionReconnectedEvent,
LLMError, LLMOutputEvent) so consumers can import it without AttributeError.
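
A quick way to see the failure mode, and to guard against it in a test, is sketched below. The module name sketch_events and the symbol MissingEvent are hypothetical; the sketch builds a throwaway module rather than importing the real package.

```python
import sys
import types

# Hypothetical module whose __all__ names a symbol that was never defined.
mod = types.ModuleType("sketch_events")
mod.LLMOutputEvent = type("LLMOutputEvent", (), {})
mod.__all__ = ["LLMOutputEvent", "MissingEvent"]
sys.modules["sketch_events"] = mod

try:
    exec("from sketch_events import *", {})
    star_import_error = None
except AttributeError as exc:
    star_import_error = exc

# A cheap consistency check that could run in a unit test.
undefined = [name for name in mod.__all__ if not hasattr(mod, name)]
```

The same hasattr loop, pointed at the real package, would flag any name listed in __all__ that does not resolve.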

In `@livekit-agents/livekit/agents/types.py`:
- Around line 119-126: The to_dict method returns sentinel NotGiven values which
break JSON serialization; update the to_dict implementation on the class with
the to_dict method to convert any NotGiven sentinel to None for each field
("text"/self, start_time, end_time, confidence, start_time_offset) before
returning the dict—import the NotGiven sentinel (or compare against it) and map
values equal to NotGiven to None so report.to_dict() yields JSON-serializable
primitives.
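
The serialization problem can be reproduced with a stand-in sentinel. In this sketch _NotGivenSketch and none_if_not_given are hypothetical names, not the library's NotGiven type or any existing helper.

```python
import json


class _NotGivenSketch:
    """Hypothetical sentinel standing in for the library's NotGiven."""


NOT_GIVEN = _NotGivenSketch()


def none_if_not_given(value):
    # Map only the sentinel to None; leave every other value untouched.
    return None if value is NOT_GIVEN else value


raw = {"text": "hello", "start_time": NOT_GIVEN}
try:
    json.dumps(raw)
    raw_error = None
except TypeError as exc:
    raw_error = exc

clean = {key: none_if_not_given(value) for key, value in raw.items()}
encoded = json.dumps(clean)
```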

In `@livekit-agents/livekit/agents/voice/report.py`:
- Around line 71-78: The code scrubs audio by converting a VADEvent to a dict
via asdict(e) and then replaces the frames with an empty dict which changes the
type; change the scrub to set data["frames"] = [] so VADEvent.frames remains a
list (locate the block handling isinstance(e, VADEvent) where asdict(e) is used
and append to internal_events_dict).
🧹 Nitpick comments (4)
livekit-agents/livekit/agents/llm/llm.py (1)

75-83: Discriminant-data type correspondence is not enforced.

The type field indicates the expected data type, but nothing prevents constructing an LLMOutputEvent with mismatched type and data values (e.g., type="llm_chunk_output" with a str as data). This pattern relies on caller discipline.

Consider using a factory method or overloaded constructors to ensure correctness, or document the expected correspondence clearly.

Example: Factory methods for type safety
@dataclass
class LLMOutputEvent:
    type: Literal[
        "llm_chunk_output",
        "llm_str_output",
        "llm_timed_string_output",
        "realtime_audio_output",
    ]
    data: ChatChunk | str | TimedString | rtc.AudioFrame

    @classmethod
    def from_chunk(cls, chunk: ChatChunk) -> "LLMOutputEvent":
        return cls(type="llm_chunk_output", data=chunk)

    @classmethod
    def from_str(cls, s: str) -> "LLMOutputEvent":
        return cls(type="llm_str_output", data=s)

    @classmethod
    def from_timed_string(cls, ts: TimedString) -> "LLMOutputEvent":
        return cls(type="llm_timed_string_output", data=ts)

    @classmethod
    def from_audio_frame(cls, frame: rtc.AudioFrame) -> "LLMOutputEvent":
        return cls(type="realtime_audio_output", data=frame)
livekit-agents/livekit/agents/voice/agent.py (1)

392-395: Collect-before-yield to avoid missing events on early cancellation.

Currently, maybe_collect(...) runs after yield, so if the consumer cancels/short-circuits, the last item can be dropped from internal events. Consider collecting before yield to make capture resilient to early exits.

♻️ Suggested pattern (apply to each generator)
-                    yield event
-                    activity.session.maybe_collect(event)
+                    activity.session.maybe_collect(event)
+                    yield event

Also applies to: 419-423, 452-455, 463-473, 485-489
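
The difference between the two orderings can be demonstrated with plain synchronous generators. This is a simplified sketch: the real code paths are presumably async generators calling maybe_collect, and the lists below are hypothetical stand-ins for the session's buffer.

```python
collected_after = []
collected_before = []


def collect_after_yield(items):
    for item in items:
        yield item
        collected_after.append(item)  # skipped if the consumer stops here


def collect_before_yield(items):
    for item in items:
        collected_before.append(item)  # runs before control leaves the generator
        yield item


for gen_fn in (collect_after_yield, collect_before_yield):
    gen = gen_fn(["a", "b", "c"])
    for item in gen:
        if item == "b":
            break  # consumer exits early
    gen.close()
```

Both consumers see "a" and "b", but only the collect-before-yield variant records "b".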

livekit-agents/livekit/agents/voice/report.py (1)

108-115: Add a Google-style docstring for _serialize_audio_frame.

✍️ Docstring example
     @staticmethod
     def _serialize_audio_frame(frame: AudioFrame) -> dict:
+        """Serialize an AudioFrame to a JSON-friendly dict.
+
+        Args:
+            frame: The audio frame to serialize.
+
+        Returns:
+            A JSON-serializable dict with audio metadata and base64 data.
+        """
         return {
             "sample_rate": frame.sample_rate,
             "num_channels": frame.num_channels,
             "samples_per_channel": frame.samples_per_channel,
             "data": base64.b64encode(frame.data).decode("utf-8"),
         }
As per coding guidelines, add Google-style docstrings for new helpers.
livekit-agents/livekit/agents/voice/agent_session.py (1)

370-373: Avoid storing internal events when the flag is disabled.

Right now every emitted AgentEvent is appended to _recorded_internal_events, even when include_internal_events is off, which can double memory usage on long sessions.

🔧 Suggested guard
     def emit(self, event: EventTypes, arg: AgentEvent) -> None:  # type: ignore
         self._recorded_events.append(arg)
-        self._recorded_internal_events.append(arg)
+        if self._include_internal_events:
+            self._recorded_internal_events.append(arg)
         super().emit(event, arg)
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffee15c and 1be38bf.

📒 Files selected for processing (15)
  • livekit-agents/livekit/agents/job.py
  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/llm/llm.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/tts/tts.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/agent.py
  • livekit-agents/livekit/agents/voice/agent_activity.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/events.py
  • livekit-agents/livekit/agents/voice/io.py
  • livekit-agents/livekit/agents/voice/report.py
  • livekit-agents/livekit/agents/voice/room_io/room_io.py
  • livekit-agents/livekit/agents/voice/room_io/types.py
  • livekit-agents/livekit/agents/voice/run_result.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/tts/tts.py
  • livekit-agents/livekit/agents/voice/run_result.py
  • livekit-agents/livekit/agents/voice/io.py
  • livekit-agents/livekit/agents/job.py
  • livekit-agents/livekit/agents/llm/llm.py
  • livekit-agents/livekit/agents/voice/agent_activity.py
  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/voice/room_io/room_io.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/voice/agent.py
  • livekit-agents/livekit/agents/voice/report.py
  • livekit-agents/livekit/agents/voice/room_io/types.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/events.py
🧠 Learnings (2)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.

Applied to files:

  • livekit-agents/livekit/agents/tts/tts.py
  • livekit-agents/livekit/agents/voice/run_result.py
  • livekit-agents/livekit/agents/voice/io.py
  • livekit-agents/livekit/agents/job.py
  • livekit-agents/livekit/agents/llm/llm.py
  • livekit-agents/livekit/agents/voice/agent_activity.py
  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/voice/room_io/room_io.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/voice/agent.py
  • livekit-agents/livekit/agents/voice/report.py
  • livekit-agents/livekit/agents/voice/room_io/types.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/events.py
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns

Applied to files:

  • livekit-agents/livekit/agents/voice/agent.py
🧬 Code graph analysis (9)
livekit-agents/livekit/agents/voice/run_result.py (2)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • maybe_collect (375-378)
livekit-agents/livekit/agents/llm/chat_context.py (1)
  • insert (269-275)
livekit-agents/livekit/agents/llm/llm.py (1)
livekit-agents/livekit/agents/types.py (1)
  • TimedString (95-126)
livekit-agents/livekit/agents/voice/agent_activity.py (2)
livekit-agents/livekit/agents/voice/agent_session.py (2)
  • llm (1284-1285)
  • maybe_collect (375-378)
livekit-agents/livekit/agents/llm/realtime.py (1)
  • InputSpeechStartedEvent (20-21)
livekit-agents/livekit/agents/voice/room_io/room_io.py (3)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
  • TextInputEvent (32-36)
livekit-agents/livekit/agents/llm/tool_context.py (1)
  • info (142-143)
livekit-agents/livekit/agents/voice/agent_session.py (1)
  • maybe_collect (375-378)
livekit-agents/livekit/agents/voice/agent.py (2)
livekit-agents/livekit/agents/voice/agent_activity.py (3)
  • llm (2794-2798)
  • session (237-238)
  • agent (241-242)
livekit-agents/livekit/agents/types.py (1)
  • TimedString (95-126)
livekit-agents/livekit/agents/voice/report.py (7)
livekit-agents/livekit/agents/voice/agent.py (3)
  • llm (538-548)
  • tts (551-561)
  • vad (577-587)
livekit-agents/livekit/agents/llm/llm.py (2)
  • ChatChunk (69-72)
  • LLMOutputEvent (76-83)
livekit-agents/livekit/agents/tts/tts.py (2)
  • SynthesizedAudio (33-44)
  • sample_rate (118-119)
livekit-agents/livekit/agents/types.py (2)
  • TimedString (95-126)
  • to_dict (119-126)
livekit-agents/livekit/agents/vad.py (2)
  • VADEvent (26-68)
  • VADEventType (19-22)
livekit-agents/livekit/agents/voice/room_io/room_io.py (1)
  • room (197-198)
livekit-agents/livekit/agents/voice/io.py (1)
  • sample_rate (243-245)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
  • AudioOutput (142-286)
  • audio (431-432)
  • audio (435-449)
  • audio (555-556)
  • audio (559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
  • event (584-585)
  • event (864-865)
  • event (1001-1002)
  • event (1011-1012)
  • event (1021-1022)
  • RunResult (71-229)
  • done (141-143)
livekit-agents/livekit/agents/types.py (3)
livekit-agents/livekit/agents/voice/report.py (1)
  • to_dict (40-106)
livekit-agents/livekit/agents/llm/chat_context.py (1)
  • to_dict (402-441)
livekit-agents/livekit/agents/stt/stt.py (2)
  • start_time_offset (295-296)
  • start_time_offset (299-302)
livekit-agents/livekit/agents/voice/events.py (5)
livekit-agents/livekit/agents/llm/realtime.py (4)
  • GenerationCreatedEvent (40-47)
  • InputSpeechStartedEvent (20-21)
  • InputSpeechStoppedEvent (25-27)
  • InputTranscriptionCompleted (127-133)
livekit-agents/livekit/agents/llm/llm.py (1)
  • LLMOutputEvent (76-83)
livekit-agents/livekit/agents/types.py (1)
  • FlushSentinel (38-39)
livekit-agents/livekit/agents/voice/io.py (2)
  • PlaybackFinishedEvent (119-127)
  • PlaybackStartedEvent (131-134)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
  • TextInputEvent (32-36)
🔇 Additional comments (21)
livekit-agents/livekit/agents/voice/room_io/types.py (1)

31-36: LGTM!

The discriminant type field with a default value follows the pattern established for the InternalEvent discriminated union across the PR. Placing it as the last field avoids breaking positional argument compatibility.
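
The compatibility argument can be illustrated with a small sketch. TextInputSketch and its fields are hypothetical and do not mirror the real TextInputEvent definition.

```python
from dataclasses import asdict, dataclass
from typing import Literal


@dataclass
class TextInputSketch:
    """Hypothetical event; the field names are illustrative only."""

    text: str
    participant: str
    # Discriminant goes last, with a default, so existing positional calls still work.
    type: Literal["text_input"] = "text_input"


event = TextInputSketch("hello", "user-1")  # pre-existing call shape, unchanged
payload = asdict(event)
```

Because the discriminant ends up in the asdict output, a consumer of the serialized report can dispatch on payload["type"].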

livekit-agents/livekit/agents/tts/tts.py (1)

32-44: LGTM!

The discriminant field follows the established pattern and is correctly positioned as the last field with a default value, preserving backward compatibility for existing callers.

livekit-agents/livekit/agents/llm/__init__.py (1)

23-24: LGTM!

LLMOutputEvent is properly imported and will be correctly exported through __all__.

livekit-agents/livekit/agents/types.py (1)

37-39: LGTM!

Simple sentinel dataclass with discriminant field, consistent with the PR's pattern.

livekit-agents/livekit/agents/llm/realtime.py (5)

19-21: LGTM!

Discriminant field correctly added to InputSpeechStartedEvent.


24-28: LGTM!

Discriminant field correctly positioned after the required field.


30-36: LGTM!

Discriminant field correctly positioned as the last field in MessageGeneration.


39-47: LGTM!

Discriminant field correctly positioned after the optional response_id field in GenerationCreatedEvent.


126-133: LGTM!

Discriminant field correctly positioned as the last field in InputTranscriptionCompleted.

livekit-agents/livekit/agents/voice/room_io/room_io.py (1)

410-415: LGTM for text input event capture.

The event reuse plus collection hook reads cleanly and matches the internal-event flow.

livekit-agents/livekit/agents/voice/agent_activity.py (1)

1108-1110: Internal-event collection hooks look solid.

Consistent placement across handlers; no functional side effects detected.

Also applies to: 1122-1124, 1132-1134, 1145-1147, 1219-1223, 1233-1236, 1250-1252, 1269-1271, 1299-1301

livekit-agents/livekit/agents/voice/io.py (1)

118-128: LGTM for discriminant type fields.

This supports internal event typing without impacting runtime behavior.

Also applies to: 130-135

livekit-agents/livekit/agents/job.py (1)

266-278: LGTM for SessionReport internal-events wiring.

Looks consistent with the new collection path.

livekit-agents/livekit/agents/voice/run_result.py (1)

71-101: All RunResult instantiations already pass the required agent_session parameter.

Verified instantiations in agent_session.py (lines 446 and 652) both correctly provide agent_session=self. No runtime errors will occur from missing this parameter.

livekit-agents/livekit/agents/voice/events.py (2)

15-34: No review notes for the added event imports.


257-275: InternalEvent union looks consistent with the new event surface.

livekit-agents/livekit/agents/voice/agent_session.py (4)

52-52: No review notes for the new InternalEvent import.


362-365: Internal-event flag plumbing looks consistent.

Also applies to: 375-379, 463-506, 521-521


446-446: RunResult now wired with AgentSession for event collection.

Also applies to: 652-653


328-329: Playback listener attach/detach wiring is consistent.

Also applies to: 642-646, 804-807, 1322-1342

livekit-agents/livekit/agents/voice/report.py (1)

20-30: No action needed—all required SessionReport fields are already being passed at the single instantiation site.

The only call site in livekit/agents/job.py already provides include_internal_events and internal_events, so no defaults are necessary and no compatibility issues exist.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/voice/report.py (1)

34-36: duration and started_at fields are not included in to_dict() output.

The SessionReport dataclass defines duration and started_at fields (lines 34-36), but these are not serialized in the to_dict() method's return value. If these fields are intended for consumers, they should be added to the output dictionary.

🔧 Proposed fix (if intended for serialization)
         return {
             "job_id": self.job_id,
             "room_id": self.room_id,
             "room": self.room,
             "events": events_dict,
             "internal_events": internal_events_dict,
+            "duration": self.duration,
+            "started_at": self.started_at,
             "audio_recording_path": (

Also applies to: 82-106

🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/realtime.py`:
- Around line 40-48: GenerationCreatedEvent is defined as a plain dataclass but
uses pydantic Field(..., exclude=True) for message_stream and function_stream
which doesn't prevent dataclasses.asdict() from attempting to deepcopy
AsyncIterable fields and will raise a TypeError; fix by either converting
GenerationCreatedEvent into a pydantic BaseModel (remove dataclass usage and use
BaseModel with Field(..., exclude=True) for message_stream and function_stream)
or keep it as a dataclass and update the serializer in report.py to special-case
GenerationCreatedEvent (do not call dataclasses.asdict() on that instance;
instead build a dict that omits message_stream and function_stream or replace
them with serializable placeholders) so as to avoid deepcopying unpicklable
AsyncIterable fields.
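
The failure can be reproduced in isolation. In this sketch GenerationSketch is a hypothetical dataclass with a single stream field, not the real GenerationCreatedEvent; the hand-built dict mirrors the second option described above.

```python
from collections.abc import AsyncIterable
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerationSketch:
    """Hypothetical stand-in for an event that carries a live stream."""

    message_stream: AsyncIterable[str]
    user_initiated: bool
    response_id: Optional[str] = None


async def _stream():
    yield "chunk"


event = GenerationSketch(message_stream=_stream(), user_initiated=True)

try:
    asdict(event)  # asdict deep-copies field values, which fails for the generator
    asdict_error = None
except TypeError as exc:
    asdict_error = exc

# Build the dict by hand and replace the stream with a placeholder.
safe = {
    "message_stream": None,
    "user_initiated": event.user_initiated,
    "response_id": event.response_id,
}
```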

In `@livekit-agents/livekit/agents/types.py`:
- Around line 119-126: The to_dict method currently uses "or None" which turns
valid falsy numeric values (0, 0.0) into None; update to_dict to explicitly
check the sentinel (e.g., NotGiven) for each field (start_time, end_time,
confidence, start_time_offset) and only map to None when the attribute equals
the NotGiven sentinel, otherwise return the actual attribute value; reference
the to_dict method and the attribute names (start_time, end_time, confidence,
start_time_offset) when making the change.
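
The falsy-value pitfall is easy to reproduce. As before, _NotGivenSketch and to_optional are hypothetical names used only for this sketch, assuming the sentinel is a singleton that can be compared by identity.

```python
class _NotGivenSketch:
    """Hypothetical sentinel standing in for the library's NotGiven."""


NOT_GIVEN = _NotGivenSketch()


def to_optional(value):
    # Explicit sentinel check: only the sentinel becomes None.
    return None if value is NOT_GIVEN else value


start_time = 0.0  # a valid timestamp at the very start of the audio

# Truthiness-based fallback: 0.0 is falsy, so the real value is lost.
lossy = start_time or None

exact = to_optional(start_time)
missing = to_optional(NOT_GIVEN)
```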

In `@livekit-agents/livekit/agents/voice/agent_session.py`:
- Around line 370-378: The emit method appends to _recorded_internal_events
regardless of _include_internal_events, causing memory growth; update emit(self,
event: EventTypes, arg: AgentEvent) to only append arg to
_recorded_internal_events when self._include_internal_events is True (leave
_recorded_events append and super().emit intact), so internal events are
collected consistently with maybe_collect and the _include_internal_events flag.
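
A minimal sketch of the gating described for `emit`, assuming a simplified session class. `RecordingSession` is a hypothetical stand-in for `AgentSession`; the real method also calls `super().emit`, which is not modeled here.

```python
from typing import Any, List


class RecordingSession:
    """Hypothetical stand-in for AgentSession; only the recording logic is modeled."""

    def __init__(self, include_internal_events: bool = False) -> None:
        self._include_internal_events = include_internal_events
        self._recorded_events: List[Any] = []
        self._recorded_internal_events: List[Any] = []

    def emit(self, event: str, arg: Any) -> None:
        # Public events are always recorded.
        self._recorded_events.append(arg)
        # Internal events are recorded only on opt-in, so the list stays empty otherwise.
        if self._include_internal_events:
            self._recorded_internal_events.append(arg)

    def maybe_collect(self, arg: Any) -> None:
        # Same gate for events collected directly from pipeline components.
        if self._include_internal_events:
            self._recorded_internal_events.append(arg)
```

Keeping both entry points behind the same flag means a session started without `include_internal_events` never accumulates internal events, whichever path produced them.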
♻️ Duplicate comments (1)
livekit-agents/livekit/agents/voice/report.py (1)

50-80: GenerationCreatedEvent will cause serialization failure.

The fallback asdict(e) on line 80 will be invoked for GenerationCreatedEvent, but asdict() deep-copies every field, and the event's AsyncIterable fields (message_stream, function_stream) cannot be deep-copied. This raises a TypeError when the report is generated.

🔧 Proposed fix
+from ..llm import GenerationCreatedEvent
...
             for e in self.internal_events:
                 if isinstance(e, BaseModel):
                     internal_events_dict.append(e.model_dump())
+                elif isinstance(e, GenerationCreatedEvent):
+                    internal_events_dict.append({
+                        "type": e.type,
+                        "user_initiated": e.user_initiated,
+                        "response_id": e.response_id,
+                    })
                 elif isinstance(e, SynthesizedAudio):
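
The failure mode can be reproduced outside the library with a small sketch. `GenerationCreated` below is an illustrative stand-in for `GenerationCreatedEvent`; its `type` value and the `_stream` helper are assumptions made for the example.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, AsyncIterable, Dict


async def _stream():
    # Placeholder async generator; it is never iterated in this sketch.
    yield "chunk"


@dataclass
class GenerationCreated:
    """Illustrative stand-in for GenerationCreatedEvent."""

    message_stream: AsyncIterable[Any]
    function_stream: AsyncIterable[Any]
    user_initiated: bool
    response_id: str = ""
    type: str = "generation_created"  # hypothetical discriminator value


def serialize(event: GenerationCreated) -> Dict[str, Any]:
    # Build the dict by hand so the streams are never copied.
    return {
        "type": event.type,
        "user_initiated": event.user_initiated,
        "response_id": event.response_id,
    }


event = GenerationCreated(_stream(), _stream(), user_initiated=True, response_id="r1")

try:
    # asdict() deep-copies non-dataclass field values, including the async generators.
    dataclasses.asdict(event)
    asdict_failed = False
except TypeError:
    asdict_failed = True

safe = serialize(event)
```

Building the dictionary field by field sidesteps the deep copy entirely, which is the approach taken in the proposed fix.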
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1be38bf and 3c79092.

📒 Files selected for processing (6)
  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/events.py
  • livekit-agents/livekit/agents/voice/report.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/report.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/events.py
🧠 Learnings (1)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.

Applied to files:

  • livekit-agents/livekit/agents/llm/__init__.py
  • livekit-agents/livekit/agents/types.py
  • livekit-agents/livekit/agents/voice/report.py
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/events.py
🧬 Code graph analysis (4)
livekit-agents/livekit/agents/voice/report.py (5)
livekit-agents/livekit/agents/voice/agent_session.py (5)
  • llm (1285-1286)
  • tts (1289-1290)
  • vad (1293-1294)
  • AgentSessionOptions (78-93)
  • options (408-409)
livekit-agents/livekit/agents/llm/llm.py (2)
  • ChatChunk (69-72)
  • LLMOutputEvent (76-83)
livekit-agents/livekit/agents/llm/chat_context.py (2)
  • ChatContext (218-656)
  • to_dict (402-441)
livekit-agents/livekit/agents/tts/tts.py (1)
  • SynthesizedAudio (33-44)
livekit-agents/livekit/agents/types.py (2)
  • TimedString (95-126)
  • to_dict (119-126)
livekit-agents/livekit/agents/llm/realtime.py (1)
livekit-agents/livekit/agents/llm/chat_context.py (1)
  • FunctionCall (179-192)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
  • AudioOutput (142-286)
  • audio (431-432)
  • audio (435-449)
  • audio (555-556)
  • audio (559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
  • event (584-585)
  • event (864-865)
  • event (1001-1002)
  • event (1011-1012)
  • event (1021-1022)
  • RunResult (71-229)
  • done (141-143)
livekit-agents/livekit/agents/voice/events.py (7)
livekit-agents/livekit/agents/llm/llm.py (2)
  • LLMOutputEvent (76-83)
  • LLM (98-162)
livekit-agents/livekit/agents/stt/stt.py (1)
  • SpeechEvent (70-74)
livekit-agents/livekit/agents/tts/tts.py (1)
  • SynthesizedAudio (33-44)
livekit-agents/livekit/agents/types.py (1)
  • FlushSentinel (38-39)
livekit-agents/livekit/agents/vad.py (1)
  • VADEvent (26-68)
livekit-agents/livekit/agents/voice/io.py (2)
  • PlaybackFinishedEvent (119-127)
  • PlaybackStartedEvent (131-134)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
  • TextInputEvent (32-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: livekit-plugins-cartesia
  • GitHub Check: livekit-plugins-openai
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
🔇 Additional comments (7)
livekit-agents/livekit/agents/llm/__init__.py (1)

23-23: LGTM!

The LLMOutputEvent is properly imported from the .llm module and correctly added to the __all__ exports list, making it publicly accessible from the llm package.

Also applies to: 102-102

livekit-agents/livekit/agents/types.py (1)

37-39: LGTM!

The FlushSentinel dataclass with the type discriminator field enables proper participation in the InternalEvent discriminated union.

livekit-agents/livekit/agents/llm/realtime.py (1)

19-36: LGTM!

The type discriminator fields added to InputSpeechStartedEvent, InputSpeechStoppedEvent, MessageGeneration, and InputTranscriptionCompleted correctly enable these events to participate in the discriminated InternalEvent union.

Also applies to: 126-134

livekit-agents/livekit/agents/voice/agent_session.py (1)

643-646: LGTM!

The audio output event listener lifecycle management is well-implemented:

  • Listeners are attached when audio output is set during start
  • Properly detached during close and when audio output changes
  • Re-attached when a new audio output is configured

Also applies to: 805-808, 1324-1342
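
The attach/detach lifecycle listed above can be sketched with a toy emitter. `Emitter`, `Session`, and the `"playback_finished"` event name are hypothetical stand-ins; the real `AudioOutput` and `AgentSession` APIs are not reproduced here.

```python
from typing import Callable, Dict, List, Optional

Handler = Callable[[object], None]


class Emitter:
    """Tiny event emitter; a stand-in for the real audio output object."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, name: str, handler: Handler) -> None:
        self._handlers.setdefault(name, []).append(handler)

    def off(self, name: str, handler: Handler) -> None:
        self._handlers.get(name, []).remove(handler)

    def emit(self, name: str, payload: object) -> None:
        # Iterate over a copy so handlers may detach themselves safely.
        for handler in list(self._handlers.get(name, [])):
            handler(payload)


class Session:
    """Hypothetical session that listens to whichever audio output is current."""

    def __init__(self) -> None:
        self._audio: Optional[Emitter] = None
        self.seen: List[object] = []

    def _on_playback_finished(self, ev: object) -> None:
        self.seen.append(ev)

    def set_audio(self, new: Optional[Emitter]) -> None:
        # Detach from the old output before switching, then attach to the new one.
        if self._audio is not None:
            self._audio.off("playback_finished", self._on_playback_finished)
        self._audio = new
        if new is not None:
            new.on("playback_finished", self._on_playback_finished)
```

Detaching before reassigning prevents a replaced output from delivering events to the session after it is no longer in use; passing `None` models the close path.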

livekit-agents/livekit/agents/voice/report.py (1)

108-115: LGTM!

The _serialize_audio_frame helper correctly serializes AudioFrame to a dictionary with base64-encoded audio data.
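
A rough sketch of what such a helper might look like, assuming a frame exposes raw bytes plus basic format fields. `Frame` and `serialize_frame` are hypothetical; the actual `AudioFrame` class and `_serialize_audio_frame` implementation may differ in field names and output shape.

```python
import base64
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Frame:
    """Hypothetical stand-in for AudioFrame."""

    data: bytes
    sample_rate: int
    num_channels: int
    samples_per_channel: int


def serialize_frame(frame: Frame) -> Dict[str, Any]:
    # Raw bytes are not JSON-serializable, so they are base64-encoded into text.
    return {
        "sample_rate": frame.sample_rate,
        "num_channels": frame.num_channels,
        "samples_per_channel": frame.samples_per_channel,
        "data": base64.b64encode(frame.data).decode("utf-8"),
    }


# Four 16-bit mono samples (8 bytes) survive a JSON round trip.
frame = Frame(data=bytes(range(8)), sample_rate=16000, num_channels=1, samples_per_channel=4)
payload = json.loads(json.dumps(serialize_frame(frame)))
restored = base64.b64decode(payload["data"])
```

Base64 inflates the payload by roughly a third, which is worth keeping in mind when many frames are recorded in a single report.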

livekit-agents/livekit/agents/voice/events.py (2)

257-275: LGTM!

The InternalEvent discriminated union is well-structured. All member types have the required type discriminator field. Pydantic handles the nested AgentEvent union by flattening it during discrimination.
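
For readers unfamiliar with discriminated unions, the idea can be sketched with standard-library dataclasses. This is a hand-rolled analogue, not Pydantic's mechanism, and the class names, fields, and `type` values below are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Type, Union


@dataclass
class FlushMarker:
    """Illustrative variant with no payload."""

    type: Literal["flush_sentinel"] = "flush_sentinel"


@dataclass
class PlaybackStarted:
    """Illustrative variant carrying a timestamp."""

    created_at: float = 0.0
    type: Literal["playback_started"] = "playback_started"


Event = Union[FlushMarker, PlaybackStarted]

# Map each discriminator string to the class that owns it.
_REGISTRY: Dict[str, Type[Any]] = {cls().type: cls for cls in (FlushMarker, PlaybackStarted)}


def parse_event(payload: Dict[str, Any]) -> Event:
    # The "type" field alone decides which variant is constructed.
    cls = _REGISTRY[payload["type"]]  # raises KeyError for unknown discriminators
    return cls(**payload)
```

Because every member carries a unique `type` string, a serialized event can be routed back to its class without inspecting any other field.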


216-221: LGTM!

Adding Field(..., exclude=True) to ErrorEvent.source is appropriate since the field contains non-serializable model instances (LLM, STT, TTS, RealtimeModel).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/llm/realtime.py (1)

126-134: Align InputTranscriptionCompleted.type with EventTypes.
EventTypes includes "input_audio_transcription_completed", but the new discriminator is "input_transcription_completed". This mismatch can break consumers relying on the discriminator string.

🛠️ Proposed fix
-    type: Literal["input_transcription_completed"] = "input_transcription_completed"
+    type: Literal["input_audio_transcription_completed"] = "input_audio_transcription_completed"
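
A mismatch like this could be caught with a small membership check against the literal type. The `EventTypes` alias below is an abbreviated, hypothetical subset; only `"input_audio_transcription_completed"` is taken from the comment above, and `"input_speech_started"` is a placeholder.

```python
from typing import Literal, get_args

# Abbreviated stand-in for the real EventTypes literal.
EventTypes = Literal["input_speech_started", "input_audio_transcription_completed"]


def is_known_event_type(name: str) -> bool:
    # get_args() returns the tuple of strings allowed by the Literal.
    return name in get_args(EventTypes)


old_discriminator_ok = is_known_event_type("input_transcription_completed")
new_discriminator_ok = is_known_event_type("input_audio_transcription_completed")
```

A check of this kind in a unit test would flag any event whose discriminator drifts away from the names consumers subscribe to.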
♻️ Duplicate comments (1)
livekit-agents/livekit/agents/voice/report.py (1)

40-89: Avoid asdict() on GenerationCreatedEvent AsyncIterables.
asdict() deep-copies fields before you clear message_stream/function_stream, so it can still raise on AsyncIterable values. Build the dict directly (or replace streams before conversion) to avoid runtime failures when internal events are enabled.

🔧 Proposed fix
-                elif isinstance(e, GenerationCreatedEvent):
-                    data = asdict(e)
-                    data["message_stream"] = []
-                    data["function_stream"] = []
-                    internal_events_dict.append(data)
-                    continue
+                elif isinstance(e, GenerationCreatedEvent):
+                    internal_events_dict.append(
+                        {
+                            "type": e.type,
+                            "user_initiated": e.user_initiated,
+                            "response_id": e.response_id,
+                            "message_stream": [],
+                            "function_stream": [],
+                        }
+                    )
+                    continue
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f6d7e59 and ba90282.

📒 Files selected for processing (3)
  • livekit-agents/livekit/agents/llm/realtime.py
  • livekit-agents/livekit/agents/voice/agent_session.py
  • livekit-agents/livekit/agents/voice/report.py
🧰 Additional context used
🧬 Code graph analysis (2)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
  • AudioOutput (142-286)
  • audio (431-432)
  • audio (435-449)
  • audio (555-556)
  • audio (559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
  • event (584-585)
  • event (864-865)
  • event (1001-1002)
  • event (1011-1012)
  • event (1021-1022)
  • RunResult (71-229)
  • done (141-143)
livekit-agents/livekit/agents/voice/report.py (6)
livekit-agents/livekit/agents/voice/agent_session.py (4)
  • llm (1286-1287)
  • tts (1290-1291)
  • vad (1294-1295)
  • AgentSessionOptions (78-93)
livekit-agents/livekit/agents/llm/llm.py (2)
  • ChatChunk (69-72)
  • LLMOutputEvent (76-83)
livekit-agents/livekit/agents/llm/chat_context.py (2)
  • ChatContext (218-656)
  • to_dict (402-441)
livekit-agents/livekit/agents/llm/realtime.py (1)
  • GenerationCreatedEvent (40-47)
livekit-agents/livekit/agents/tts/tts.py (1)
  • SynthesizedAudio (33-44)
livekit-agents/livekit/agents/types.py (2)
  • TimedString (95-128)
  • to_dict (119-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: livekit-plugins-cartesia
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
🔇 Additional comments (12)
livekit-agents/livekit/agents/llm/realtime.py (4)

19-22: LGTM: discriminator added for InputSpeechStartedEvent.


24-28: LGTM: discriminator added for InputSpeechStoppedEvent.


30-36: LGTM: discriminator added for MessageGeneration.


39-47: LGTM: discriminator added for GenerationCreatedEvent.

livekit-agents/livekit/agents/voice/report.py (3)

3-18: LGTM: imports updated for internal event serialization.


20-38: LGTM: SessionReport fields extended for internal events and timestamps.


90-123: LGTM: internal_events included in output and audio frames serialized.

livekit-agents/livekit/agents/voice/agent_session.py (5)

44-56: LGTM: internal event tracking state wired into session.

Also applies to: 328-365


370-379: LGTM: emit() and maybe_collect() respect include_internal_events.


443-449: LGTM: RunResult now carries agent_session for event collection.

Also applies to: 654-655


452-542: LGTM: include_internal_events surfaced in start() and stored.


644-648: LGTM: playback listeners are attached/detached cleanly.

Also applies to: 806-809, 1325-1344
