When using an OpenAI-compatible API endpoint (llama.cpp, ik_llama.cpp, llama-swap) with streaming responses in a chat, the visible text output in the Cortex chat randomly freezes during generation. However, the underlying model continues generating tokens, and the full response is eventually received once generation completes; it just isn't displayed incrementally during the stall.
This creates a misleading user experience: it appears the model has stopped responding when in fact it is still working.
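For reference, a minimal sketch that can help confirm the stall is on the client/UI side rather than the server side: it streams a chat completion directly from the endpoint and logs when each chunk arrives. The endpoint URL, port, and model name below are assumptions; adjust them to the local llama.cpp / llama-swap setup.

```typescript
// Minimal check (Node 18+): stream a chat completion from an OpenAI-compatible
// server and log the arrival time of each SSE chunk, independent of any chat UI.
const ENDPOINT = "http://localhost:8080/v1/chat/completions"; // assumed llama.cpp default port

async function main(): Promise<void> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // placeholder; llama.cpp typically serves whatever model is loaded
      stream: true,
      messages: [{ role: "user", content: "Write a long story." }],
    }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  const start = Date.now();
  let buffer = "";

  // Timestamp every streamed delta so gaps in network delivery (vs. gaps
  // only in the rendered chat output) are easy to spot.
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE lines can be split across chunks; keep the trailing partial line.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";

    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const payload = line.slice("data: ".length).trim();
      if (payload === "[DONE]") return;
      const delta = JSON.parse(payload).choices?.[0]?.delta?.content ?? "";
      console.log(`+${Date.now() - start}ms`, JSON.stringify(delta));
    }
  }
}

main().catch(console.error);
```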
- VS Code Version: latest
- OS Version: Arch Linux