perplexity_research (sonar-deep-research) fails with fetch error due to missing stream:true #83

@flavio-bongiovanni

Bug Description

The perplexity_research tool (which uses sonar-deep-research model) fails with TypeError: fetch failed for any non-trivial query that triggers actual multi-step research.

Root Cause

performChatCompletion() in server.js sends requests without stream: true. For fast models (sonar, sonar-reasoning-pro), this works fine because they respond within seconds. However, sonar-deep-research performs multi-step web research that can take minutes. Without streaming:

  1. The API receives the request and begins deep research
  2. The HTTP connection sits idle while the model works
  3. Cloudflare drops the connection after about 60 seconds of inactivity (60.09s in the curl run under Evidence)
  4. The client receives SocketError: UND_ERR_SOCKET with bytesRead: 0

Evidence

Without streaming (fails):

$ curl -s -o /dev/null -w "HTTP:%{http_code} TIME:%{time_total}s SIZE:%{size_download}" \
  --max-time 120 \
  -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonar-deep-research","messages":[{"role":"user","content":"Compare SQLite-vec and ChromaDB briefly"}]}'

HTTP:000 TIME:60.092740s SIZE:0  (exit code 56 - connection reset)

With streaming (works):

$ curl -s --max-time 300 \
  -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonar-deep-research","messages":[{"role":"user","content":"Compare SQLite-vec and ChromaDB briefly"}],"stream":true}'

# Returns 1.1MB response with 29 search queries, 350k reasoning tokens
# Keep-alive pings (": ping - ...") prevent Cloudflare timeout

Simple queries (no research needed) succeed without streaming because the model returns in ~2 seconds, before the 60-second Cloudflare timeout:

$ curl ... -d '{"model":"sonar-deep-research","messages":[{"role":"user","content":"What is 2+2?"}]}'
# HTTP:200 TIME:2.3s — no actual research triggered

Node.js native fetch also confirms the issue:

// Complex query → SocketError: UND_ERR_SOCKET, bytesRead: 0
// Simple query  → Success (responds in <3s)
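For anyone trying to confirm they are hitting the same failure, here is a minimal sketch of how the error could be recognized in Node.js. It assumes the rejection from native fetch is the TypeError: fetch failed shown above, with the underlying SocketError attached as err.cause. isDroppedSocket is a hypothetical helper name, not something that exists in server.js.

```javascript
// Hypothetical helper (not part of server.js): returns true when a fetch()
// rejection looks like the dropped-socket failure described in this issue.
function isDroppedSocket(err) {
  // Assumption: Node's native fetch wraps the low-level socket error in
  // err.cause, and that error carries code "UND_ERR_SOCKET".
  const cause = err && err.cause;
  return Boolean(cause && cause.code === "UND_ERR_SOCKET");
}
```

A caller could use this inside a catch block to report a clearer message (for example, suggesting streaming) instead of surfacing the bare fetch failed.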

Suggested Fix

In performChatCompletion(), when the model is sonar-deep-research, add stream: true to the request body and consume the SSE chunks, collecting the final response before returning. The SSE keep-alive pings (: ping) sent by the Perplexity API will prevent Cloudflare from dropping the connection.

Something like:

const body = {
  model,
  messages,
  // Spreading `false` adds nothing, so stream: true is only set for deep research.
  ...(model === "sonar-deep-research" && { stream: true }),
};

Then handle the SSE response by buffering chunks and assembling the final message.
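A rough sketch of the buffering step, for illustration only. It assumes each SSE event is a data: line holding a JSON chunk with OpenAI-style choices[0].delta.content, that the stream ends with data: [DONE], and that the body has already been read into a string. assembleSseMessage is a hypothetical name. I have not checked this against every chunk shape the API can return.

```javascript
// Hypothetical helper: assembles one message from a fully buffered SSE body.
// Assumed chunk shape: { choices: [{ delta: { content: "..." } }] }.
function assembleSseMessage(sseText) {
  let content = "";
  let lastChunk = null;
  for (const rawLine of sseText.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank event separators and SSE comments such as ": ping - ...".
    if (line === "" || line.startsWith(":")) continue;
    // Ignore other SSE fields (event:, id:, retry:).
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    // Assumed end-of-stream marker.
    if (payload === "[DONE]") break;
    let chunk;
    try {
      chunk = JSON.parse(payload);
    } catch {
      continue; // tolerate a malformed chunk rather than failing the whole call
    }
    lastChunk = chunk;
    const choice = chunk.choices && chunk.choices[0];
    const piece = choice && choice.delta && choice.delta.content;
    if (typeof piece === "string") content += piece;
  }
  // lastChunk is returned because the final chunk may carry extra fields.
  return { content, lastChunk };
}
```

In performChatCompletion() this could be fed with the text of the streamed response body, but a real fix would more likely parse chunks incrementally as they arrive rather than buffering the whole 1MB+ stream first.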

Environment

  • MCP Server: @perplexity-ai/mcp-server v0.6.2
  • Node.js: v24.13.0
  • OS: Linux (WSL2)
  • Other tools work fine: perplexity_search, perplexity_ask, perplexity_reason all succeed consistently
