Conversation

@spitters

Summary

This PR adds timeout support to prevent the pet process from hanging indefinitely when executing slow tactics (e.g., native_compute, hammer).

Problem

When a heavy tactic like native_compute or hammer is executed on a large term, the pet process can hang for hours while consuming excessive CPU and memory. The pytanque client has no way to detect or recover from this situation: it blocks forever on readline().

Solution

Add a select()-based read timeout to _read_lsp_response():

  • Use select.select() to wait for data with a configurable timeout
  • Kill the pet process if it doesn't respond within the timeout
  • Check whether the pet process died before each read operation
  • Propagate the read_timeout parameter through the query() and run() methods

This is a hard timeout: if pet doesn't respond within the specified time, the process is killed to prevent resource exhaustion.
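
A minimal sketch of that logic, assuming the client holds a subprocess.Popen handle to pet and reads newline-delimited responses from its stdout pipe; the function and attribute names here are illustrative, not the actual client internals, and select() on a pipe is Unix-only:

```python
import select
import subprocess

def read_line_with_timeout(proc: subprocess.Popen, read_timeout: float = 120.0) -> bytes:
    """Read one response line from pet, killing it if it stays silent too long."""
    # If pet already died, fail fast instead of blocking on a dead pipe.
    if proc.poll() is not None:
        raise RuntimeError("pet process exited unexpectedly")

    # Wait for data on stdout, but never longer than read_timeout seconds.
    ready, _, _ = select.select([proc.stdout], [], [], read_timeout)
    if not ready:
        # Hard timeout: kill pet so it cannot keep burning CPU and memory.
        proc.kill()
        raise TimeoutError(f"pet did not respond within {read_timeout}s")

    return proc.stdout.readline()
```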

Changes to client.py

  • Add select module import for non-blocking I/O
  • Add read_timeout parameter to _read_lsp_response() (default 120s)
  • Add read_timeout parameter to query() method
  • Add read_timeout parameter to run() method

Usage

```python
# Normal operation with default 120s timeout
state = client.run(state, "ring.")

# Heavy tactic with extended timeout
state = client.run(state, "hammer.", read_timeout=300)

# Quick check with short timeout
state = client.run(state, "auto.", read_timeout=30)
```

Related

This is a companion change to rocq-mcp PR #2 which exposes the read_timeout parameter through the MCP tool interface.

Test plan

  • Verified normal tactic execution still works
  • Verified timeout parameter is properly passed through
  • Test with an intentionally slow tactic to verify the timeout behavior (a sketch of such a test is below)
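
For the last item, something like the following pytest sketch could exercise the timeout; the connection details, file path, theorem name, slow command, and the exception type raised on timeout are all assumptions rather than part of this PR:

```python
import pytest
from pytanque import Pytanque

def test_timeout_kills_slow_tactic():
    # Placeholder connection details, file, and theorem name.
    with Pytanque("127.0.0.1", 8765) as client:
        state = client.start(file="theories/Slow.v", thm="slow_thm")
        # Intentionally expensive step: it should trip the 5s timeout instead
        # of blocking forever; the exact exception type depends on the client.
        with pytest.raises(Exception):
            client.run(state, "native_compute.", read_timeout=5)
```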

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gbdrt
Member

gbdrt commented Jan 19, 2026

Thanks for the PR.

Out of curiosity, did you get this error on macOS?
We had similar issues, where Rocq's Timeout was not interrupting tactics (e.g., vm_compute on big expressions), but not on all systems...

Unfortunately, there is no easy way to restore the states when you kill the server.
@theostos is currently developing a new server on top of Pytanque to gracefully handle crashes, timeout, and restart: https://github.com/LLM4Rocq/rocq-ml-toolbox

Maybe that will solve your issue?
The goal is to keep the pytanque client as lightweight as possible and to merge the new server for more complex logic.

@spitters
Author

It's an old Linux machine.
I also noticed that with some computations (1000² I think), MCP returned more than 100k tokens, which can get very expensive. It's probably good to cap this in a similar way.

Do you recommend moving to ml-toolbox?

@gbdrt
Member

gbdrt commented Jan 22, 2026

Yes, I would give it a try if you want something that can scale.

@spitters
Author

spitters commented Jan 22, 2026

@gbdrt Could ml-toolbox also be wrapped into MCP? Or do you have a different interaction in mind?

BTW a similar crash happened on a far more powerful machine when using MCP with many subagents for parallel processing.

@gbdrt
Member

gbdrt commented Jan 22, 2026

Unfortunately, that is not surprising, since RocqMCP is a wrapper around pytanque.
That was one of the main motivations for the development of the new scalable server.

And yes we would also like to expose this new server using MCP.

@spitters
Author

Do you have an ETA for that?
Since the previous one was AI-generated, maybe it is possible again?

@theostos
Collaborator

Hi @spitters, I'm working on updating pytanque so that it works transparently with both servers (petanque and rocq-ml-server). If I'm able to make this work, then rocq-mcp should work out of the box!
I don't have a precise ETA yet, but my goal is to have the first fully compatible release of rocq-ml-server/pytanque within a few weeks. (After that, I plan to rewrite LLM4Docq based on these new tools to make it compatible with any Coq/Rocq 8.20+ library)
