feat: Ads/eng 4109 implement ralphhook for iterative agent refinement #309
+464
−0
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
[HOOKS] ralph hook
Key Changes:
ralphloop (IE here) into the SDK, both are about iterative refinement with observability, but they achieve it differently and this attempt aims to bridge a gap and follows same structure asbackoff_on_error,summarize_when_longGenerationEndevents and intercepting completion attempts (responses without tool calls). When the agent produces a final answer, ralph_hookuses the SDK's scorer composition (avg()to combine multiple scorers) to evaluate output quality, then returnsRetryWithFeedbackif the score is below threshold—injecting feedback as a user message and forcing regeneration to create an iterative refinement loop. The agent's reaction processor prioritizes these reactions, continuing until output meets the quality threshold (Finish) or max iterations is reached (Fail). Each session keeps isolated state via ULID keys, and the hook resets on StepStart to avoid interfering with multi-step reasoning.Added:
Generated Summary:
Summary of Changes
ralph_hookthat implements iterative agent refinement based on scoring thresholds.summarize_when_longfunction to include apreserve_tool_pairsoption, ensuring that tool call/response pairs are kept together during summarization.Key Modifications
New Hook -
ralph_hook:Summarization Enhancement:
preserve_tool_pairsto ensure that tool call/response pairs are not orphaned during summarization.preserve_tool_pairsis set to true.Testing:
ralph_hook, covering various scenarios including convergence, multiple scorers, maximum iteration limits, and session isolation.preserve_tool_pairsfunctionality to validate correct behavior in different message scenarios.Potential Impact
This summary was generated with ❤️ by rigging