
Conversation

@akihikokuroda (Contributor) commented Jan 26, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation once the rest of the PR is populated)

@mergify (bot) commented Jan 26, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
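For example, a hypothetical PR title like fix: validate tool call arguments satisfies this pattern, while a title without one of these type prefixes would not.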

@github-actions (Contributor)

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@psschwei (Member)

@akihikokuroda looks like there's a test failure (and needs a rebase on main)

@psschwei (Member)

otherwise, I think this LGTM

@psschwei (Member)

Saw that #380 was opened yesterday, so we should confirm with @HendrikStrobelt that the approach here is compatible with that PR.

@akihikokuroda (Contributor, Author)

@psschwei Minor merges are necessary because the same files are changed, but they look compatible.

@jakelorocco (Contributor) left a comment

Looks good! A few questions / concerns.

Comment on lines 595 to 609
@pytest.mark.integration
class TestToolValidationIntegration:
    """Integration tests that would use actual validation function."""

    @pytest.mark.skip(reason="Validation function not yet implemented")
    def test_validation_with_coercion(self):
        """Test validation with type coercion enabled."""
        # This test will be enabled once validation is implemented
        pass

    @pytest.mark.skip(reason="Validation function not yet implemented")
    def test_validation_strict_mode(self):
        """Test validation in strict mode."""
        # This test will be enabled once validation is implemented
        pass
@jakelorocco (Contributor)

Can you expand on what this means for these two tests? Is there additional validation logic that will be enabled later?

@akihikokuroda (Contributor, Author)

I can take them out for now. The intention is that these two skipped tests are placeholders for future integration tests that would verify the validate_tool_arguments function works correctly in real-world scenarios:

  1. test_validation_with_coercion
    Purpose: Test that the validation function properly coerces types when coerce_types=True is enabled.

    What it would test: When LLMs return tool arguments as strings (e.g., "30" instead of 30), this test would verify that the validation function automatically converts them to the correct types (int, float, bool, etc.) before passing them to the tool function.

    Example scenario: An LLM calls a tool with {"age": "30", "score": "95.5"} but the tool expects int and float. The validation should coerce these strings to proper numeric types.

  2. test_validation_strict_mode
    Purpose: Test that strict validation mode properly raises errors when arguments don't match the expected types.

    What it would test: When strict=True, the validation function should raise ValidationError for type mismatches or missing required parameters, rather than silently returning the original arguments.

    Example scenario: An LLM provides {"age": "not_a_number"} when an int is required. In strict mode, this should raise an error immediately rather than passing invalid data to the tool. (A short sketch of both behaviors follows this list.)
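A rough sketch of the two behaviors described above, assuming validate_tool_arguments is imported from wherever the PR defines it; the tool function, argument values, and exact exception type below are illustrative, not code from this PR:

# Illustrative only; lookup_user is a made-up tool function.
def lookup_user(age: int, score: float) -> str:
    return f"user aged {age} with score {score}"

# Coercion: string-typed arguments from the LLM get converted to the annotated types.
validated = validate_tool_arguments(
    lookup_user, {"age": "30", "score": "95.5"}, coerce_types=True
)
assert validated == {"age": 30, "score": 95.5}

# Strict mode: an un-coercible value raises instead of being passed through.
try:
    validate_tool_arguments(lookup_user, {"age": "not_a_number"}, strict=True)
except Exception as err:  # the PR may raise a more specific ValidationError
    print(f"validation failed: {err}")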

Current Status: Both tests are skipped because the validate_tool_arguments function implementation is not yet complete. However, the file contains extensive working tests (lines 91-356) that already cover these scenarios in detail, suggesting the validation function may actually be implemented. The skipped tests at the top appear to be legacy placeholders that should either be:

  • Removed (if the functionality is already tested elsewhere)
  • Updated to test specific integration scenarios not covered by existing tests

@jakelorocco (Contributor)


Okay, I think we should go ahead and remove these for now then. I believe our tests that check for tool call requests from a model also call those tools to ensure they function properly.

Comment on lines +607 to +612
# Validate and coerce argument types
validated_args = validate_tool_arguments(func, args, strict=False)
model_tool_calls[tool_name] = ModelToolCall(
    tool_name, func, validated_args
)
@jakelorocco (Contributor)

Were you able to verify that these backends can generate incorrect parameters? I know huggingface has some issues with its tool requests that could benefit from this, but I'm not sure the other backends need any validation (but that was an open question I wasn't certain on).

@akihikokuroda (Contributor, Author)

I don't remember whether I saw the backend generate incorrect parameters, but it is good to validate them before making tool calls. We cannot assume all backends work correctly :-)

Comment on lines +147 to +221
def validate_tool_arguments(
    func: Callable,
    args: Mapping[str, Any],
    *,
    coerce_types: bool = True,
    strict: bool = False,
) -> dict[str, Any]:
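The diff only shows the signature. As context for the discussion below, here is a minimal sketch of what a signature-based validator with this shape could do, built only on inspect and typing; this is an illustration, not the implementation in this PR, and the exception type is assumed:

import inspect
from collections.abc import Callable, Mapping
from typing import Any, get_type_hints


def validate_tool_arguments_sketch(
    func: Callable,
    args: Mapping[str, Any],
    *,
    coerce_types: bool = True,
    strict: bool = False,
) -> dict[str, Any]:
    """Best-effort check/coercion of tool arguments against a function signature."""
    hints = get_type_hints(func)
    validated: dict[str, Any] = dict(args)
    for name, param in inspect.signature(func).parameters.items():
        if name not in args:
            if param.default is inspect.Parameter.empty and strict:
                raise ValueError(f"missing required argument: {name}")
            continue
        expected = hints.get(name)
        if not isinstance(expected, type):
            continue  # skip unannotated params and generics in this sketch
        value = args[name]
        if isinstance(value, expected):
            continue
        if coerce_types:
            try:
                validated[name] = expected(value)  # naive; e.g. bool("false") is still True
                continue
            except (TypeError, ValueError):
                pass
        if strict:
            raise ValueError(f"argument {name!r} is not of type {expected.__name__}")
    return validated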
@jakelorocco (Contributor)

I looked at the changes for this PR and Hendrik's; if Hendrik's gets merged, first you'll just have to change this func: Callable to be a func: MelleaTool since not all of our tools going forward will be simple functions. Then the validation becomes a bit trickier but MelleaTool.as_json_tool should give you the parameter names and types that matter.

@akihikokuroda (Contributor, Author)

I saw the PR. Right now, the parameter information comes from the function signature only. I understand that some refactoring will be necessary.
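If that refactor lands, the same checks could be driven by the tool's JSON schema instead of the Python signature. A rough sketch, assuming as_json_tool returns an OpenAI-style parameters block; the exact schema shape in #380 may differ:

# Hypothetical adaptation for the MelleaTool case; the schema layout below
# ("parameters"/"properties"/"required") is an assumption, not taken from #380.
_JSON_TO_PY = {"integer": int, "number": float, "boolean": bool, "string": str}


def validate_from_json_tool(tool, args: dict, *, strict: bool = False) -> dict:
    schema = tool.as_json_tool()  # assumed to expose parameter names and types
    params = schema.get("parameters", {})
    validated = dict(args)
    for name in params.get("required", []):
        if name not in args and strict:
            raise ValueError(f"missing required argument: {name}")
    for name, spec in params.get("properties", {}).items():
        if name not in args:
            continue
        expected = _JSON_TO_PY.get(spec.get("type"))
        if expected is None or isinstance(args[name], expected):
            continue
        try:
            validated[name] = expected(args[name])
        except (TypeError, ValueError):
            if strict:
                raise ValueError(
                    f"argument {name!r} is not of type {spec.get('type')}"
                )
    return validated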

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>


Development

Successfully merging this pull request may close these issues.

improve handling of generated inputs for tool calls
