feat: add tool calling argument validation #364
Conversation
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit
Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
The PR description has been updated. Please fill out the template for your PR to be reviewed.

@akihikokuroda looks like there's a test failure (and needs a rebase on main)

otherwise, I think this LGTM
98465c2 to 3337332
Saw that #380 was opened yesterday, so should confirm with @HendrikStrobelt that the approach here is compatible with that PR

@psschwei Minor merges are necessary because the same files are changed, but the two PRs look compatible.
jakelorocco left a comment
Looks good! A few questions / concerns.
```python
@pytest.mark.integration
class TestToolValidationIntegration:
    """Integration tests that would use actual validation function."""

    @pytest.mark.skip(reason="Validation function not yet implemented")
    def test_validation_with_coercion(self):
        """Test validation with type coercion enabled."""
        # This test will be enabled once validation is implemented
        pass

    @pytest.mark.skip(reason="Validation function not yet implemented")
    def test_validation_strict_mode(self):
        """Test validation in strict mode."""
        # This test will be enabled once validation is implemented
        pass
```
Can you expand on what this means for these two tests? Is there additional validation logic that will be enabled later?
I can take them out for now. The intention is that these two skipped tests are placeholders for future integration tests that would verify the validate_tool_arguments function works correctly in real-world scenarios (see the sketch after this list):

- test_validation_with_coercion
  - Purpose: test that the validation function properly coerces types when coerce_types=True is enabled.
  - What it would test: when LLMs return tool arguments as strings (e.g., "30" instead of 30), this test would verify that the validation function automatically converts them to the correct types (int, float, bool, etc.) before passing them to the tool function.
  - Example scenario: an LLM calls a tool with {"age": "30", "score": "95.5"} but the tool expects int and float. The validation should coerce these strings to proper numeric types.
- test_validation_strict_mode
  - Purpose: test that strict validation mode properly raises errors when arguments don't match expected types.
  - What it would test: when strict=True, the validation function should raise ValidationError for type mismatches or missing required parameters, rather than silently returning the original arguments.
  - Example scenario: an LLM provides {"age": "not_a_number"} when an int is required. In strict mode, this should raise an error immediately rather than passing invalid data to the tool.

Current status: both tests are skipped because the validate_tool_arguments function implementation was not yet complete when they were written. However, the file contains extensive working tests (lines 91-356) that already cover these scenarios in detail, suggesting the validation function may actually be implemented. The skipped tests at the top appear to be legacy placeholders that should either be:

- removed (if the functionality is already tested elsewhere), or
- updated to test specific integration scenarios not covered by existing tests.
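For concreteness, here is a minimal sketch of what the two tests might assert once enabled; the import path and the exact exception type are assumptions, not the PR's actual API:

```python
import pytest

# Assumed import path; the PR may expose the function elsewhere.
from mellea.backends.tools import validate_tool_arguments


def greet(name: str, age: int, score: float) -> str:
    return f"{name}, age {age}, score {score}"


def test_validation_with_coercion():
    # LLM-style string arguments should be coerced to the annotated types.
    args = {"name": "Ada", "age": "30", "score": "95.5"}
    validated = validate_tool_arguments(greet, args, coerce_types=True)
    assert validated == {"name": "Ada", "age": 30, "score": 95.5}


def test_validation_strict_mode():
    # An uncoercible value should raise in strict mode rather than being
    # passed through to the tool unchanged.
    args = {"name": "Ada", "age": "not_a_number", "score": 1.0}
    with pytest.raises(Exception):  # concrete error type is an assumption
        validate_tool_arguments(greet, args, strict=True)
```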
Okay, I think we should go ahead and remove these for now then. I believe our tests that check for tool call requests from a model also call those tools to ensure they function properly.
```python
# Validate and coerce argument types
validated_args = validate_tool_arguments(func, args, strict=False)
model_tool_calls[tool_name] = ModelToolCall(
    tool_name, func, validated_args
)
```
Were you able to verify that these backends can generate incorrect parameters? I know huggingface has some issues with its tool requests that could benefit from this, but I'm not sure the other backends need any validation (but that was an open question I wasn't certain on).
I don't remember whether I saw a backend generate incorrect parameters, but it is good to validate them before making tool calls. We cannot assume all backends work correctly :-)
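As a concrete illustration of the failure mode being guarded against, here is a small sketch; the tool and payload are made up, and the validate_tool_arguments call shape follows the diff above:

```python
import json


def book_flight(passengers: int, refundable: bool) -> str:
    return f"{passengers} seat(s), refundable={refundable}"


# Some backends deliver tool-call arguments as JSON in which the model
# quoted every value, so numbers and booleans arrive as strings.
raw_args = json.loads('{"passengers": "2", "refundable": "true"}')

# Without validation the tool receives the wrong types:
#   book_flight(**raw_args)  ->  passengers == "2" (str), refundable == "true" (str)
#
# With the validation added in this PR, arguments are checked (and, in
# non-strict mode, coerced) before the call:
#   args = validate_tool_arguments(book_flight, raw_args, strict=False)
#   book_flight(**args)      ->  passengers == 2, refundable is True
```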
```python
def validate_tool_arguments(
    func: Callable,
    args: Mapping[str, Any],
    *,
    coerce_types: bool = True,
    strict: bool = False,
) -> dict[str, Any]:
```
I looked at the changes for this PR and Hendrik's; if Hendrik's gets merged, first you'll just have to change this func: Callable to be a func: MelleaTool since not all of our tools going forward will be simple functions. Then the validation becomes a bit trickier but MelleaTool.as_json_tool should give you the parameter names and types that matter.
I saw the PR. Right now, the parameter information comes from the function signature only. I understand that some refactoring will be necessary.
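For reference, since the PR's function body is not shown here, a purely signature-based version of this function might look roughly like the sketch below, built on inspect.signature and typing.get_type_hints. This is an illustration of the approach being discussed, not the PR's actual implementation:

```python
import inspect
from collections.abc import Callable, Mapping
from typing import Any, get_type_hints


def validate_tool_arguments(
    func: Callable,
    args: Mapping[str, Any],
    *,
    coerce_types: bool = True,
    strict: bool = False,
) -> dict[str, Any]:
    """Validate tool-call arguments against func's signature, coercing simple types."""
    sig = inspect.signature(func)
    hints = get_type_hints(func)
    validated: dict[str, Any] = {}

    for name, value in args.items():
        if name not in sig.parameters:
            if strict:
                raise TypeError(f"unexpected argument: {name!r}")
            continue  # lenient mode: drop arguments the tool does not accept
        expected = hints.get(name)
        if coerce_types and expected in (int, float, bool, str) and not isinstance(value, expected):
            try:
                if expected is bool and isinstance(value, str):
                    value = value.strip().lower() in ("true", "1", "yes")
                else:
                    value = expected(value)
            except (TypeError, ValueError):
                if strict:
                    raise
        validated[name] = value

    # In strict mode, missing required parameters are an error too.
    if strict:
        for name, param in sig.parameters.items():
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            if param.default is inspect.Parameter.empty and name not in validated:
                raise TypeError(f"missing required argument: {name!r}")
    return validated
```

If Hendrik's PR lands, the same loop could presumably read parameter names and types from the MelleaTool.as_json_tool() output instead of the raw signature.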
b15c99c to 36745d8
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
7522672 to 2a0b2aa