Llama Local Runner — Tool Quality Test Results
Highscores
Performance Rankings (Duration)
| Test | Model | Duration (ms) | Duration (s) |
|---|---|---|---|
| tool-add | default | 12126 | 12.13 |
| tool-multiply | default | 10678 | 10.68 |
| tool-weather | default | 10144 | 10.14 |
| tool-selection | default | 15522 | 15.52 |
Summary
- Total Tests: 4
- Passed: 4
- Failed: 0
- Success Rate: 100.00%
- Average Duration: 12118ms (12.12s)
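The summary figures follow directly from the per-test durations in the rankings table. The sketch below recomputes them as a sanity check. The names `durations_ms` and `summarize` are hypothetical, and rounding the average to the nearest millisecond (half up) is an assumption about how the runner produced `12118ms`, not something the report states.

```python
from statistics import mean

# Durations in milliseconds, copied from the rankings table.
durations_ms = {
    "tool-add": 12126,
    "tool-multiply": 10678,
    "tool-weather": 10144,
    "tool-selection": 15522,
}


def summarize(durations, passed_count):
    """Recompute the summary block from raw durations (hypothetical helper)."""
    total = len(durations)
    # Assumption: the average is rounded half up to a whole millisecond.
    avg_ms = int(mean(durations.values()) + 0.5)
    return {
        "total": total,
        "passed": passed_count,
        "failed": total - passed_count,
        "success_rate": f"{100 * passed_count / total:.2f}%",
        "avg_ms": avg_ms,
    }


summary = summarize(durations_ms, passed_count=4)
print(summary)
```

The unrounded mean is 48470 / 4 = 12117.5 ms, which matches the reported average once rounded.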
Failed Tests
No failed tests
Passed Tests
tool-add - default
- Prompt: Use the add tool to calculate 15 plus 27. Return the result.
- Expected: 42
- Actual: The sum of 15 and 27 is 42.
- Duration: 12126ms (12.13s)
- Timestamp: 3/19/2026, 6:39:17 PM
tool-multiply - default
- Prompt: Use the multiply tool to calculate 8 times 9. Return the result.
- Expected: 72
- Actual: The result of 8 times 9 is 72.
- Duration: 10678ms (10.68s)
- Timestamp: 3/19/2026, 6:39:27 PM
tool-weather - default
- Prompt: What is the weather like in Berlin? Use the get_weather tool.
- Expected: berlin
- Actual: The current temperature in Berlin is 18 degrees Celsius and the weather condition is cloudy.
- Duration: 10144ms (10.14s)
- Timestamp: 3/19/2026, 6:39:38 PM
tool-selection - default
- Prompt: Use the add tool to calculate 100 plus 200. Do not use any other tool.
- Expected: 300
- Actual: The sum of 100 and 200 is 300.
- Duration: 15522ms (15.52s)
- Timestamp: 3/19/2026, 6:39:53 PM
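The report does not say how Expected is compared with Actual. All four results are consistent with a case-insensitive substring check, since the expected value `berlin` matches an answer containing "Berlin". The sketch below illustrates that guess. The function `contains_expected` is hypothetical and is not taken from the runner.

```python
def contains_expected(expected: str, actual: str) -> bool:
    """Hypothetical pass criterion: case-insensitive substring match."""
    return expected.strip().lower() in actual.lower()


# (expected, actual) pairs copied from the passed tests above.
cases = [
    ("42", "The sum of 15 and 27 is 42."),
    ("72", "The result of 8 times 9 is 72."),
    ("berlin", "The current temperature in Berlin is 18 degrees Celsius "
               "and the weather condition is cloudy."),
    ("300", "The sum of 100 and 200 is 300."),
]

results = [contains_expected(expected, actual) for expected, actual in cases]
print(results)
```

A check of this kind inspects only the final answer text. It cannot confirm that a tool was actually called, or which tool was chosen in the tool-selection test.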