mono/packages/kbot/tests/unit/reports/llama-tools.md
2026-03-19 18:40:35 +01:00

Llama Local Runner — Tool Quality Test Results

High Scores

Performance Rankings (Duration)

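| Test           | Model   | Duration (ms) | Duration (s) |
|----------------|---------|---------------|--------------|
| tool-add       | default | 12126         | 12.13        |
| tool-multiply  | default | 10678         | 10.68        |
| tool-weather   | default | 10144         | 10.14        |
| tool-selection | default | 15522         | 15.52        |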

Summary

  • Total Tests: 4
  • Passed: 4
  • Failed: 0
  • Success Rate: 100.00%
  • Average Duration: 12118ms (12.12s)
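
The summary figures can be reproduced from the per-test durations listed in this report. The following is a minimal sketch, not the runner's actual code: the `summarize` helper and the `results` array are hypothetical names introduced for illustration, and rounding the average to the nearest millisecond is an assumption that happens to match the reported value.

```typescript
// Durations in milliseconds, copied from the per-test entries in this report.
const results: { test: string; ms: number; passed: boolean }[] = [
  { test: "tool-add", ms: 12126, passed: true },
  { test: "tool-multiply", ms: 10678, passed: true },
  { test: "tool-weather", ms: 10144, passed: true },
  { test: "tool-selection", ms: 15522, passed: true },
];

// Hypothetical helper (not taken from the test runner's source).
function summarize(rows: { ms: number; passed: boolean }[]) {
  const total = rows.length;
  const passed = rows.filter((r) => r.passed).length;
  const sumMs = rows.reduce((acc, r) => acc + r.ms, 0);
  // Assumption: the average is rounded to the nearest millisecond.
  const averageMs = Math.round(sumMs / total);
  return {
    total,
    passed,
    failed: total - passed,
    successRate: ((passed / total) * 100).toFixed(2) + "%",
    averageMs,
    averageSec: (averageMs / 1000).toFixed(2),
  };
}

const summary = summarize(results);
console.log(summary);
```

The durations sum to 48470 ms, so the mean is 12117.5 ms, which rounds to the 12118 ms shown above.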

Failed Tests

No failed tests

Passed Tests

tool-add - default

  • Prompt: Use the add tool to calculate 15 plus 27. Return the result.
  • Expected: 42
  • Actual: The sum of 15 and 27 is 42.
  • Duration: 12126ms (12.13s)
  • Timestamp: 3/19/2026, 6:39:17 PM

tool-multiply - default

  • Prompt: Use the multiply tool to calculate 8 times 9. Return the result.
  • Expected: 72
  • Actual: The result of 8 times 9 is 72.
  • Duration: 10678ms (10.68s)
  • Timestamp: 3/19/2026, 6:39:27 PM

tool-weather - default

  • Prompt: What is the weather like in Berlin? Use the get_weather tool.
  • Expected: berlin
  • Actual: The current temperature in Berlin is 18 degrees Celsius and the weather condition is cloudy.
  • Duration: 10144ms (10.14s)
  • Timestamp: 3/19/2026, 6:39:38 PM

tool-selection - default

  • Prompt: Use the add tool to calculate 100 plus 200. Do not use any other tool.
  • Expected: 300
  • Actual: The sum of 100 and 200 is 300.
  • Duration: 15522ms (15.52s)
  • Timestamp: 3/19/2026, 6:39:53 PM
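
The report does not say how each Expected value is compared with the Actual response. The four entries are at least consistent with a case-insensitive substring check (for example, `berlin` against a response containing "Berlin"). The sketch below works under that assumption only; `containsExpected` is a hypothetical helper, not a function from the runner.

```typescript
// Assumption: a test passes when the response contains the expected
// string, ignoring case. The report itself does not confirm this rule.
function containsExpected(actual: string, expected: string): boolean {
  return actual.toLowerCase().includes(expected.toLowerCase());
}

// Expected/Actual pairs copied from the passed tests in this report.
const cases: { test: string; expected: string; actual: string }[] = [
  { test: "tool-add", expected: "42", actual: "The sum of 15 and 27 is 42." },
  { test: "tool-multiply", expected: "72", actual: "The result of 8 times 9 is 72." },
  {
    test: "tool-weather",
    expected: "berlin",
    actual:
      "The current temperature in Berlin is 18 degrees Celsius and the weather condition is cloudy.",
  },
  { test: "tool-selection", expected: "300", actual: "The sum of 100 and 200 is 300." },
];

const outcomes = cases.map((c) => ({
  test: c.test,
  passed: containsExpected(c.actual, c.expected),
}));
console.log(outcomes);
```

A check like this only inspects the final text. It would not reveal whether the model actually called the requested tool, which matters for prompts such as tool-selection ("Do not use any other tool").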