LLM Tools Test Results

Highscores

Performance Rankings (Duration)

| Test | Model | Duration (ms) | Duration (s) |
|------|-------|---------------|--------------|
| equation_solving | openai/gpt-4o | 3300 | 3.30 |
| file_operations | openai/gpt-4o | 4394 | 4.39 |
| directory_listing | openai/gpt-4o | 8852 | 8.85 |

Summary

  • Total Tests: 3
  • Passed: 0
  • Failed: 3
  • Success Rate: 0.00%
  • Average Duration: 5515ms (5.52s)
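
The summary figures can be reproduced from the three durations in the rankings. The following is a minimal sketch, not the report generator's actual code; the variable names are hypothetical and the rounding rule (nearest millisecond, two decimals for seconds) is an assumption that happens to match the values shown.

```python
# Durations in milliseconds, copied from the performance rankings.
durations_ms = {
    "equation_solving": 3300,
    "file_operations": 4394,
    "directory_listing": 8852,
}
passed = 0  # per the summary: no test passed

total = len(durations_ms)
failed = total - passed
success_rate = 100.0 * passed / total
average_ms = sum(durations_ms.values()) / total

# Assumed formatting: whole milliseconds, seconds with two decimals.
summary = (
    f"Total Tests: {total}, Passed: {passed}, Failed: {failed}, "
    f"Success Rate: {success_rate:.2f}%, "
    f"Average Duration: {round(average_ms)}ms ({average_ms / 1000:.2f}s)"
)
print(summary)
```

The total of 16546 ms over 3 tests gives roughly 5515.33 ms, which is consistent with the reported 5515ms (5.52s).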

Failed Tests

equation_solving - openai/gpt-4o

  • Prompt: Read the file at C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\units\tools.test.md and solve all equations. Return the results in the specified JSON format.
  • Expected: [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}]
  • Actual: It seems there was an issue fetching the content from the specified file path. Could you please check if the path is correct and provide any necessary access permissions?
  • Duration: 3300ms (3.30s)
  • Reason: Expected [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}], but got it seems there was an issue fetching the content from the specified file path. could you please check if the path is correct and provide any necessary access permissions?
  • Timestamp: 6/5/2025, 8:46:10 PM
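
The expected results themselves check out; the failure here was in reading the file, not in the arithmetic. It may also be worth noting that this prompt's path contains `tests\units`, while the other two prompts use `tests\unit`. As a sanity check on the expected values, here is a small sketch that solves equations of the form `a·v ± b = c`. The `solve_linear` helper and its regular expression are hypothetical and only cover the three equations listed.

```python
import re
from fractions import Fraction

# Expected values copied from the report.
EXPECTED = [
    {"equation": "2x + 5 = 13", "result": "4"},
    {"equation": "3y - 7 = 20", "result": "9"},
    {"equation": "4z + 8 = 32", "result": "6"},
]

# Matches only the simple shape "<a><var> <+|-> <b> = <c>".
PATTERN = re.compile(r"^(\d+)[a-z] ([+-]) (\d+) = (\d+)$")


def solve_linear(equation: str) -> str:
    """Solve a*v + b = c (or a*v - b = c) for v and return it as a string."""
    match = PATTERN.match(equation)
    if match is None:
        raise ValueError(f"unsupported equation shape: {equation!r}")
    a, sign, b, c = match.groups()
    offset = int(b) if sign == "+" else -int(b)
    # Fraction keeps the result exact; whole numbers print without a denominator.
    return str(Fraction(int(c) - offset, int(a)))


results = [solve_linear(item["equation"]) for item in EXPECTED]
print(results)
```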

file_operations - openai/gpt-4o

  • Prompt: Write the following data to C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data\test-data.json and then read it back: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}. Return the read data in JSON format.
  • Expected: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}
  • Actual: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}
  • Duration: 4394ms (4.39s)
  • Reason: Expected {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}, but got {"test":"data","timestamp":"2025-06-05t18:46:10.027z"}
  • Timestamp: 6/5/2025, 8:46:14 PM
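
Expected and Actual are identical here, yet the Reason line shows the actual value in lowercase (`t` and `z` in the timestamp). This suggests, as an inference from the report rather than from the harness source, that only one side was lowercased before a string comparison. A minimal sketch of a value-based comparison that avoids this follows; `json_equal` is a hypothetical helper, not part of the test suite.

```python
import json


def json_equal(expected: str, actual: str) -> bool:
    """Compare two JSON documents by parsed value, ignoring whitespace and key order."""
    try:
        return json.loads(expected) == json.loads(actual)
    except json.JSONDecodeError:
        # Non-JSON replies (for example an apology from the model) never match.
        return False


expected = '{"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}'
actual = '{"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}'

# One-sided lowercasing, as the Reason line appears to show: spurious mismatch.
naive_match = expected == actual.lower()
# Comparison on parsed values: the two documents are equal.
structural_match = json_equal(expected, actual)

print(naive_match, structural_match)
```

If case-insensitive matching is intended, normalizing both sides the same way would also work, though comparing parsed values is stricter about content and more tolerant of formatting.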

directory_listing - openai/gpt-4o

  • Prompt: List all files in the directory C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data. Return the list as a JSON array of filenames.
  • Expected: []
  • Actual: It seems like there's a persistent issue with accessing the directory. Let's ensure the path is correctly accessible or try using a specific pattern if it applies. Would you like to specify a file pattern, or should I continue attempting to access the directory?
  • Duration: 8852ms (8.85s)
  • Reason: Expected [], but got it seems like there's a persistent issue with accessing the directory. let's ensure the path is correctly accessible or try using a specific pattern if it applies. would you like to specify a file pattern, or should i continue attempting to access the directory?
  • Timestamp: 6/5/2025, 8:46:23 PM
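
For reference, the output shape this test expects (a JSON array of filenames, `[]` for an empty directory) can be sketched as below. This is an illustration only: `list_files_as_json` is a hypothetical helper, and a temporary directory stands in for the `test-data` path from the prompt.

```python
import json
import os
import tempfile


def list_files_as_json(directory: str) -> str:
    """Return the names of regular files in `directory` as a JSON array string."""
    names = sorted(
        entry.name for entry in os.scandir(directory) if entry.is_file()
    )
    return json.dumps(names)


# A fresh temporary directory is empty, so the listing is an empty array.
with tempfile.TemporaryDirectory() as empty_dir:
    listing = list_files_as_json(empty_dir)

print(listing)
```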

Passed Tests

No passed tests