LLM Tools Test Results
Highscores
Performance Rankings (Duration)
| Test | Model | Duration (ms) | Duration (s) |
|---|---|---|---|
| equation_solving | openai/gpt-4o | 3300 | 3.30 |
| file_operations | openai/gpt-4o | 4394 | 4.39 |
| directory_listing | openai/gpt-4o | 8852 | 8.85 |
Summary
- Total Tests: 3
- Passed: 0
- Failed: 3
- Success Rate: 0.00%
- Average Duration: 5515ms (5.52s)
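The summary figures follow directly from the rankings table. A minimal sketch of the arithmetic, assuming the average is rounded to the nearest millisecond (the variable names are illustrative and not taken from the test harness):

```python
# Durations in milliseconds, copied from the rankings table.
durations_ms = {
    "equation_solving": 3300,
    "file_operations": 4394,
    "directory_listing": 8852,
}
passed = 0  # from the summary: no test passed

total = len(durations_ms)
success_rate = 100 * passed / total
average_ms = sum(durations_ms.values()) / total

print(f"Total Tests: {total}")
print(f"Success Rate: {success_rate:.2f}%")
print(f"Average Duration: {average_ms:.0f}ms ({average_ms / 1000:.2f}s)")
```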
Failed Tests
equation_solving - openai/gpt-4o
- Prompt: Read the file at C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\units\tools.test.md and solve all equations. Return the results in the specified JSON format.
- Expected: [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}]
- Actual: It seems there was an issue fetching the content from the specified file path. Could you please check if the path is correct and provide any necessary access permissions?
- Duration: 3300ms (3.30s)
- Reason: Expected [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}], but got it seems there was an issue fetching the content from the specified file path. could you please check if the path is correct and provide any necessary access permissions?
- Timestamp: 6/5/2025, 8:46:10 PM
file_operations - openai/gpt-4o
- Prompt: Write the following data to C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data\test-data.json and then read it back: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}. Return the read data in JSON format.
- Expected: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}
- Actual: {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}
- Duration: 4394ms (4.39s)
- Reason: Expected {"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}, but got {"test":"data","timestamp":"2025-06-05t18:46:10.027z"}
- Timestamp: 6/5/2025, 8:46:14 PM
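In this result the Expected and Actual values are identical, yet the Reason line shows the actual value lowercased (`2025-06-05t18:46:10.027z`) next to the expected value in its original case. One plausible explanation, which the report does not confirm, is that only one side is normalized before a string comparison. The sketch below reproduces that behaviour with two hypothetical helpers (`asymmetric_match` and `json_match`, neither taken from the harness) and shows a comparison on parsed JSON that avoids it.

```python
import json

expected = '{"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}'
actual = '{"test":"data","timestamp":"2025-06-05T18:46:10.027Z"}'


def asymmetric_match(expected: str, actual: str) -> bool:
    # Hypothetical reconstruction: only the actual value is lowercased,
    # so any uppercase character in the expected value breaks equality.
    return expected.strip() == actual.strip().lower()


def json_match(expected: str, actual: str) -> bool:
    # Compare parsed values: string case is preserved, and whitespace
    # or key order in the serialized form does not matter.
    try:
        return json.loads(expected) == json.loads(actual)
    except json.JSONDecodeError:
        return False


print(asymmetric_match(expected, actual))  # False
print(json_match(expected, actual))        # True
```

If case-insensitive matching is intended, applying the same normalization to both sides would also make this test pass.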
directory_listing - openai/gpt-4o
- Prompt: List all files in the directory C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data. Return the list as a JSON array of filenames.
- Expected: []
- Actual: It seems like there's a persistent issue with accessing the directory. Let's ensure the path is correctly accessible or try using a specific pattern if it applies. Would you like to specify a file pattern, or should I continue attempting to access the directory?
- Duration: 8852ms (8.85s)
- Reason: Expected [], but got it seems like there's a persistent issue with accessing the directory. let's ensure the path is correctly accessible or try using a specific pattern if it applies. would you like to specify a file pattern, or should i continue attempting to access the directory?
- Timestamp: 6/5/2025, 8:46:23 PM
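The equation_solving and directory_listing failures are prose replies rather than JSON, which is a different kind of failure from a JSON reply with the wrong content. A rough sketch of how a report could separate the two, assuming a hypothetical `classify` helper that is not part of the harness (the `["a.txt"]` input is made up for illustration):

```python
import json


def classify(expected: str, actual: str) -> str:
    """Return 'pass', 'mismatch', or 'not_json' for one test result."""
    try:
        actual_value = json.loads(actual)
    except json.JSONDecodeError:
        # The model answered in prose instead of returning JSON.
        return "not_json"
    return "pass" if actual_value == json.loads(expected) else "mismatch"


prose = "It seems like there's a persistent issue with accessing the directory."
print(classify("[]", prose))        # not_json
print(classify("[]", '["a.txt"]'))  # mismatch
print(classify("[]", "[]"))         # pass
```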
Passed Tests
No passed tests