LLM Tools Test Results
Highscores
Performance Rankings (Duration)
| Test | Model | Duration (ms) | Duration (s) |
|---|---|---|---|
| equation_solving | openai/gpt-4o | 3925 | 3.92 |
| file_operations | openai/gpt-4o | 4204 | 4.20 |
| directory_listing | openai/gpt-4o | 5277 | 5.28 |
Summary
- Total Tests: 3
- Passed: 0
- Failed: 3
- Success Rate: 0.00%
- Average Duration: 4469ms (4.47s)
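The summary figures follow from the three rows of the rankings table. A minimal Python sketch of that arithmetic, assuming a hypothetical `summarize` helper and a simple `(test, passed, duration_ms)` record layout (the report does not show how the harness stores its results):

```python
from statistics import mean

# Hypothetical result records: (test name, passed?, duration in ms),
# with durations copied from the rankings table.
results = [
    ("equation_solving", False, 3925),
    ("file_operations", False, 4204),
    ("directory_listing", False, 5277),
]


def summarize(results):
    """Compute counts, success rate (percent), and average duration (ms)."""
    total = len(results)
    passed = sum(1 for _, ok, _ in results if ok)
    durations = [duration for _, _, duration in results]
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "success_rate": (100.0 * passed / total) if total else 0.0,
        "avg_ms": round(mean(durations)) if durations else 0,
    }


summary = summarize(results)
print(summary)
```

The three durations sum to 13406 ms, so the rounded average is 4469 ms, matching the value reported above.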
Failed Tests
equation_solving - openai/gpt-4o
- Prompt: Read the file at C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\units\tools.test.md and solve all equations. Return the results in the specified JSON format.
- Expected: [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}]
- Actual: It seems I encountered an unexpected error while trying to read the file. Unfortunately, the exact content of the file isn't available. Is there anything else you would like to try or know?
- Duration: 3925ms (3.92s)
- Reason: Expected [{"equation":"2x + 5 = 13","result":"4"},{"equation":"3y - 7 = 20","result":"9"},{"equation":"4z + 8 = 32","result":"6"}], but got it seems i encountered an unexpected error while trying to read the file. unfortunately, the exact content of the file isn't available. is there anything else you would like to try or know?
- Timestamp: 4/18/2025, 9:47:24 AM
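The expected results for this test can be checked by solving each equation directly. A short Python sketch, where the `(a, b, c)` encoding of `a*x + b = c` and the `solve_linear` helper are illustrative assumptions and not part of the test harness:

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c for x using exact rational arithmetic."""
    if a == 0:
        raise ValueError("coefficient a must be non-zero")
    return Fraction(c - b, a)


# Equations from the Expected field, encoded as (a, b, c) for a*x + b = c.
equations = {
    "2x + 5 = 13": (2, 5, 13),
    "3y - 7 = 20": (3, -7, 20),
    "4z + 8 = 32": (4, 8, 32),
}

solutions = {eq: solve_linear(*coeffs) for eq, coeffs in equations.items()}
for eq, value in solutions.items():
    print(f"{eq}: {value}")
```

The computed values are 4, 9, and 6, which agree with the Expected field.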
file_operations - openai/gpt-4o
- Prompt: Write the following data to C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data\test-data.json and then read it back: {"test":"data","timestamp":"2025-04-18T07:47:24.209Z"}. Return the read data in JSON format.
- Expected: {"test":"data","timestamp":"2025-04-18T07:47:24.209Z"}
- Actual: {\"test\":\"data\",\"timestamp\":\"2025-04-18T07:47:24.209Z\"}
- Duration: 4204ms (4.20s)
- Reason: Expected {"test":"data","timestamp":"2025-04-18T07:47:24.209Z"}, but got {"test":"data","timestamp":"2025-04-18t07:47:24.209z"}
- Timestamp: 4/18/2025, 9:47:28 AM
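In the Reason line for this test, the actual value appears lowercased (`2025-04-18t07:47:24.209z`) while the expected value keeps its original case, which suggests the two sides may be normalized differently before a string comparison. How the harness actually compares results is not shown in this report. As a sketch of one alternative, assuming a hypothetical `json_equal` helper, both sides could be parsed and compared as JSON values instead of strings:

```python
import json


def json_equal(expected: str, actual: str) -> bool:
    """Return True when both strings parse to the same JSON value.

    Whitespace and key order are ignored; string values stay case-sensitive.
    Text that is not valid JSON never matches.
    """
    try:
        return json.loads(expected) == json.loads(actual)
    except json.JSONDecodeError:
        return False


expected = '{"test":"data","timestamp":"2025-04-18T07:47:24.209Z"}'

# Same payload with different spacing and key order (illustrative input).
reordered = '{ "timestamp": "2025-04-18T07:47:24.209Z", "test": "data" }'

print(json_equal(expected, reordered))         # same JSON value
print(json_equal(expected, expected.lower()))  # lowercasing changes the timestamp
print(json_equal(expected, "It seems there is an issue"))  # prose, not JSON
```

With this approach, formatting differences do not cause a failure, but lowercasing only one side still does, because it changes the timestamp string itself.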
directory_listing - openai/gpt-4o
- Prompt: List all files in the directory C:\Users\zx\Desktop\polymech\polymech-mono\packages\kbot\tests\unit\test-data. Return the list as a JSON array of filenames.
- Expected: []
- Actual: It seems there is an issue with accessing the files in the specified directory. Please ensure that the directory path is correct and try again. If there's anything else you'd like to check or update, let me know.
- Duration: 5277ms (5.28s)
- Reason: Expected [], but got it seems there is an issue with accessing the files in the specified directory. please ensure that the directory path is correct and try again. if there's anything else you'd like to check or update, let me know.
- Timestamp: 4/18/2025, 9:47:33 AM
Passed Tests
No passed tests