Basic Operations Test Results
Highscores
Performance Rankings (Duration)
| Test | Model | Duration (ms) | Duration (s) |
|---|---|---|---|
| addition | openrouter/quasar-alpha | 811 | 0.81 |
| addition | openai/gpt-4o-mini | 842 | 0.84 |
| addition | openai/gpt-3.5-turbo | 1505 | 1.50 |
| addition | deepseek/deepseek-r1-distill-qwen-14b:free | 3470 | 3.47 |
| multiplication | openrouter/quasar-alpha | 780 | 0.78 |
| multiplication | openai/gpt-3.5-turbo | 881 | 0.88 |
| multiplication | openai/gpt-4o-mini | 1096 | 1.10 |
| multiplication | deepseek/deepseek-r1-distill-qwen-14b:free | 1327 | 1.33 |
| division | openrouter/quasar-alpha | 731 | 0.73 |
| division | openai/gpt-3.5-turbo | 784 | 0.78 |
| division | openai/gpt-4o-mini | 975 | 0.97 |
| division | deepseek/deepseek-r1-distill-qwen-14b:free | 4467 | 4.47 |
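The ordering in the table above (fastest model first within each test) can be reproduced with a short sketch. The report does not show how the rankings were generated, so the function name `rank_by_duration` and the row layout are hypothetical; the sample durations are the multiplication rows from the table.

```python
from collections import defaultdict

# (test, model, duration_ms) rows in arbitrary order; values copied from the table.
rows = [
    ("multiplication", "openai/gpt-4o-mini", 1096),
    ("multiplication", "openrouter/quasar-alpha", 780),
    ("multiplication", "deepseek/deepseek-r1-distill-qwen-14b:free", 1327),
    ("multiplication", "openai/gpt-3.5-turbo", 881),
]


def rank_by_duration(rows):
    """Group rows by test, then sort each group by ascending duration."""
    groups = defaultdict(list)  # keeps tests in first-seen order
    for test, model, ms in rows:
        groups[test].append((model, ms))
    return {test: sorted(entries, key=lambda e: e[1])
            for test, entries in groups.items()}


ranking = rank_by_duration(rows)

# Emit table rows in the same four-column layout as the report.
for test, entries in ranking.items():
    for model, ms in entries:
        print(f"| {test} | {model} | {ms} | {ms / 1000:.2f} |")
```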
Summary
- Total Tests: 12
- Passed: 11
- Failed: 1
- Success Rate: 91.67%
- Average Duration: 1472ms (1.47s)
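The summary figures follow directly from the twelve durations in the rankings table. The sketch below recomputes them as a cross-check; the `summarize` helper and the tuple layout are assumptions for illustration, not the test harness's actual code.

```python
# (test, model, duration_ms, passed) for all twelve runs in this report.
results = [
    ("addition", "openrouter/quasar-alpha", 811, True),
    ("addition", "openai/gpt-4o-mini", 842, True),
    ("addition", "openai/gpt-3.5-turbo", 1505, True),
    ("addition", "deepseek/deepseek-r1-distill-qwen-14b:free", 3470, True),
    ("multiplication", "openrouter/quasar-alpha", 780, True),
    ("multiplication", "openai/gpt-3.5-turbo", 881, True),
    ("multiplication", "openai/gpt-4o-mini", 1096, True),
    ("multiplication", "deepseek/deepseek-r1-distill-qwen-14b:free", 1327, False),
    ("division", "openrouter/quasar-alpha", 731, True),
    ("division", "openai/gpt-3.5-turbo", 784, True),
    ("division", "openai/gpt-4o-mini", 975, True),
    ("division", "deepseek/deepseek-r1-distill-qwen-14b:free", 4467, True),
]


def summarize(results):
    """Return totals, success rate (percent), and mean duration (ms)."""
    total = len(results)
    passed = sum(1 for r in results if r[3])
    average_ms = sum(r[2] for r in results) / total
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "success_rate": round(100 * passed / total, 2),
        "average_ms": round(average_ms),
    }


summary = summarize(results)
print(summary)
```

Note that the average includes the failed run's 1327 ms; excluding it would change the figure.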
Failed Tests
multiplication - deepseek/deepseek-r1-distill-qwen-14b:free
- Prompt: `multiply 8 and 3. Return only the number, no explanation.`
- Expected: `24`
- Actual: ``
- Duration: 1327ms (1.33s)
- Reason: Model returned empty response
- Timestamp: 4/4/2025, 2:37:03 PM
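A check that distinguishes an empty response from a wrong answer might look like the sketch below. The harness's real comparison logic is not shown in this report, so `check_response`, its whitespace trimming, and the mismatch message are all assumptions; only the "Model returned empty response" reason string comes from the report.

```python
def check_response(actual, expected):
    """Return (passed, reason); reason is None when the test passes."""
    cleaned = actual.strip()  # assumption: surrounding whitespace is ignored
    if not cleaned:
        return False, "Model returned empty response"
    if cleaned != expected:
        # Hypothetical message; the report contains no mismatch example.
        return False, f"Expected {expected!r}, got {cleaned!r}"
    return True, None


# The failed multiplication run: empty output against an expected "24".
empty_case = check_response("", "24")
# A passing run, with a trailing newline the model might add.
ok_case = check_response("24\n", "24")
print(empty_case, ok_case)
```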
Passed Tests
addition - openai/gpt-3.5-turbo
- Prompt: `add 5 and 3. Return only the number, no explanation.`
- Expected: `8`
- Actual: `8`
- Duration: 1505ms (1.50s)
- Timestamp: 4/4/2025, 2:36:55 PM
addition - deepseek/deepseek-r1-distill-qwen-14b:free
- Prompt: `add 5 and 3. Return only the number, no explanation.`
- Expected: `8`
- Actual: `8`
- Duration: 3470ms (3.47s)
- Timestamp: 4/4/2025, 2:36:59 PM
addition - openai/gpt-4o-mini
- Prompt: `add 5 and 3. Return only the number, no explanation.`
- Expected: `8`
- Actual: `8`
- Duration: 842ms (0.84s)
- Timestamp: 4/4/2025, 2:37:00 PM
addition - openrouter/quasar-alpha
- Prompt: `add 5 and 3. Return only the number, no explanation.`
- Expected: `8`
- Actual: `8`
- Duration: 811ms (0.81s)
- Timestamp: 4/4/2025, 2:37:00 PM
multiplication - openai/gpt-3.5-turbo
- Prompt: `multiply 8 and 3. Return only the number, no explanation.`
- Expected: `24`
- Actual: `24`
- Duration: 881ms (0.88s)
- Timestamp: 4/4/2025, 2:37:01 PM
multiplication - openai/gpt-4o-mini
- Prompt: `multiply 8 and 3. Return only the number, no explanation.`
- Expected: `24`
- Actual: `24`
- Duration: 1096ms (1.10s)
- Timestamp: 4/4/2025, 2:37:04 PM
multiplication - openrouter/quasar-alpha
- Prompt: `multiply 8 and 3. Return only the number, no explanation.`
- Expected: `24`
- Actual: `24`
- Duration: 780ms (0.78s)
- Timestamp: 4/4/2025, 2:37:05 PM
division - openai/gpt-3.5-turbo
- Prompt: `divide 15 by 3. Return only the number, no explanation.`
- Expected: `5`
- Actual: `5`
- Duration: 784ms (0.78s)
- Timestamp: 4/4/2025, 2:37:05 PM
division - deepseek/deepseek-r1-distill-qwen-14b:free
- Prompt: `divide 15 by 3. Return only the number, no explanation.`
- Expected: `5`
- Actual: `5`
- Duration: 4467ms (4.47s)
- Timestamp: 4/4/2025, 2:37:10 PM
division - openai/gpt-4o-mini
- Prompt: `divide 15 by 3. Return only the number, no explanation.`
- Expected: `5`
- Actual: `5`
- Duration: 975ms (0.97s)
- Timestamp: 4/4/2025, 2:37:11 PM
division - openrouter/quasar-alpha
- Prompt: `divide 15 by 3. Return only the number, no explanation.`
- Expected: `5`
- Actual: `5`
- Duration: 731ms (0.73s)
- Timestamp: 4/4/2025, 2:37:11 PM