Twinkle Eval: An Efficient and Accurate AI Evaluation Tool

Tags: eval, evaluation, llm
2 Open Issues Need Help · Last updated: Aug 6, 2025

Category: AI/ML · AI Model Evaluation

AI Summary: Debug a Twinkle Eval integration issue where using an API key generated from the NVIDIA GPT-OSS-20B example results in a `'NoneType' object is not subscriptable` error. This involves analyzing the differences between the NVIDIA endpoint's response format and the format Twinkle Eval expects, and potentially modifying Twinkle Eval's code to handle the NVIDIA response correctly.

Complexity: 4/5
bug help wanted
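A `'NoneType' object is not subscriptable` error typically means the parser indexed into a field that came back as `null`. A minimal defensive sketch of the fix, assuming an OpenAI-style `choices[0]["message"]["content"]` payload (all field names here are illustrative, not Twinkle Eval's actual code):

```python
# Sketch: extract the assistant text from a chat-completion-style payload,
# raising a descriptive error instead of a bare TypeError when the NVIDIA
# endpoint returns a shape Twinkle Eval does not expect. Hypothetical helper,
# not Twinkle Eval's real API.

def extract_answer(response: dict) -> str:
    """Return the assistant text, or raise ValueError with context."""
    choices = response.get("choices") if isinstance(response, dict) else None
    if not choices:
        raise ValueError(f"response has no 'choices': {response!r}")
    # `or {}` guards against a null "message" field, the likely crash site
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError(f"choice has no 'content': {choices[0]!r}")
    return content

# A well-formed payload parses normally:
ok = {"choices": [{"message": {"role": "assistant", "content": "42"}}]}
print(extract_answer(ok))  # 42

# A payload with a null message now raises a clear ValueError, not TypeError:
bad = {"choices": [{"message": None}]}
try:
    extract_answer(bad)
except ValueError as e:
    print("clear error:", e)
```

Logging the offending payload in the error message would also make it easier to pin down exactly how the NVIDIA response format differs.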

Python

AI Summary: The task is to modify the Twinkle Eval tool to correctly handle the `reasoning` parameter when interacting with the Ollama LLM. Currently, it uses `reasoning_content`, which is incorrect for Ollama. A check should be added to differentiate between Ollama and other LLMs (like vLLM) to use the appropriate parameter name.

Complexity: 4/5
bug help wanted
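The check the issue asks for can be sketched as a small backend-to-field-name mapping, assuming vLLM-style servers return the trace as `reasoning_content` while Ollama uses `reasoning` (the `backend` key and helper name are illustrative, not Twinkle Eval's actual API):

```python
# Sketch: pick the reasoning field name based on which backend produced the
# message. Hypothetical mapping; per the issue, Ollama uses "reasoning" while
# vLLM-style servers use "reasoning_content".
from typing import Optional

REASONING_FIELD = {
    "ollama": "reasoning",
    "vllm": "reasoning_content",
}

def get_reasoning(message: dict, backend: str) -> Optional[str]:
    """Read the model's reasoning trace using the backend's field name,
    falling back to the vLLM-style key for unknown backends."""
    field = REASONING_FIELD.get(backend, "reasoning_content")
    return message.get(field)

# Ollama-style message:
print(get_reasoning({"reasoning": "step 1..."}, "ollama"))        # step 1...
# vLLM-style message:
print(get_reasoning({"reasoning_content": "step 1..."}, "vllm"))  # step 1...
```

A table-driven lookup keeps the per-backend difference in one place, so adding another backend later is a one-line change rather than another `if` branch.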
