7 Open Issues Need Help
Last updated: Dec 17, 2025

AI Summary: This issue proposes adding the ability for users to edit and delete their own comments. Editing will be restricted to within one hour of the comment's creation. Deleting a comment will perform a soft delete, replacing the comment's content with a "Deleted" message and timestamp, rather than removing the comment object entirely.

Complexity: 3/5
enhancement good first issue
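
A minimal sketch of the rules described in this summary, for anyone picking up the issue. The `Comment` class, its field names, and the exact wording of the deletion marker are hypothetical and not taken from the project's code. Only the one-hour edit window and the soft-delete behaviour come from the summary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# From the issue summary: editing is allowed within one hour of creation.
EDIT_WINDOW = timedelta(hours=1)


@dataclass
class Comment:
    """Hypothetical comment model; the real schema may differ."""
    author_id: str
    content: str
    created_at: datetime
    deleted_at: Optional[datetime] = None


def can_edit(comment: Comment, user_id: str, now: datetime) -> bool:
    """Only the author may edit, only inside the window, never after deletion."""
    if comment.deleted_at is not None or comment.author_id != user_id:
        return False
    return now - comment.created_at <= EDIT_WINDOW


def soft_delete(comment: Comment, user_id: str, now: datetime) -> None:
    """Keep the comment object but replace its content with a marker and timestamp."""
    if comment.author_id != user_id:
        raise PermissionError("only the author can delete this comment")
    comment.deleted_at = now
    # Assumed marker format; the issue only asks for a "Deleted" message and timestamp.
    comment.content = f"Deleted at {now.isoformat()}"
```

Passing `now` in explicitly, rather than reading the clock inside the functions, keeps the one-hour boundary easy to test.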

AI Summary: This issue proposes simplifying the save options on the 'prompt-create' and 'benchmark-run' pages. Currently, users can choose between 'Revealed,' 'Hidden,' or 'Download only' save states, which is deemed too complex for new users. The solution is to only display these advanced options if the user has enabled 'Extras' in their settings; otherwise, a single 'save results' button will be shown.

Complexity: 2/5
enhancement good first issue
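
The gating described in this summary could look roughly like the sketch below. The function name and the `extras_enabled` flag are hypothetical, and the label strings are copied from the summary rather than from the project's UI code. Which save state the single button maps to is not stated in the summary, so the sketch leaves that open.

```python
# Save states named in the issue summary.
ADVANCED_SAVE_OPTIONS = ["Revealed", "Hidden", "Download only"]
# Single simplified action shown to users without 'Extras'.
SIMPLE_SAVE_OPTIONS = ["Save results"]


def visible_save_options(extras_enabled: bool) -> list[str]:
    """Return the save actions to render for the current user."""
    if extras_enabled:
        return list(ADVANCED_SAVE_OPTIONS)
    return list(SIMPLE_SAVE_OPTIONS)
```

Since the same rule applies to both the 'prompt-create' and 'benchmark-run' pages, one shared helper avoids the two pages drifting apart.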

AI Summary: This GitHub issue requests setting a predefined list of default test models on the 'prompt-create' and 'benchmark-run' pages. These defaults should only be applied if the benchmark currently has no models configured as default. The specified models include GPT 5.2, Gemini 3 pro preview, and several others.

Complexity: 3/5
enhancement good first issue
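
One way to express the "only if nothing is configured" rule is sketched below. The function name and the shape of the input are assumptions. The fallback list holds only the two models named in the summary and is deliberately incomplete, because the summary says there are several others.

```python
from typing import Optional, Sequence

# Only the models named in the summary; the full list lives in the GitHub issue.
FALLBACK_DEFAULT_MODELS = ["GPT 5.2", "Gemini 3 pro preview"]


def resolve_default_models(benchmark_defaults: Optional[Sequence[str]]) -> list[str]:
    """Prefer the benchmark's own defaults; fall back only when there are none."""
    if benchmark_defaults:
        return list(benchmark_defaults)
    return list(FALLBACK_DEFAULT_MODELS)
```

Treating both `None` and an empty list as "no defaults configured" is a design choice in this sketch, not something the summary specifies.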

AI Summary: This issue proposes automatically applying the author's positive feedback to each prompt as soon as it is created, a step authors currently perform by hand. The feedback lets authors easily filter their own prompts from a general list. The solution requires modifying the prompt creation logic to apply this feedback automatically and developing a script to apply it retroactively to all existing prompts.

Complexity: 3/5
enhancement good first issue
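
The two parts of this issue, applying feedback at creation time and backfilling existing prompts, can share one idempotent helper. The sketch below uses an in-memory `Prompt` class with hypothetical field names. The real project presumably stores feedback in a database, so treat this only as an illustration of the logic.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class Prompt:
    """Hypothetical prompt model; positive feedback is tracked as a set of user ids."""
    id: str
    author_id: str
    positive_feedback_from: set = field(default_factory=set)


def add_author_feedback(prompt: Prompt) -> bool:
    """Add the author's positive feedback; return True only if something changed."""
    if prompt.author_id in prompt.positive_feedback_from:
        return False
    prompt.positive_feedback_from.add(prompt.author_id)
    return True


def create_prompt(prompt_id: str, author_id: str) -> Prompt:
    """Creation path: new prompts get the author's feedback immediately."""
    prompt = Prompt(prompt_id, author_id)
    add_author_feedback(prompt)
    return prompt


def backfill(prompts: Iterable[Prompt]) -> int:
    """Retroactive path: returns how many existing prompts were updated."""
    return sum(add_author_feedback(p) for p in prompts)
```

Because the helper skips prompts that already carry the author's feedback, the backfill script can be re-run safely if it is interrupted.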

AI Summary: This issue aims to improve the analysis of AI model responses by displaying the system prompt that was used for each request. Without this context, it's difficult to fully evaluate a model's output. The proposed solution is to integrate the system prompt into the existing Response component on both prompt-view and review pages.

Complexity: 3/5
enhancement good first issue
enhancement good first issue
bug good first issue