tokenbender/infinite

a rubric driven prioritized replay rl algo to maximise continual learning

11 stars · 3 forks · 11 watchers · Python · Apache License 2.0

1 Open Issue Need Help · Last updated: Sep 3, 2025

Open Issues Need Help

[Scaffolding] Stubs and randomized rewards to unblock E2E runs · 10 months ago

AI Summary: This issue proposes adding stub (mock) components and randomized reward pathways so that end-to-end training runs can proceed before the real evaluation adapters, rubric graders, and environments are finished. The goals are a fully runnable training loop, validation of the single-signal scheduler and logging, and a transition from stubs to real components that requires no significant code changes.

Complexity: 3/5

Labels: enhancement, help wanted, priority: P0
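
The stub-and-randomized-reward idea described in the summary can be illustrated with a minimal sketch. This is not code from the repository: the names `Grader`, `RandomStubGrader`, and `run_episode` are hypothetical, and the sketch assumes a grader exposes a single `score` method. The point is only that a loop written against an interface can run with a seeded random stub now and accept a real rubric grader later without changes.

```python
import random
from typing import Protocol


class Grader(Protocol):
    """Hypothetical interface shared by stub and real rubric graders."""

    def score(self, prompt: str, response: str) -> float: ...


class RandomStubGrader:
    """Stub grader: ignores its inputs and returns a seeded random reward in [0, 1)."""

    def __init__(self, seed: int = 0) -> None:
        # A private, seeded generator keeps runs reproducible.
        self._rng = random.Random(seed)

    def score(self, prompt: str, response: str) -> float:
        return self._rng.random()


def run_episode(grader: Grader, prompts: list[str]) -> list[float]:
    """Toy stand-in for a training loop; it depends only on the Grader interface."""
    rewards = []
    for prompt in prompts:
        response = f"stub response to: {prompt}"  # placeholder for a policy output
        rewards.append(grader.score(prompt, response))
    return rewards


rewards = run_episode(RandomStubGrader(seed=0), ["task-a", "task-b", "task-c"])
```

Because `run_episode` only calls `grader.score`, replacing `RandomStubGrader` with a real grader that has the same method would leave the loop untouched, which is the kind of stub-to-real transition the issue asks for.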
