Open Issues Need Help
bug, good first issue
Frontier LLMs collapse past ~22 reasoning steps on state-space search, but tool delegation stays near-perfect. Benchmark, theory and code behind the Deterministic Horizon (ICML 2026).
Python
#ai-agents #benchmark #chain-of-thought #evaluation #large-language-models #llm #machine-learning #reasoning #tool-use
[docs] Record a 30-second GIF of the offline demo for the README (about 3 hours ago)
documentation, good first issue