An open-source API Gateway & background daemon designed to queue inference surges and scale cloud GPUs down to zero when idle.

7 stars · 1 fork · 7 watchers · Python · Apache License 2.0
api-gateway asyncio devops distributed-systems gpu-orchestration infrastructure kafka redis scale-to-zero vllm
1 open issue needs help · Last updated: Jul 1, 2026

