vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

3 Open Issues Need Help (last updated: Sep 14, 2025)

Labels: feature request, good first issue


Language: Python

AI Summary: The task is to enhance the vLLM Production Stack Helm chart to support integration with external Prometheus instances for monitoring. This involves adding configuration options to conditionally create ServiceMonitors, allowing users with existing Prometheus setups to avoid deploying a redundant monitoring stack. The implementation requires creating a new template file and modifying the existing Helm chart to conditionally include this template based on user configuration.

Complexity: 3/5
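The summary above can be sketched as a conditional Helm template. This is a minimal illustration, not the actual implementation from the issue: the file path `templates/servicemonitor.yaml`, the values keys `serviceMonitor.enabled` and `serviceMonitor.prometheusRelease`, and the `metrics` port name are all hypothetical. With `serviceMonitor.enabled` defaulting to `false` in `values.yaml`, users with an existing Prometheus Operator install can opt in without deploying a second monitoring stack:

```yaml
# templates/servicemonitor.yaml (hypothetical path and values keys)
{{- if .Values.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-vllm
  labels:
    # Label an external Prometheus instance is configured to select
    release: {{ .Values.serviceMonitor.prometheusRelease }}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}
  endpoints:
    - port: metrics
      interval: 15s
{{- end }}
```

Because Helm renders every file under `templates/`, wrapping the whole manifest in `{{- if }} … {{- end }}` is enough to make its creation conditional on user configuration.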
Labels: feature request, help wanted


Language: Python

AI Summary: Refactor the CI/CD workflow for the vLLM Production Stack project to improve efficiency and clarity. This involves consolidating redundant jobs, eliminating duplicated steps, using a fake OpenAI server for local backend testing, and reorganizing project files for better modularity. The goal is to reduce CI runtime, improve developer understanding, and facilitate future expansion.

Complexity: 4/5
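The "fake OpenAI server" idea from the summary can be sketched with only the standard library: a stub HTTP server that answers every `/v1/chat/completions` POST with one canned response, so CI jobs can exercise the router and frontend without a GPU backend. This is an illustrative sketch, not the project's actual test fixture; all names here are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeOpenAIHandler(BaseHTTPRequestHandler):
    """Stand-in for an OpenAI-compatible backend: every POST gets
    one canned chat-completion response."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume (and ignore) the request body
        body = json.dumps({
            "id": "chatcmpl-fake",
            "object": "chat.completion",
            "choices": [{
                "index": 0,
                "message": {"role": "assistant", "content": "ok"},
                "finish_reason": "stop",
            }],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep CI logs quiet
        pass


def serve_in_background():
    """Start the fake server on an ephemeral port; return (server, port)."""
    server = HTTPServer(("127.0.0.1", 0), FakeOpenAIHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


# A CI test would point the component under test at this endpoint.
server, port = serve_in_background()
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/v1/chat/completions",
    data=json.dumps({"model": "fake", "messages": []}).encode(),
    headers={"Content-Type": "application/json"},
)
reply = json.loads(urllib.request.urlopen(req).read())
server.shutdown()
```

Running the stub on an ephemeral port (port 0) lets parallel CI jobs start their own instances without port collisions.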
Labels: feature request, good first issue, help wanted


Language: Python