Open Issues Need Help
OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)
AI Summary: This task involves creating and updating Kubernetes manifests for Role-Based Access Control (RBAC), ServiceAccounts, and PersistentVolumeClaims (PVCs) to support the OME project's model metadata extraction and storage. It requires creating YAML files for a new ServiceAccount, RBAC roles and bindings, and example manifests demonstrating PVC usage with BaseModels and InferenceServices. Existing controller RBAC will also need updating.
AI Summary: This task requires modifying the OME InferenceService controller to support Persistent Volume Claims (PVCs) for model storage. This involves updating the volume creation and mounting logic to handle PVCs instead of host paths, adjusting node selector logic to accommodate PVCs, and adding annotations for debugging and observability. The changes must maintain backward compatibility with existing storage types and include comprehensive unit and integration tests.
AI Summary: Create comprehensive documentation for Persistent Volume Claim (PVC) storage support in the OME Kubernetes operator. This includes user guides, API documentation updates, troubleshooting guides, architecture diagrams, and a migration guide. The documentation should cover the various PVC access modes, model metadata extraction, and integration with OME's BaseModel and InferenceService resources. The main README file also needs updating to reflect the new PVC support.
AI Summary: The task involves enhancing the OME (Kubernetes operator for LLMs) to support Persistent Volume Claims (PVCs) as a storage mechanism for Large Language Models (LLMs). This requires modifying the model agent to ignore PVCs, updating the BaseModel controller to handle PVC validation, job creation for metadata extraction, and status updates. The InferenceService controller needs adjustments to mount PVCs directly without node selection. The changes must maintain compatibility with existing functionality and include comprehensive testing.
AI Summary: Develop a Kubernetes agent within the ome-agent framework to extract metadata from Large Language Model (LLM) configuration files (JSON) mounted via Persistent Volume Claims (PVCs). The agent should update the corresponding BaseModel or ClusterBaseModel Kubernetes Custom Resource (CR) with the extracted metadata, handling various error conditions and using existing utility functions and libraries. Comprehensive unit testing is required.
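As a rough illustration of the extraction step described in this summary, the sketch below parses a JSON model configuration and rejects the obvious error cases. It assumes a Hugging Face style `config.json`; the `ModelMetadata` struct, its field selection, and the `ExtractMetadata` function are hypothetical names for illustration, not OME's actual types or API. Updating the BaseModel or ClusterBaseModel CR is left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ModelMetadata is a hypothetical subset of fields that a Hugging Face style
// config.json commonly carries. It is not OME's real metadata type.
type ModelMetadata struct {
	ModelType             string   `json:"model_type"`
	Architectures         []string `json:"architectures"`
	MaxPositionEmbeddings int      `json:"max_position_embeddings"`
}

// ExtractMetadata parses the raw config bytes and returns an error for
// empty input, malformed JSON, or a config without a model_type.
func ExtractMetadata(raw []byte) (ModelMetadata, error) {
	var md ModelMetadata
	if len(raw) == 0 {
		return md, errors.New("config is empty")
	}
	if err := json.Unmarshal(raw, &md); err != nil {
		return md, fmt.Errorf("parse config: %w", err)
	}
	if md.ModelType == "" {
		return md, errors.New("config has no model_type")
	}
	return md, nil
}

func main() {
	// In a real agent these bytes would be read from the PVC mount path.
	raw := []byte(`{"model_type":"llama","architectures":["LlamaForCausalLM"],"max_position_embeddings":4096}`)
	md, err := ExtractMetadata(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %v %d\n", md.ModelType, md.Architectures, md.MaxPositionEmbeddings)
}
```

Keeping the parsing in a pure function over bytes makes the error conditions easy to unit test without a cluster or a mounted volume.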
AI Summary: Modify the OME model agent to ignore Persistent Volume Claim (PVC) storage types, delegating their handling to the BaseModel controller. This involves adding logic to detect PVC storage, skipping related processing steps (downloads, status updates, ConfigMap updates, node labeling), and ensuring robust logging. The goal is to simplify the model agent and improve the separation of concerns between components.
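The skip logic described in this summary could look roughly like the following sketch. It assumes PVC-backed models are identified by a `pvc://` storage URI prefix; the `task` type and the `isPVCStorage` and `process` functions are hypothetical stand-ins, not the model agent's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// pvcScheme is an assumed URI prefix for PVC-backed model storage.
const pvcScheme = "pvc://"

// task is a hypothetical stand-in for a unit of work in the model agent.
type task struct {
	Name       string
	StorageURI string
}

// isPVCStorage reports whether the storage URI points at a PVC.
func isPVCStorage(uri string) bool {
	return strings.HasPrefix(uri, pvcScheme)
}

// process returns "skipped" for PVC storage, leaving it to the BaseModel
// controller, and "downloaded" for every other storage type.
func process(t task) string {
	if isPVCStorage(t.StorageURI) {
		// No download, status update, ConfigMap update, or node labeling.
		fmt.Printf("skipping %s: PVC storage is handled by the BaseModel controller\n", t.Name)
		return "skipped"
	}
	// Placeholder for the existing download path.
	return "downloaded"
}

func main() {
	for _, t := range []task{
		{Name: "model-a", StorageURI: "pvc://models/model-a"},
		{Name: "model-b", StorageURI: "hf://org/model-b"},
	} {
		fmt.Println(t.Name, process(t))
	}
}
```

Placing the check at the top of the processing path keeps every downstream step unaware of PVCs, which matches the separation of concerns the summary aims for.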
AI Summary: Implement comprehensive testing for existing PVC storage URI parsing functionality in the OME Kubernetes operator. This involves adding validation tests for edge cases (empty PVC names, missing subpaths, invalid characters, etc.) and ensuring 100% test coverage. The parsing logic itself is already implemented.
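To make the listed edge cases concrete, here is a minimal sketch of a parser they could be tested against. It assumes a `pvc://<pvc-name>/<sub-path>` URI shape and treats a missing sub-path as an error; both are assumptions, and `ParsePVCURI` is a hypothetical function, not OME's existing parser. The name check is a simplified form of the Kubernetes DNS subdomain rule.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// pvcNameRe is a simplified DNS subdomain check: lowercase alphanumerics,
// '-' and '.', starting and ending with an alphanumeric character.
var pvcNameRe = regexp.MustCompile(`^[a-z0-9]([-.a-z0-9]*[a-z0-9])?$`)

// ParsePVCURI splits an assumed "pvc://<pvc-name>/<sub-path>" URI into its
// PVC name and sub-path, rejecting the edge cases named in the summary.
func ParsePVCURI(uri string) (string, string, error) {
	const scheme = "pvc://"
	if !strings.HasPrefix(uri, scheme) {
		return "", "", errors.New("missing pvc:// scheme")
	}
	name, subPath, found := strings.Cut(strings.TrimPrefix(uri, scheme), "/")
	if name == "" {
		return "", "", errors.New("empty PVC name")
	}
	if len(name) > 253 || !pvcNameRe.MatchString(name) {
		return "", "", fmt.Errorf("invalid PVC name %q", name)
	}
	if !found || subPath == "" {
		return "", "", errors.New("missing sub-path")
	}
	return name, subPath, nil
}

func main() {
	for _, uri := range []string{
		"pvc://models/llama-3",
		"pvc:///llama-3",
		"pvc://models",
		"pvc://Bad_Name/llama-3",
	} {
		name, subPath, err := ParsePVCURI(uri)
		fmt.Printf("%q -> name=%q subPath=%q err=%v\n", uri, name, subPath, err)
	}
}
```

A table-driven test over inputs like those in `main` is one straightforward way to reach full coverage of each error branch.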
AI Summary: This task requires creating comprehensive integration tests for a Kubernetes operator (OME) that manages Large Language Models (LLMs), focusing on Persistent Volume Claim (PVC) storage support. This involves writing unit tests for individual components, integration tests for the complete workflow, end-to-end tests for real cluster validation, and performance/stress tests. The tests cover various scenarios, including PVC creation, model data population, metadata extraction, InferenceService creation, and error handling. A test matrix outlines the coverage targets for each component.
AI Summary: Create additional Helm charts for OME to simplify the deployment of predefined models, serving runtimes, and inference services. This involves packaging the existing Kustomize configurations into reusable Helm charts, ensuring compatibility with the main OME Helm chart, and updating documentation.