LLM quantization (compression) toolkit with hardware acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs, with inference via HF Transformers, vLLM, and SGLang.
Language: Python

Topics: gptq, optimum, peft, quantization, sglang, transformers, vllm
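To make the quantization idea concrete, here is a minimal, self-contained sketch of group-wise symmetric 4-bit weight quantization, the kind of storage format GPTQ-style toolkits produce. This is an illustration only: real GPTQ uses second-order (Hessian-based) error compensation rather than the plain round-to-nearest shown here, and the function names are hypothetical, not part of this project's API.

```python
def quantize_group(weights, bits=4):
    """Quantize one group of float weights to signed ints plus one scale.

    Naive round-to-nearest, for illustration; GPTQ proper compensates
    rounding error column-by-column using Hessian information.
    """
    qmax = 2 ** (bits - 1) - 1                     # 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Clamp to the signed range [-qmax - 1, qmax] after rounding.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    """Recover approximate float weights from ints and the group scale."""
    return [v * scale for v in q]

if __name__ == "__main__":
    w = [0.12, -0.53, 0.91, -0.07]
    q, s = quantize_group(w)
    w_hat = dequantize_group(q, s)
    err = max(abs(a - b) for a, b in zip(w, w_hat))
    print("ints:", q, "scale:", round(s, 4), "max err:", round(err, 4))
```

Each group stores only small integers plus one float scale, which is where the memory savings come from; smaller group sizes trade more scale overhead for lower quantization error.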