Quantize transformers to any learned arbitrary 4-bit numeric format

1 open issue needs help · Last updated: Jul 5, 2025

AI Summary: The task is to create an `Nf4Linear` module that mirrors the existing `Any4Linear` modules. This means implementing NormalFloat4 (NF4) quantization inside a linear layer, most likely by adapting the existing `Any4Linear` code and then optimizing it for performance. A rough sketch of such a module is included after the issue metadata below.

Complexity: 4/5
Labels: enhancement, good first issue
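The sketch below illustrates one way an `Nf4Linear` could work: quantize each weight row to 16 NF4 levels with a per-row absmax scale, then dequantize on the fly in `forward`. The class name comes from the issue; everything else (the module structure, per-row scaling, and the rounded NF4 codebook values, which are taken from the QLoRA paper) is an assumption for illustration, not the repository's actual implementation, which would pack two 4-bit codes per byte and use optimized kernels.

```python
# Hypothetical sketch of an Nf4Linear module; names and structure are
# assumptions, not this repository's actual API.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

# NF4 codebook: 16 levels placed at quantiles of a standard normal
# distribution and normalized to [-1, 1] (values rounded, from the QLoRA paper).
NF4_LEVELS = torch.tensor([
    -1.0000, -0.6962, -0.5251, -0.3949,
    -0.2844, -0.1848, -0.0911,  0.0000,
     0.0796,  0.1609,  0.2461,  0.3379,
     0.4407,  0.5626,  0.7230,  1.0000,
])


class Nf4Linear(nn.Module):
    """Linear layer whose weights are stored as 4-bit NF4 codes plus
    per-output-channel absmax scales (unpacked uint8 codes for clarity)."""

    def __init__(self, weight: torch.Tensor, bias: Optional[torch.Tensor] = None):
        super().__init__()
        levels = NF4_LEVELS.to(device=weight.device, dtype=weight.dtype)
        # Per-output-channel absmax scale maps each weight row into [-1, 1].
        scales = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
        normalized = weight / scales
        # Nearest-neighbor assignment to the 16 NF4 levels
        # (materializes an [out, in, 16] tensor; fine for a sketch).
        codes = (normalized.unsqueeze(-1) - levels).abs().argmin(dim=-1)
        self.register_buffer("levels", levels)
        self.register_buffer("codes", codes.to(torch.uint8))
        self.register_buffer("scales", scales)
        self.register_buffer("bias", bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dequantize: look up the NF4 level for each code, rescale per row.
        weight = self.levels.to(x.dtype)[self.codes.long()] * self.scales.to(x.dtype)
        return F.linear(x, weight, self.bias)
```

A drop-in usage pattern would be to replace an existing `nn.Linear`'s weight, e.g. `q = Nf4Linear(layer.weight.data, layer.bias)`, and compare `q(x)` against `layer(x)` to measure quantization error; a production version would also need packed storage and a fused dequantize-matmul kernel.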

Language: Python