Open Issues Need Help
Refactor: Improve Error Handling and Exception Logging about 2 months ago
enhancement, good first issue
Different post-training techniques for LLMs, including: SFT, DPO and Online RL
Python
#alignment #dpo #fine-tuning #huggingface #huggingface-transformers #llm #pytorch #reinforcement-learning #sft #trl
Issue 1: Configuration Management about 2 months ago
enhancement, good first issue
Different post-training techniques for LLMs, including: SFT, DPO and Online RL
Python
#alignment #dpo #fine-tuning #huggingface #huggingface-transformers #llm #pytorch #reinforcement-learning #sft #trl