Create large-scale synthetic training data for model distillation and evaluation

444 stars, 32 forks, 444 watchers, Python, Apache License 2.0
agents ai data-science dataset distillation evaluation fine-tuning huggingface huggingface-datasets machine-learning synthetic-data synthetic-dataset-generation
2 open issues need help. Last updated: Sep 14, 2025

Open Issues Need Help

Push to Kaggle Support (about 2 hours ago)
good first issue
