
teacher-data-distillation

by Ryan

Feb 7, 2026
Generate high-quality training data with powerful LLMs (teacher models) to train smaller models (student models). This is data-centric knowledge distillation: the teacher generates labeled data, not logits. Use this skill when the user needs to: (1) generate NER/entity annotation data with an LLM, (2) create embedding training pairs (query-positive-negative) with an LLM, (3) generate text classification datasets, (4) create instruction-tuning data for fine-tuning, (5) synthesize domain-specific training corpora, (6) augment existing datasets with an LLM, or (7) apply quality control and filtering to generated data. Supports OpenAI GPT-4, Claude, and local LLMs as teacher models.
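As a rough illustration of use cases (1) and (7), the sketch below asks a teacher model for entity annotations and then filters the reply before it becomes student training data. It is a minimal sketch, not this skill's actual implementation: `ALLOWED_LABELS`, `build_prompt`, `parse_and_filter`, `distill`, and `fake_teacher` are all hypothetical names, and `fake_teacher` is an offline stand-in for a real call to GPT-4, Claude, or a local LLM.

```python
import json
from typing import Callable

# Hypothetical label set; a real project would define its own schema.
ALLOWED_LABELS = {"PERSON", "ORG", "LOC"}

PROMPT_TEMPLATE = (
    "Extract named entities from the text below. "
    "Reply with a JSON list of objects with keys 'text' and 'label'. "
    "Allowed labels: {labels}.\n\nText: {text}"
)


def build_prompt(text: str) -> str:
    """Build the annotation prompt sent to the teacher model."""
    return PROMPT_TEMPLATE.format(labels=", ".join(sorted(ALLOWED_LABELS)), text=text)


def parse_and_filter(text: str, raw_reply: str) -> list[dict]:
    """Quality control: keep only well-formed entities that occur in the text."""
    try:
        items = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # teacher did not return valid JSON
    if not isinstance(items, list):
        return []
    kept = []
    for item in items:
        if not isinstance(item, dict):
            continue
        span, label = item.get("text"), item.get("label")
        if not isinstance(span, str) or not span:
            continue
        if not isinstance(label, str) or label not in ALLOWED_LABELS:
            continue
        start = text.find(span)
        if start == -1:
            continue  # span is not in the source text, so treat it as hallucinated
        kept.append({"start": start, "end": start + len(span), "text": span, "label": label})
    return kept


def distill(texts: list[str], teacher: Callable[[str], str]) -> list[dict]:
    """Label each text with the teacher and keep records that pass the filter."""
    dataset = []
    for text in texts:
        entities = parse_and_filter(text, teacher(build_prompt(text)))
        if entities:
            dataset.append({"text": text, "entities": entities})
    return dataset


def fake_teacher(prompt: str) -> str:
    """Offline stand-in for a real LLM call (assumption, for illustration only)."""
    if "Ada Lovelace" in prompt:
        return json.dumps([
            {"text": "Ada Lovelace", "label": "PERSON"},
            {"text": "Paris", "label": "LOC"},  # not in the text, should be filtered out
        ])
    return "not json"


if __name__ == "__main__":
    samples = ["Ada Lovelace worked in London.", "Nothing to see here."]
    for record in distill(samples, fake_teacher):
        print(json.dumps(record))
```

Passing the teacher in as a plain callable keeps the generation and filtering logic independent of any particular provider, so the same pipeline could wrap a hosted API or a local model. The filter here is deliberately simple (valid JSON, known label, span present in the text); stricter checks such as deduplication or agreement between multiple teacher samples would be layered on in the same place.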