# databricks-mlflow-architect

> Design Databricks lakehouse ML pipelines with Unity Catalog medallion layers, Spark ETL, model zoo evaluation, and MLflow tracking/registry. Use for requests to architect, document, or outline notebook-style plans for Databricks ML workflows, experiment tracking, or governance-ready pipelines.

- Author: maihao14
- Repository: maihao14/awesome-claude-code-skills
- Version: 20260202223545
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/maihao14/awesome-claude-code-skills
- Web: https://mule.run/skillshub/@@maihao14/awesome-claude-code-skills~databricks-mlflow-architect:20260202223545

---

---
name: databricks-mlflow-architect
description: Design Databricks lakehouse ML pipelines with Unity Catalog medallion layers, Spark ETL, model zoo evaluation, and MLflow tracking/registry. Use for requests to architect, document, or outline notebook-style plans for Databricks ML workflows, experiment tracking, or governance-ready pipelines.
---

# Databricks MLflow Architect

## Overview

- Build medallion pipeline (Bronze, Silver, Gold) in Unity Catalog.
- Use Spark-first ETL and feature engineering.
- Train and compare multiple models with consistent evaluation.
- Log parameters, metrics, artifacts, and models in MLflow.

## Workflow

1. Gather requirements: source tables, SLA, target variable, compliance, and latency.
2. Design Unity Catalog layout: catalog, schemas, table names, ownership.
3. Define Bronze ingestion: source alignment, incremental vs full loads.
4. Define Silver transformations: cleaning, feature engineering, data quality checks.
5. Define Gold outputs: predictions, evaluation tables, monitoring features.
6. Train model zoo: baseline plus candidate models with consistent splits.
7. Evaluate and select: metrics, cross-validation, business KPIs.
8. Log to MLflow: params, metrics, tags, artifacts, and model registry.
9. Deployment and monitoring: retrain triggers, drift checks, rollback plan.

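
Step 2 of the workflow (the Unity Catalog layout) can be sketched with a small naming helper. Unity Catalog addresses tables through a three-level `catalog.schema.table` namespace; everything else here is an assumption made for illustration: the one-schema-per-layer convention, the `MedallionLayout` class, the `demand_ml` catalog, and the table names are hypothetical and do not come from this skill's scripts.

```python
from dataclasses import dataclass
from typing import List

# Assumption: one schema per medallion layer. This is a common convention,
# not something the skill itself prescribes.
LAYERS = ("bronze", "silver", "gold")


@dataclass(frozen=True)
class MedallionLayout:
    """Hypothetical helper that builds Unity Catalog names for one catalog."""

    catalog: str

    def table(self, layer: str, name: str) -> str:
        """Return the three-level name catalog.schema.table for a layer."""
        if layer not in LAYERS:
            raise ValueError(f"unknown layer {layer!r}; expected one of {LAYERS}")
        return f"{self.catalog}.{layer}.{name}"

    def create_schema_statements(self) -> List[str]:
        """Return one CREATE SCHEMA statement per layer, in pipeline order."""
        return [
            f"CREATE SCHEMA IF NOT EXISTS {self.catalog}.{layer}"
            for layer in LAYERS
        ]


# Illustrative catalog and table names only.
layout = MedallionLayout(catalog="demand_ml")
print(layout.table("bronze", "sales_raw"))
for statement in layout.create_schema_statements():
    print(statement)
```

In a notebook, each generated statement could be passed to `spark.sql(...)`, and the names from `table(...)` reused in every read and write so that the Bronze, Silver, and Gold steps stay consistent with the layout chosen in step 2.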
## Output Format

- Provide a Databricks notebook style outline with markdown headings and code blocks.
- Use PySpark for data steps and MLflow for tracking.
- Include table schemas and key partitioning decisions.

## Safety

- Confirm before overwrite, drop, vacuum, or backfill operations.

## Resources

- `scripts/generate_notebook_outline.py`: Generate a starter notebook outline when the user needs a quick scaffold.
- `references/medallion-checklist.md`: Use for governance, naming, and logging checklists.
- `assets/feature-spec-template.md`: Use as a copyable feature specification template.

## Examples

**User**: "Design a Bronze/Silver/Gold pipeline with Spark ETL and MLflow logging for a demand forecast."

**Output**:

- Notebook outline with Unity Catalog layout, Spark transformations, model zoo training loop, and MLflow logging sections.
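
As a rough illustration of what such an outline might look like, the sketch below assembles one markdown heading cell and one code cell per section. It is not the contents of `scripts/generate_notebook_outline.py`, which is not shown here. It assumes the source-file layout Databricks uses for exported Python notebooks (a `# Databricks notebook source` header, `# COMMAND ----------` between cells, and `# MAGIC %md` for markdown cells). The `build_outline` and `markdown_cell` helpers, the section stubs, and all catalog and table names are hypothetical.

```python
from typing import List, Tuple

HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"


def markdown_cell(text: str) -> str:
    """Render text as a markdown cell: a %md magic line plus prefixed lines."""
    lines = ["# MAGIC %md"] + [f"# MAGIC {line}" for line in text.splitlines()]
    return "\n".join(lines)


def build_outline(sections: List[Tuple[str, str]]) -> str:
    """Emit one markdown heading cell and one code cell per section."""
    cells: List[str] = []
    for heading, code in sections:
        cells.append(markdown_cell(f"## {heading}"))
        cells.append(code.strip())
    body = f"\n\n{CELL_SEPARATOR}\n\n".join(cells)
    return f"{HEADER}\n{body}\n"


# Hypothetical stubs for the demand forecast request; all names are illustrative.
SECTIONS = [
    ("Unity Catalog layout", 'CATALOG = "demand_ml"  # hypothetical catalog name'),
    ("Bronze ingestion", 'bronze_df = spark.read.table(f"{CATALOG}.bronze.sales_raw")'),
    ("Silver transformations", "silver_df = bronze_df.dropDuplicates()"),
    ("Model zoo training", "# TODO: fit baseline and candidate models on one shared split"),
    (
        "MLflow logging",
        "import mlflow\n\n"
        'with mlflow.start_run(run_name="baseline"):\n'
        '    mlflow.log_param("model", "baseline")',
    ),
]

outline = build_outline(SECTIONS)
print(outline)
```

The generated text is only a scaffold: the stubs are strings, so nothing touches Spark or MLflow until the outline is imported into a workspace and the cells are filled in and run.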