# cvs-train

> Train consistency view synthesis model for multi-view novel view generation

- Author: Caleb Gross
- Repository: CalebisGross/fresnel
- Version: 20260128234825
- Stars: 2
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/CalebisGross/fresnel
- Web: https://mule.run/skillshub/@@CalebisGross/fresnel~cvs-train:20260128234825

---

---
name: cvs-train
description: Train consistency view synthesis model for multi-view novel view generation
disable-model-invocation: true
allowed-tools: Bash(source .venv/*, HSA_OVERRIDE_GFX_VERSION=*, python scripts/training/train_cvs.py *)
---

Train the CVS (Consistency View Synthesis) model to improve consistency across generated novel views.

## Warning

CVS bootstrap data generated by a poor decoder is garbage in, garbage out.

**Experiment 001 failure**: Training CVS on synthetic data from an undertrained decoder produced unusable results. Only train CVS with:

- Proven decoder checkpoints (SSIM > 0.85)
- Real multi-view data (if available)
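
The checkpoint requirement above can be turned into a small pre-flight check before launching training. The sketch below is illustrative only: `decoder_is_proven` and its input are hypothetical names, not part of the fresnel repository, and it assumes you have already measured per-image SSIM for the decoder on held-out views.

```python
from statistics import mean

SSIM_THRESHOLD = 0.85  # threshold quoted in the warning above


def decoder_is_proven(ssim_scores, threshold=SSIM_THRESHOLD):
    """Return True if the mean held-out SSIM exceeds the threshold.

    Hypothetical helper: `ssim_scores` is a list of per-image SSIM values
    that you measured yourself. The project may compute or report SSIM
    differently.
    """
    if not ssim_scores:
        # No measurements means there is no evidence the decoder is ready.
        return False
    return mean(ssim_scores) > threshold
```

A launch script could call this first and skip CVS training when it returns `False`, rather than discovering the problem after a full run.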

## Key Options

| Option | Purpose |
|--------|---------|
| `--use_gaussian_targets` | Use synthetic Gaussian renders as targets |
| `--quality_weighting` | Weight loss by render quality |
| `--progressive_consistency` | Curriculum learning for view consistency |
| `--validate_steps 1,2,4` | Multi-step validation during training |
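
To make "weight loss by render quality" concrete, here is a minimal sketch of one way a quality-weighted loss could work. It is an assumption for illustration, not the implementation in `train_cvs.py`: the function name, the use of a normalized weighted mean, and the convention that quality scores lie in [0, 1] are all hypothetical.

```python
def quality_weighted_loss(per_sample_losses, quality_scores, eps=1e-8):
    """Weighted mean of per-sample losses, weighted by target quality.

    Assumption: each quality score is in [0, 1], where higher means the
    target render is more trustworthy. Low-quality targets then contribute
    less to the final loss.
    """
    if len(per_sample_losses) != len(quality_scores):
        raise ValueError("losses and quality scores must have the same length")
    weighted_sum = sum(loss * q for loss, q in zip(per_sample_losses, quality_scores))
    total_weight = sum(quality_scores)
    # eps guards against division by zero when every score is 0.
    return weighted_sum / (total_weight + eps)
```

With equal scores this reduces to the ordinary mean; a score of 0 removes that sample from the loss entirely.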

## Basic Usage

```bash
source .venv/bin/activate && \
HSA_OVERRIDE_GFX_VERSION=11.0.0 python scripts/training/train_cvs.py \
    --data_dir images/training_diverse \
    --output_dir checkpoints/cvs \
    --epochs 50 \
    --batch_size 4
```
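
If you launch runs from Python rather than a shell, the same command can be assembled as an argument list. This is a sketch under stated assumptions: `build_cvs_command` is a hypothetical helper, and it uses only the flags and the environment variable that appear in the command above.

```python
import shlex

# Environment override copied from the shell command above.
CVS_ENV = {"HSA_OVERRIDE_GFX_VERSION": "11.0.0"}


def build_cvs_command(data_dir, output_dir, epochs=50, batch_size=4, extra_flags=()):
    """Build the argv list for train_cvs.py (hypothetical helper)."""
    cmd = [
        "python", "scripts/training/train_cvs.py",
        "--data_dir", data_dir,
        "--output_dir", output_dir,
        "--epochs", str(epochs),
        "--batch_size", str(batch_size),
    ]
    cmd.extend(extra_flags)  # e.g. ["--quality_weighting"]
    return cmd


# Print the command for inspection instead of running it.
print(shlex.join(build_cvs_command("images/training_diverse", "checkpoints/cvs")))
```

To actually run it, pass the list to `subprocess.run(cmd, env={**os.environ, **CVS_ENV}, check=True)` from inside the activated virtual environment.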

## With Quality Weighting and Progressive Consistency

```bash
source .venv/bin/activate && \
HSA_OVERRIDE_GFX_VERSION=11.0.0 python scripts/training/train_cvs.py \
    --data_dir images/training_diverse \
    --output_dir checkpoints/cvs_quality \
    --epochs 50 \
    --batch_size 4 \
    --quality_weighting \
    --progressive_consistency
```
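
The `--progressive_consistency` flag is described as curriculum learning for view consistency. One common way to build such a curriculum is to ramp up the weight of the consistency term over the early epochs. The sketch below shows that idea only. The function name, the linear ramp, and the `warmup_fraction` parameter are assumptions and may not match what `train_cvs.py` does.

```python
def consistency_weight(epoch, total_epochs, warmup_fraction=0.5, max_weight=1.0):
    """Linearly ramp the consistency-loss weight from 0 to max_weight.

    Hypothetical schedule: the weight grows during the first
    `warmup_fraction` of training, then stays at `max_weight`.
    """
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    warmup_epochs = max(1, int(total_epochs * warmup_fraction))
    progress = min(1.0, epoch / warmup_epochs)
    return max_weight * progress
```

Under these assumptions, a 50-epoch run would start with the consistency term switched off and reach full weight at epoch 25, so the model first learns easier single-view behavior before being held to strict cross-view agreement.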

## When to Use CVS

- After you have a working Gaussian decoder
- When you need better novel view consistency
- For multi-view synthesis tasks

## When NOT to Use CVS

- As a first training step (train the decoder first)
- With synthetic data from a poor decoder
- If you don't need multi-view output