# ncu-cli

> Reference for NVIDIA Nsight Compute (ncu) CLI flags and common usage patterns. Use when running NCU commands to profile, export, or query CUDA kernels. Consult this before constructing any ncu command to avoid flag mistakes.

- Author: Sriram Bhargav Karnati
- Repository: deepsimulation/agent-docs-public
- Version: 20260209161850
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-10
- Source: https://github.com/deepsimulation/agent-docs-public
- Web: https://mule.run/skillshub/@@deepsimulation/agent-docs-public~ncu-cli:20260209161850

---

---
name: ncu-cli
description: Reference for NVIDIA Nsight Compute (ncu) CLI flags and common usage patterns. Use when running NCU commands to profile, export, or query CUDA kernels. Consult this before constructing any ncu command to avoid flag mistakes.
---

# NCU CLI Reference

**This skill is the authoritative reference for `ncu` CLI flags.** Always consult it before constructing NCU commands. If you encounter an NCU CLI error (unrecognized option, wrong flag, etc.), fix the command AND update this skill via the `/edit-agent-docs` process so the mistake is never repeated.

## Self-improvement protocol

If an `ncu` command fails due to a wrong flag or syntax:

1. Fix the command using `ncu --help` or the error message
2. Update this skill file with the correction
3. Commit and push via `/edit-agent-docs`

## Related skills

- [fetch-ncu-metrics](../fetch-ncu-metrics/SKILL.md): Common metrics of interest, metric name reference, and how to record results in progress.md

## Common commands

### Profile a kernel

```bash
ncu --set full --import-source yes \
  --kernel-name regex:"<pattern>" \
  --launch-skip <N> --launch-count <N> \
  --force-overwrite --export <name>.ncu-rep \
  <application> [args...]
```

Key flags:

- `--set full`: collect all metric sections (most comprehensive)
- `--set basic`: collect basic metrics only (faster, default if `--set` omitted)
- `--import-source yes`: embed CUDA source in profile for source-level metrics
- `--kernel-name regex:"pattern"` or `-k regex:"pattern"`: filter to specific kernel(s)
- `--launch-skip N` or `-s N`: skip first N kernel launches (warmup)
- `--launch-count N` or `-c N`: profile only N kernel launches
- `--force-overwrite` or `-f`: overwrite existing output file
- `--export <file>` or `-o <file>`: save profile to .ncu-rep file

### Export profile to text

```bash
# Detailed metrics (recommended for analysis)
ncu --import <name>.ncu-rep --page details > <name>.txt

# Raw metrics (all available, very verbose)
ncu --import <name>.ncu-rep --page raw > <name>_raw.txt
```

Key flags:

- `--import <file>` or `-i <file>`: read from existing .ncu-rep file
- `--page details`: human-readable metrics with analysis (default)
- `--page raw`: all raw metric values

### Query specific metrics from a profile

```bash
ncu --import <name>.ncu-rep \
  --metrics <metric1>,<metric2>,... \
  --page raw
```

### List available kernels in a profile

```bash
ncu --import <name>.ncu-rep --print-summary per-kernel --page raw
```

### List kernel names during profiling

Use `--print-kernel-base demangled` to see readable kernel names:

```bash
ncu --print-kernel-base demangled \
  --launch-skip <N> --launch-count <N> \
  <application> [args...]
```

## Flags that DON'T exist (common mistakes)

These are flags that look plausible but **do not exist** in NCU:

| Wrong flag | What to use instead |
|---|---|
| `--list-kernels` | `--print-summary per-kernel` (for imported profiles) or just run with `--print-kernel-base demangled` |

## Flag reference (verified)

### Profiling control

| Flag | Short | Description |
|---|---|---|
| `--set <name>` | | Section set: `full`, `basic`, `detailed`, etc. |
| `--section <name>` | | Collect specific section |
| `--metrics <list>` | | Comma-separated metric names |
| `--kernel-name <name>` | `-k` | Filter by kernel name (supports `regex:`) |
| `--kernel-name-base <base>` | | `function` (default), `demangled`, `mangled` |
| `--launch-skip <N>` | `-s` | Skip first N launches |
| `--launch-count <N>` | `-c` | Profile only N launches |
| `--import-source yes` | | Embed CUDA source in profile |

### Output

| Flag | Short | Description |
|---|---|---|
| `--export <file>` | `-o` | Save profile to .ncu-rep |
| `--force-overwrite` | `-f` | Overwrite existing output |
| `--import <file>` | `-i` | Read existing .ncu-rep |
| `--page <name>` | | Output page: `details`, `raw`, `source` |
| `--print-kernel-base <base>` | | Kernel name format: `demangled`, `function`, `mangled` |
| `--print-summary <mode>` | | Summary mode: `none`, `per-kernel`, `per-gpu` |
| `--print-details <level>` | | Detail level: `header`, `all` |
| `--csv` | | Comma-separated output |

### Filtering

| Flag | Short | Description |
|---|---|---|
| `--kernel-id <id>` | | Filter by kernel ID |
| `--target-processes <mode>` | | `all` or `application-only` |

### Misc

| Flag | Short | Description |
|---|---|---|
| `--help` | `-h` | Print help |
| `--version` | `-v` | Print version |
| `--query-metrics` | | List available metrics for device |
| `--list-sets` | | List available section sets |
| `--list-sections` | | List available sections |
| `--list-chips` | | List supported GPU chips |
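
The "Profile a kernel" pattern above can be sketched as a small dry-run helper that assembles the command and prints it for review instead of running it. This is only an illustrative sketch: the function name `ncu_profile_cmd`, its argument order, and the example values (`matmul`, `matmul_profile.ncu-rep`, `./my_app --size 1024`) are hypothetical and not part of `ncu`. Only the flags themselves come from this reference.

```shell
# Dry-run helper (hypothetical, not part of ncu): builds the
# "Profile a kernel" command from this reference and prints it.
# Arguments: <pattern> <launch-skip> <launch-count> <output-file> <application> [args...]
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
ncu_profile_cmd() (
  pattern=$1
  skip=$2
  count=$3
  output=$4
  shift 4

  # Rebuild the positional parameters as the full ncu command line,
  # keeping the application and its arguments ("$@") at the end.
  set -- ncu --set full --import-source yes \
    --kernel-name "regex:${pattern}" \
    --launch-skip "${skip}" --launch-count "${count}" \
    --force-overwrite --export "${output}" \
    "$@"

  # "$*" joins all words with single spaces; shell quoting is not preserved.
  printf '%s\n' "$*"
)

# Example call with made-up values: skip 5 warmup launches, profile 1.
ncu_profile_cmd 'matmul' 5 1 matmul_profile.ncu-rep ./my_app --size 1024
```

Printing the command first makes it easy to check each flag against the tables above before launching a profiling run. Because the printed line drops the original quoting, re-quote the regex by hand if it contains shell metacharacters.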