# vector-index-tuning

> Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.

- Author: msyc
- Repository: xfstudio/skills
- Version: 20260204212605
- Stars: 1
- Forks: 0
- Last Updated: 2026-02-08
- Source: https://github.com/xfstudio/skills
- Web: https://mule.run/skillshub/@@xfstudio/skills~vector-index-tuning:20260204212605

---

---
name: vector-index-tuning
description: Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
---

# Vector Index Tuning

Guide to optimizing vector indexes for production performance.

## Use this skill when

- Tuning HNSW parameters
- Implementing quantization
- Optimizing memory usage
- Reducing search latency
- Balancing recall vs speed
- Scaling to billions of vectors

## Do not use this skill when

- You only need exact search on small datasets (use a flat index)
- You lack workload metrics or ground truth to validate recall
- You need end-to-end retrieval system design beyond index tuning

## Instructions

1. Gather workload targets (latency, recall, QPS), data size, and memory budget.
2. Choose an index type and establish a baseline with default parameters.
3. Benchmark parameter sweeps using real queries and track recall, latency, and memory.
4. Validate changes on a staging dataset before rolling out to production.

Refer to `resources/implementation-playbook.md` for detailed patterns, checklists, and templates.

## Safety

- Avoid reindexing in production without a rollback plan.
- Validate changes under realistic load before applying globally.
- Track recall regressions and revert if quality drops.

## Resources

- `resources/implementation-playbook.md` for detailed patterns, checklists, and templates.
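
The benchmarking step in the instructions (sweep parameters, then track recall, latency, and memory) could be scored with something like the sketch below. It is a minimal illustration in plain Python, not part of the playbook: `recall_at_k`, `SweepResult`, and `pick_config` are hypothetical names, the parameter keys are placeholders, and it assumes you already have exact top-k neighbor IDs (for example from a flat index) to use as ground truth.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def recall_at_k(retrieved: Sequence[Sequence[int]],
                ground_truth: Sequence[Sequence[int]],
                k: int) -> float:
    """Mean fraction of the exact top-k neighbors present in the returned top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not retrieved or len(retrieved) != len(ground_truth):
        raise ValueError("need the same non-empty set of queries in both inputs")
    total = 0.0
    for got, truth in zip(retrieved, ground_truth):
        truth_k = set(truth[:k])
        if not truth_k:
            raise ValueError("every query needs at least one ground-truth neighbor")
        total += len(truth_k.intersection(got[:k])) / len(truth_k)
    return total / len(retrieved)


@dataclass
class SweepResult:
    """One measured point from a parameter sweep (field names are hypothetical)."""
    params: dict            # index settings used for this run, e.g. {"ef_search": 64}
    recall: float           # recall@k against exact ground truth
    p95_latency_ms: float   # measured under a realistic query load
    memory_gb: float        # resident index size


def pick_config(results: Sequence[SweepResult],
                min_recall: float,
                max_memory_gb: float) -> Optional[SweepResult]:
    """Return the lowest-latency run meeting the recall target and memory budget."""
    feasible = [r for r in results
                if r.recall >= min_recall and r.memory_gb <= max_memory_gb]
    # None signals that no swept configuration satisfies the workload targets.
    return min(feasible, key=lambda r: r.p95_latency_ms, default=None)
```

Filtering on recall and memory first, then minimizing latency, treats the workload targets as hard constraints rather than folding them into a single score. If `pick_config` returns `None`, the sweep range or the index type probably needs revisiting before anything goes to staging.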