# grepai-ollama-setup

> Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.

- Author: Yoan Bernabeu
- Repository: yoanbernabeu/grepai-skills
- Version: 20260201121353
- Stars: 9
- Forks: 1
- Last Updated: 2026-02-06
- Source: https://github.com/yoanbernabeu/grepai-skills
- Web: https://mule.run/skillshub/@@yoanbernabeu/grepai-skills~grepai-ollama-setup:20260201121353

---

---
name: grepai-ollama-setup
description: Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.
---

# Ollama Setup for GrepAI

This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.

## When to Use This Skill

- Setting up GrepAI with local, private embeddings
- Installing Ollama for the first time
- Choosing and downloading embedding models
- Troubleshooting Ollama connection issues

## Why Ollama?

| Benefit | Description |
|---------|-------------|
| 🔒 **Privacy** | Code never leaves your machine |
| 💰 **Free** | No API costs |
| ⚡ **Fast** | Local processing, no network latency |
| 🔌 **Offline** | Works without internet |

## Installation

### macOS (Homebrew)

```bash
# Install Ollama
brew install ollama

# Start the Ollama service
ollama serve
```

### macOS (Direct Download)

1. Download from [ollama.com](https://ollama.com)
2. Open the `.dmg` and drag to Applications
3. Launch Ollama from Applications

### Linux

```bash
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh

# Start the service
ollama serve
```

### Windows

1. Download installer from [ollama.com](https://ollama.com/download/windows)
2. Run the installer
3. Ollama starts automatically as a service

## Downloading Embedding Models

GrepAI requires an embedding model to convert code into vectors.

### Recommended Model: nomic-embed-text

```bash
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
```

**Specifications:**
- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized

### Alternative Models

```bash
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe

# Larger, more accurate
ollama pull bge-m3

# Maximum quality
ollama pull mxbai-embed-large
```

| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| `nomic-embed-text` | 768 | 274 MB | General code search |
| `nomic-embed-text-v2-moe` | 768 | 500 MB | Multilingual codebases |
| `bge-m3` | 1024 | 1.2 GB | Large codebases |
| `mxbai-embed-large` | 1024 | 670 MB | Maximum accuracy |

## Verifying Installation

### Check Ollama is Running

```bash
# Check if Ollama server is responding
curl http://localhost:11434/api/tags

# Expected output: JSON with available models
```

### List Downloaded Models

```bash
ollama list

# Output:
# NAME                     ID           SIZE    MODIFIED
# nomic-embed-text:latest  abc123...    274 MB  2 hours ago
```

### Test Embedding Generation

```bash
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
```

## Configuring GrepAI for Ollama

After installing Ollama, configure GrepAI to use it:

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
```

This is the **default configuration** when you run `grepai init`, so no changes are needed if using `nomic-embed-text`.

## Running Ollama

### Foreground (Development)

```bash
# Run in current terminal (see logs)
ollama serve
```

### Background (macOS/Linux)

```bash
# Using nohup
nohup ollama serve &

# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
```

### Check Status

```bash
# Check if running
pgrep -f ollama

# Or test the API
curl -s http://localhost:11434/api/tags | head -1
```

## Resource Considerations

### Memory Usage

Embedding models load into RAM:
- `nomic-embed-text`: ~500 MB RAM
- `bge-m3`: ~1.5 GB RAM
- `mxbai-embed-large`: ~1 GB RAM

### CPU vs GPU

Ollama uses CPU by default. For faster embeddings:
- **macOS:** Uses Metal (Apple Silicon) automatically
- **Linux/Windows:** Install CUDA for NVIDIA GPU support

## Common Issues

❌ **Problem:** `connection refused` to localhost:11434
✅ **Solution:** Start Ollama:
```bash
ollama serve
```

❌ **Problem:** Model not found
✅ **Solution:** Pull the model first:
```bash
ollama pull nomic-embed-text
```

❌ **Problem:** Slow embedding generation
✅ **Solution:**
- Use a smaller model
- Ensure Ollama is using GPU (check `ollama ps`)
- Close other memory-intensive applications

❌ **Problem:** Out of memory
✅ **Solution:** Use a smaller model or increase system RAM

## Best Practices

1. **Start Ollama before GrepAI:** Ensure `ollama serve` is running
2. **Use recommended model:** `nomic-embed-text` offers best balance
3. **Keep Ollama running:** Leave it as a background service
4. **Update periodically:** `ollama pull nomic-embed-text` for updates

## Output Format

After successful setup:

```
✅ Ollama Setup Complete

   Ollama Version: 0.1.x
   Endpoint: http://localhost:11434
   Model: nomic-embed-text (768 dimensions)
   Status: Running

   GrepAI is ready to use with local embeddings.
   Your code will never leave your machine.
```