# pdf-official

> This skill provides efficient methods for PDF manipulation. It prioritizes performance and correct tool selection.

- Author: Gonzalo Blasco
- Repository: gonzoblasco/antigravity-developer-stack
- Version: 20260130105433
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/gonzoblasco/antigravity-developer-stack
- Web: https://mule.run/skillshub/@@gonzoblasco/antigravity-developer-stack~pdf-official:20260130105433

---

---
name: pdf
description: Use when you need to extract text/tables from PDFs, merge/split documents, fill forms, or generate new PDFs. Keywords: pdf, pypdf, pdfplumber, extract text, ocr, merge pdf.
---

# PDF Processing Expert

## Overview

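This skill covers extracting text and tables, merging, splitting, and rotating pages, filling forms, running OCR on scanned documents, and generating new PDFs. For each task it recommends a tool, preferring the faster option when several would work.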

> [!TIP]
> **Performance First**: For simple text extraction or page operations, CLI tools (`pdftotext`, `qpdf`) are 10-50x faster than Python libraries. See [Performance Guide](references/performance.md).
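
As a rough illustration of the CLI route, the sketch below builds `pdftotext` and `qpdf` command lines and runs them through `subprocess`. The helper names (`pdftotext_command`, `qpdf_merge_command`, `run_tool`) are hypothetical and not part of this skill; the sketch assumes Poppler and qpdf are installed and on `PATH`.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, first_page=None, last_page=None, layout=True):
    """Build the argument list for Poppler's pdftotext, writing text to stdout."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")  # keep the physical layout of the page
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to convert (1-based)
    if last_page is not None:
        cmd += ["-l", str(last_page)]  # last page to convert (1-based)
    cmd += [pdf_path, "-"]  # "-" sends the text to stdout instead of a file
    return cmd


def qpdf_merge_command(inputs, output_path):
    """Build the argument list for merging whole files with qpdf."""
    return ["qpdf", "--empty", "--pages", *inputs, "--", output_path]


def run_tool(cmd):
    """Run a CLI command and return its stdout, failing early if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run_tool(pdftotext_command("document.pdf", first_page=1, last_page=3))` would return the text of the first three pages, and `run_tool(qpdf_merge_command(["doc1.pdf", "doc2.pdf"], "merged.pdf"))` would write the merged file.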

## Quick Start

### 1. Read Text (Best for reliability)

```python
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    print(pdf.pages[0].extract_text())
```
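
The same library can also pull tables. The following is a minimal sketch, assuming `pdfplumber` is installed; `table_to_csv` and `extract_tables_as_csv` are illustrative helper names, not functions provided by pdfplumber or this skill.

```python
import csv
import io


def table_to_csv(rows):
    """Serialize one extracted table (a list of rows of cells) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Cells may come back as None when empty; write them as empty strings.
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def extract_tables_as_csv(pdf_path):
    """Return one CSV string per table found in the PDF, in page order."""
    import pdfplumber  # imported lazily so table_to_csv works without it

    csv_tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                csv_tables.append(table_to_csv(table))
    return csv_tables
```

Keeping the CSV conversion separate from the PDF access makes it easy to check the output format on plain lists before pointing it at a real document.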

### 2. Merge Documents (Best for speed)

```python
from pypdf import PdfWriter

writer = PdfWriter()
writer.append("doc1.pdf")
writer.append("doc2.pdf")
writer.write("merged.pdf")
```
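
Splitting is the reverse operation. Below is a small sketch, assuming `pypdf` is installed, that cuts a document into fixed-size chunks; `chunk_ranges`, `split_pdf`, and the `part_N.pdf` naming scheme are hypothetical choices made for illustration.

```python
def chunk_ranges(total_pages, chunk_size):
    """Return (start, stop) page-index pairs covering total_pages in chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_pdf(input_path, chunk_size, output_prefix="part"):
    """Write consecutive chunk_size-page slices of input_path to separate files."""
    from pypdf import PdfReader, PdfWriter  # lazy import; chunk_ranges needs no pypdf

    reader = PdfReader(input_path)
    ranges = chunk_ranges(len(reader.pages), chunk_size)
    output_paths = []
    for index, (start, stop) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for page_index in range(start, stop):  # stop is exclusive
            writer.add_page(reader.pages[page_index])
        output_path = f"{output_prefix}_{index}.pdf"
        writer.write(output_path)
        output_paths.append(output_path)
    return output_paths
```

Calling `split_pdf("merged.pdf", 10)` would write `part_1.pdf`, `part_2.pdf`, and so on, each holding at most ten pages.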

## Common Tasks & Tool Selection

| Goal                    | Recommended Tool                           | Reference                                                                           |
| ----------------------- | ------------------------------------------ | ----------------------------------------------------------------------------------- |
| **Extract Text/Tables** | `pdfplumber` (Python) or `pdftotext` (CLI) | [Library Guide](references/library-guide.md#pdfplumber---text-and-table-extraction) |
| **Merge/Split/Rotate**  | `pypdf` (Python) or `qpdf` (CLI)           | [Library Guide](references/library-guide.md#merge-pdfs)                             |
| **Generate PDFs**       | `reportlab`                                | [Library Guide](references/library-guide.md#reportlab---create-pdfs)                |
| **Fill Forms**          | `pypdf` (Python) or `pdf-lib` (JavaScript) | [Forms Guide](forms.md)                                                             |
| **OCR Scanned Docs**    | `pytesseract` + `pdf2image`                | [Library Guide](references/library-guide.md#extract-text-from-scanned-pdfs)         |
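
The extraction and OCR rows can be combined: read the text layer first and fall back to OCR only for pages that have none. This is one possible sketch, assuming `pdfplumber`, `pdf2image` (with Poppler), and `pytesseract` (with the Tesseract binary) are installed. The `needs_ocr` heuristic and its 20-character threshold are assumptions chosen for illustration, not values from the guides.

```python
def needs_ocr(text, min_chars=20):
    """Heuristic: treat a page as scanned when its text layer is nearly empty."""
    return text is None or len(text.strip()) < min_chars


def extract_text_with_ocr_fallback(pdf_path, dpi=300):
    """Return one string per page, using OCR only where the text layer is missing."""
    # Lazy imports so needs_ocr can be used without the OCR stack installed.
    import pdfplumber
    import pytesseract
    from pdf2image import convert_from_path

    page_texts = []
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            text = page.extract_text()
            if needs_ocr(text):
                # Rasterize just this page (pdf2image page numbers are 1-based).
                images = convert_from_path(
                    pdf_path, dpi=dpi, first_page=number, last_page=number
                )
                text = pytesseract.image_to_string(images[0])
            page_texts.append(text)
    return page_texts
```

Running OCR per page rather than on the whole file keeps the slow path limited to pages that actually need it, which matters for mixed documents.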

## Documentation & References

- [**Library Guide**](references/library-guide.md): Detailed code snippets for pypdf, pdfplumber, reportlab.
- [**Performance Guide**](references/performance.md): Optimization tips for large files and low-memory environments.
- [**Forms Guide**](forms.md): Special instructions for handling PDF forms.