# iterabledata-development > Core development workflows, patterns, and conventions for IterableData. Use when implementing features, fixing bugs, or working with the codebase structure. - Author: Ivan Begtin - Repository: apicrafter/pyiterable - Version: 20260128212550 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-07 - Source: https://github.com/apicrafter/pyiterable - Web: https://mule.run/skillshub/@@apicrafter/pyiterable~iterabledata-development:20260128212550 --- --- name: iterabledata-development description: Core development workflows, patterns, and conventions for IterableData. Use when implementing features, fixing bugs, or working with the codebase structure. --- # IterableData Development ## Quick Setup ```bash pip install -e ".[dev]" pytest --verbose ruff check iterable tests ruff format iterable tests ``` ## Code Style - Python 3.10+ with type hints where appropriate - Max line length: 120 characters - Use `ruff` for linting and formatting - Double quotes for strings consistently - Always use context managers for file operations - Import order: standard library, third-party, local imports ## Project Structure - `iterable/helpers/` - Utility functions (detect, schema, utils) - `iterable/datatypes/` - Format-specific implementations - `iterable/codecs/` - Compression codec implementations - `iterable/engines/` - Processing engines (DuckDB, internal) - `iterable/convert/` - Format conversion utilities - `iterable/pipeline/` - Data pipeline processing - `tests/` - Test suite (one test file per format/feature) ## Import Patterns - Main entry: `from iterable.helpers.detect import open_iterable` - Format-specific: `from iterable.datatypes.csv import CSVIterable` - Codecs: `from iterable.codecs.gzipcodec import GZIPCodec` - Always use `open_iterable()` for user-facing code ## File Handling - Always use context managers: `with open_iterable('file.csv') as source:` - Never call `.close()` when using `with` statements - Reset iterators with `.reset()` method when needed - Handle compression automatically via filename detection ## Error Handling - Format detection failures: provide helpful error messages - Missing optional dependencies: raise clear ImportError with installation instructions - Invalid file formats: raise appropriate exceptions (ValueError, TypeError) - Always handle file I/O errors gracefully ## Code Conventions - Use `open_iterable()` for automatic format detection - Prefer bulk operations (`read_bulk`, `write_bulk`) for performance - Use DuckDB engine when appropriate (CSV, JSONL files) - Handle encoding automatically via `chardet` or user specification ## Pre-Commit Checks Before committing: 1. Run tests: `pytest --verbose` 2. Run linter: `ruff check iterable tests` 3. Format code: `ruff format iterable tests` 4. Type check: `mypy iterable` (warnings allowed) ## Quality Tools - Security: `bandit -r iterable -ll` - Dead code: `vulture iterable --min-confidence 80` - Complexity: `radon cc iterable --min B` - Coverage: `pytest --cov=iterable --cov-report=html` ## Known Constraints - DuckDB engine supports: CSV, JSONL, JSON formats and GZIP, ZStandard codecs - Large files: use streaming (iterator interface) to avoid memory issues - XML parsing: requires `iterableargs={'tagname': 'item'}` - Some formats require optional dependencies (see `pyproject.toml`)