# crt-data-extractor

> Automates the extraction, harmonization, and validation of Manure Management System (MMS) data from UNFCCC Common Reporting Tables (CRTs). It specifically targets Table 3.B(a) to extract livestock population, allocation fractions, and Methane Conversion Factors (MCF) for the GLEAM model.

- Author: yushanli-33
- Repository: yushanli-33/Data-extraction
- Version: 20260126142235
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/yushanli-33/Data-extraction
- Web: https://mule.run/skillshub/@@yushanli-33/Data-extraction~crt-data-extractor:20260126142235

---

---
name: crt-data-extractor
description: Automates the extraction, harmonization, and validation of Manure Management System (MMS) data from UNFCCC Common Reporting Tables (CRTs). It specifically targets Table 3.B(a) to extract livestock population, allocation fractions, and Methane Conversion Factors (MCF) for the GLEAM model.
---

# CRT Data Extractor

This skill assists in processing National Inventory Documents (NIDs) and Common Reporting Tables (CRTs) to extract data for the Global Livestock Environmental Assessment Model (GLEAM).

## Workflow

### 1. Document Identification & Ingestion

The user provides CRT files (typically Excel) or NIDs (PDFs).

* **Target File Pattern:** `[ISO3]-CRT-[YEAR]-V[VER]-....xlsx` (e.g., `KOR-CRT-2024-V0.24...`).
* **Primary Target Table:** **Table 3.B(a)** "Sectoral Background Data for Agriculture - Manure Management".
* **Secondary Target:** If CRT is unavailable, search NID text for "Manure Management" or "MMS" keywords.

### 2. Extraction Strategy

Use the bundled script `scripts/parse_crt_excel.py` for deterministic extraction from Excel files.

If extracting from PDF/Image:

1. **Locate Table 3.B(a)**: Look for headers "Allocation by climate region" and "Methane conversion factors".
2. **Identify Livestock Categories**: Extract rows for Dairy cattle, Non-dairy cattle, Sheep, Swine, Buffalo, and Poultry.
3. **Extract Columns**:
    * **Population**: Animal counts.
    * **Allocation (%)**: Fraction of manure managed in specific systems (Lagoon, Liquid/Slurry, Solid Storage, etc.) split by climate (Cool, Temperate, Warm).
    * **MCF**: Methane Conversion Factors associated with each system/climate.

### 3. Data Harmonization

Raw extraction must be mapped to the IPCC-aligned MMS typology.

* **Reference**: Load `references/mms_typology_map.md` to map local terms (e.g., "Deep Bedding") to standard categories (e.g., "Solid Storage").
* **Logic**: If a category is ambiguous, flag it for "Human-in-the-loop" validation.

### 4. Validation Rules

Run the following checks on extracted data:

* **Sum Check**: Allocation fractions for a single species/climate zone must sum to 100% (allow a tolerance of ±2%).
* **Consistency**: MCF values should be within the standard IPCC range (0–100).
* **Completeness**: Ensure data exists for all reported years to enable trend analysis.

### 5. Output Formatting

Format the final output according to `references/gleam_mms_schema.md` for SQL database compatibility.

---

*See `scripts/parse_crt_excel.py` for the extraction logic and `references/gleam_mms_schema.md` for the output definitions.*
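
As a rough illustration of the three validation rules in section 4, the sketch below checks a small list of extracted cells. It is not the logic of `scripts/parse_crt_excel.py`. The record layout (`year`, `species`, `climate`, `system`, `allocation_pct`, `mcf`), the function names, and the sample values are all hypothetical, and the ±2% tolerance is assumed to mean two percentage points around 100.

```python
from collections import defaultdict

# Assumption: the +/-2% tolerance is read as 2 percentage points around 100.
SUM_TOLERANCE = 2.0


def check_allocation_sums(records, tolerance=SUM_TOLERANCE):
    """Sum Check: return (year, species, climate) groups not summing to ~100%."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["year"], r["species"], r["climate"])] += r["allocation_pct"]
    return {key: total for key, total in totals.items()
            if abs(total - 100.0) > tolerance}


def check_mcf_range(records):
    """Consistency: return records whose MCF falls outside 0-100."""
    return [r for r in records if not 0.0 <= r["mcf"] <= 100.0]


def check_completeness(records, expected_years):
    """Completeness: return, per species, the expected years with no data."""
    seen = defaultdict(set)
    for r in records:
        seen[r["species"]].add(r["year"])
    expected = set(expected_years)
    return {species: sorted(expected - years)
            for species, years in seen.items() if expected - years}


# Made-up sample cells, for illustration only (not real inventory values).
records = [
    {"year": 2021, "species": "Dairy cattle", "climate": "Temperate",
     "system": "Liquid/Slurry", "allocation_pct": 60.0, "mcf": 35.0},
    {"year": 2021, "species": "Dairy cattle", "climate": "Temperate",
     "system": "Solid Storage", "allocation_pct": 39.0, "mcf": 4.0},
    {"year": 2022, "species": "Swine", "climate": "Warm",
     "system": "Lagoon", "allocation_pct": 70.0, "mcf": 120.0},
    {"year": 2022, "species": "Swine", "climate": "Warm",
     "system": "Solid Storage", "allocation_pct": 35.0, "mcf": 5.0},
]

bad_sums = check_allocation_sums(records)
bad_mcf = check_mcf_range(records)
missing_years = check_completeness(records, expected_years=[2021, 2022])
```

Each check returns the offending items rather than raising an error, so a failed check can be routed to the human-in-the-loop review described in section 3 instead of stopping the run.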