- Author: avaxia888
- Repository: avaxia8/ade_claude_skills
- Version: 20260205191958
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/avaxia8/ade_claude_skills
- Web: https://mule.run/skillshub/@@avaxia8/ade_claude_skills~unknown:20260205191958

---

# LandingAI ADE Python Claude Skill

## 1. SETUP

```python
# Install (run in a shell, not in Python):
#   pip install landingai-ade

# Import required modules
from pathlib import Path
from landingai_ade import LandingAIADE

# Initialize client
client = LandingAIADE()
file_path = Path("my_document.pdf")
```

## 2. PARSE

```python
# Parse a document
parsed_response = client.parse(
    document=file_path,
    split="page",               # optional
    save_to="./output_folder"   # optional: saves as {input_file}_parse_output.json
)

# View the markdown output
print(parsed_response.markdown)
# The data will be returned as key-value pairs separated by line breaks (\n).
```

## 3. SPLIT: Separate Mixed Docs

```python
# Split documents by type
response = client.split(
    markdown=parsed_response.markdown,  # pass the markdown text, not the whole response object
    split_class=[
        {"name": "Invoice", "identifier": "ID"},
        {"name": "Receipt", "identifier": "Date"}
    ]
)
```

## 4. EXTRACT

```python
from pydantic import BaseModel
import json

# Define your schema
class InvoiceSchema(BaseModel):
    invoice_number: str
    total: float

# Get the JSON schema and serialize it to a string
schema = json.dumps(InvoiceSchema.model_json_schema())

# Extract structured data
extract_response = client.extract(
    markdown=parsed_response.markdown,  # pass the markdown text, not the whole response object
    schema=schema,
    save_to="./output_folder"   # optional: saves as {input_file}_extract_output.json
)
```

## 5. Quicker Processing (Async)

```python
import asyncio
from landingai_ade import AsyncLandingAIADE

# Use the async client for better performance
async def main():
    # `async with` is only valid inside a coroutine
    async with AsyncLandingAIADE() as async_client:
        response = await async_client.parse(document=file_path)
        print(response.markdown)

asyncio.run(main())
```

## 6. Heavy Lifting (Parse Jobs)

```python
# 1. Create the job
job = client.parse_jobs.create(
    document=file_path,
    model="dpt-2-latest"
)

# 2. Check status (read it from the fetched job, not from the create response)
job_status = client.parse_jobs.get(job.job_id)
print(f"Status: {job_status.status}")

# 3. List all jobs
all_jobs = client.parse_jobs.list(status="completed")
```

## 7. Key Response Structures

### Parse Response

| Field | Type | Description |
|-------|------|-------------|
| `markdown` | string | Complete document as markdown. Uses `<a>` tags as anchors. |
| `chunks` | list[dict] | An array of individual content blocks. Each chunk includes an `id`, its own `markdown`, a `type`, and a `grounding` object. |
| `splits` | list[dict] | Populated if `split="page"` was used. Each object includes `class` (e.g., "page"), `identifier` (e.g., "page_0"), `pages` (e.g., `[0]`), `markdown`, and `chunks` (a list of chunk IDs belonging to that split). |
| `grounding` | dict | Maps chunk IDs to detailed grounding information, including page number, bounding box coordinates, and a detailed classification of elements (e.g., `chunkText`, `tableCell`). Grounding enables precise mapping of content back to its location in the original document. |
| `metadata` | dict | Processing metadata including `filename`, organization ID (`org_id`), `page_count`, processing time (`duration_ms`), API credit usage (`credit_usage`), and model `version`. Also includes a `failed_pages` array listing the page numbers that failed to process. |

```python
# Each chunk has:
chunks[].id              # Unique identifier
chunks[].type            # Chunk type (text, table, figure, etc.)
chunks[].markdown        # Chunk content
chunks[].grounding.page  # Page number (0-indexed)
chunks[].grounding.box   # Bounding box (left, top, right, bottom), normalized 0-1

# Grounding dict has:
grounding.{id}.page      # Page number
grounding.{id}.type      # Grounding type (chunkText, chunkTable, etc.)
grounding.{id}.box       # Bounding box coordinates

# Grounding object keys can be:
# - Chunk IDs: UUID format (e.g., ea3ffbfc-dce9-4d90-a259-2df8fe7f067d)
# - Table IDs: Format {page}-{id} (e.g., 0-1, 0-m)
# - Table cell IDs: Format {page}-{id} (e.g., 0-2, 0-n)

# Table cells have an additional position field:
grounding.{cell_id}.position = {
    row: 0,           # Zero-indexed row
    col: 0,           # Zero-indexed column
    rowspan: 1,       # Number of rows spanned
    colspan: 2,       # Number of columns spanned
    chunk_id: "..."   # Associated chunk ID
}
```

### Extract Response

```python
response.extraction            # Extracted key-value pairs
response.extraction_metadata   # Contains "references" linking to chunk IDs
response.metadata              # Processing metadata
```

### Split Response

```python
response.splits        # Array of classified splits
split.classification   # Assigned class name
split.identifier       # Grouping identifier
split.markdowns        # Array of markdown content
split.pages            # Page numbers
```

# Chunk Types vs Grounding Types in ADE

## Two Type Systems

**Chunks** (`chunks[].type`) - Document structure:
- `text`, `table`, `figure`, `marginalia`, `logo`, `card`, `attestation`, `scan_code`

**Groundings** (`grounding.{id}.type`) - Precise elements with coordinates:
- `chunkText`, `chunkTable`, `chunkFigure`, `chunkMarginalia`, `chunkLogo`, `chunkCard`, `chunkAttestation`, `chunkScanCode`, `chunkForm`
- Plus exclusive types: `table`, `tableCell`

## Key Relationship

**Prefix Rule:** Remove the "chunk" prefix to map a grounding type to its chunk type
- `chunkText` → `text` chunk
- `chunkTable` → `table` chunk

**Important Distinction:**
- Chunk `table` = "region with tabular content" (1 chunk)
- Grounding `table` = "actual table structure" (possibly multiple per chunk)
- Grounding `tableCell` = individual cells (only at grounding level)

## When to Use

**Use chunk types for:**
- Document structure analysis
- Content type counting
- High-level overview

**Use grounding types for:**
- Bounding box coordinates
- Cell-level extraction
- Precise element positioning

## Common Confusion

Users ask why both exist:

**Answer** → Different granularity. Chunks give the overview; groundings give precise locations with coordinates.

**Real Example:** One table chunk with 2 tables, each with 2 cells:

```python
# In chunks array - ONE chunk
chunk = {
    "id": "e59fb76c-9fad-4cfa",
    "type": "table",
    "markdown": "