Supported segment types for Excel files
The Excel parser maps identified elements to Chunkr’s existing segment types, ensuring seamless integration with your current workflow. The following segment types are supported:

| Segment Type | Description |
|---|---|
| Table | Rectangular data structures with headers and rows |
| SectionHeader | Headers that divide content into sections |
| Title | Main document or sheet titles |
| Text | Regular text content |
| Picture | Charts, graphs, images, logos |
| Footnote | References or additional information at bottom |
| PageHeader | Content at the top of each page |
| PageFooter | Content at the bottom of each page |
Excel Segment Output Structure
When processing Excel files, Chunkr returns segments with both standard properties (common to all document types) and Excel-specific metadata. Understanding this structure is crucial for working with Excel parsing results effectively.
Standard Segment Properties
Each Excel segment contains the core properties found in all Chunkr outputs:
Excel-Specific Properties: The ss_ Keys
The key difference between Excel outputs and regular document outputs is the set of ss_ (spreadsheet-specific) keys. These properties provide essential Excel context that doesn’t exist in PDF or other document formats:
| Property | Type | Description |
|---|---|---|
| ss_sheet_name | String | Name of the worksheet containing this segment |
| ss_range | String | Excel cell range (e.g., “A1:D10”) |
| ss_cells | Array | Array of Cell objects with detailed cell data |
| ss_header_range | String | Range of the header cells (if applicable) |
| ss_header_text | String | Text content of the header (if applicable) |
| ss_header_bbox | Object | Bounding box of the header (if applicable) |
| ss_header_ocr | Array | OCR results for the header (if applicable) |

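
As a minimal sketch of how the ss_ keys in the table above might be read, the snippet below pulls them out of a single segment. It assumes a segment has already been loaded as a plain Python dict (for example from a JSON response); the helper names `spreadsheet_metadata` and `is_spreadsheet_segment` are hypothetical and not part of any Chunkr SDK.

```python
from typing import Any, Dict

# Key names copied from the property table above.
SS_KEYS = (
    "ss_sheet_name",
    "ss_range",
    "ss_cells",
    "ss_header_range",
    "ss_header_text",
    "ss_header_bbox",
    "ss_header_ocr",
)


def spreadsheet_metadata(segment: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the ss_ keys that are present and not None.

    Assumption: `segment` is a dict; keys marked "if applicable" in the
    table may be missing or None, so both cases are skipped.
    """
    return {key: segment[key] for key in SS_KEYS if segment.get(key) is not None}


def is_spreadsheet_segment(segment: Dict[str, Any]) -> bool:
    """True if the segment carries at least one populated ss_ key."""
    return any(segment.get(key) is not None for key in SS_KEYS)
```

Because the ss_ fields only exist for spreadsheet files, a check like `is_spreadsheet_segment` is one way to let the same downstream code handle both Excel and non-Excel outputs.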
Why Excel Needs Special Keys
The ss_ fields exist to work with native Excel values and maintain the mapping between identified content and the original Excel structure. These fields only exist for spreadsheet files.
The ss_ keys hold state for what was identified and map it back to Excel cells and ranges, enabling you to:
- Map to original Excel cells: Access the exact cells (ss_cells) that were identified as part of each segment
- Preserve Excel ranges: Know the exact cell range (ss_range) each segment corresponds to
- Track sheet context: Identify which worksheet (ss_sheet_name) the content came from
- Maintain Excel structure: Work with native Excel data like formulas, cell values, and styling
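
The mapping back to sheets and ranges can be sketched in a few lines. The example below assumes that ss_range is a plain A1-style string such as "A1:D10" (no sheet prefix) and that segments are plain dicts; `column_index`, `parse_range`, and `segments_by_sheet` are hypothetical helper names used only for illustration.

```python
import re
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

# Matches one A1-style cell reference, optionally with "$" anchors.
_CELL = re.compile(r"^\$?([A-Za-z]{1,3})\$?(\d+)$")


def column_index(letters: str) -> int:
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_range(ss_range: str) -> Tuple[int, int, int, int]:
    """Return (first_row, first_col, last_row, last_col), all 1-based."""
    start, _, end = ss_range.partition(":")
    end = end or start  # a single cell such as "B2" is treated as a 1x1 range
    cells = []
    for ref in (start, end):
        match = _CELL.match(ref.strip())
        if match is None:
            raise ValueError(f"not an A1-style reference: {ref!r}")
        cells.append((int(match.group(2)), column_index(match.group(1))))
    (r1, c1), (r2, c2) = cells
    return min(r1, r2), min(c1, c2), max(r1, r2), max(c1, c2)


def segments_by_sheet(segments: Iterable[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group segments by ss_sheet_name, skipping segments without one."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for segment in segments:
        sheet = segment.get("ss_sheet_name")
        if sheet is not None:
            grouped[sheet].append(segment)
    return dict(grouped)
```

With the numeric bounds from `parse_range`, a segment can be lined up against the same cells in the original workbook using whichever spreadsheet library you already rely on.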
Complete Excel Segment Example
Here’s what a complete Excel Table segment looks like:
Excel Chart Example
Charts and graphs in Excel are identified as Picture segments with spreadsheet context:
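
As a rough sketch of how such chart segments might be located in a parsed result, the function below collects the sheet name and cell range of every Picture segment that carries spreadsheet context. It assumes segments are plain dicts and that the segment's type is stored under a key named `segment_type`; that key name and the helper `chart_locations` are assumptions for illustration, so adjust them to match your actual output.

```python
from typing import Any, Dict, Iterable, List, Tuple


def chart_locations(segments: Iterable[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Return (sheet name, cell range) for each Picture segment with ss_ context."""
    locations: List[Tuple[str, str]] = []
    for segment in segments:
        # Assumption: the type label lives under "segment_type".
        if segment.get("segment_type") != "Picture":
            continue
        sheet = segment.get("ss_sheet_name")
        cell_range = segment.get("ss_range")
        # Pictures from non-spreadsheet documents have no ss_ keys; skip them.
        if sheet is not None and cell_range is not None:
            locations.append((sheet, cell_range))
    return locations
```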