Quick Summary: Most configuration options work identically to other file types. OCR and pipeline settings are ignored because Excel files use native processing.
Configuration Options Overview
Excel configuration options fall into two categories:
| Category | Options | Behavior |
|---|---|---|
| Work Normally | Segmentation, Segment Processing, Chunking, LLM Processing, Error Handling, Expiration | Same as other file types with minor Excel-specific notes |
| Ignored | OCR Strategy, Pipeline Provider | No effect on Excel processing |
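As a rough illustration of the split in the table above, the sketch below partitions a configuration dictionary into options that take effect for Excel and options that are ignored. The key names (`ocr_strategy`, `pipeline`, and so on) and the helper `split_excel_config` are hypothetical, chosen only for illustration; they are not taken from the Chunkr SDK.

```python
# Hypothetical option keys, for illustration only; the real SDK names may differ.
IGNORED_FOR_EXCEL = {"ocr_strategy", "pipeline"}


def split_excel_config(config):
    """Return (effective, ignored) views of a config dict for an Excel file."""
    effective = {k: v for k, v in config.items() if k not in IGNORED_FOR_EXCEL}
    ignored = {k: v for k, v in config.items() if k in IGNORED_FOR_EXCEL}
    return effective, ignored


config = {
    "segmentation_strategy": "LayoutAnalysis",
    "ocr_strategy": "All",
    "pipeline": "Azure",
    "error_handling": "Continue",
}
effective, ignored = split_excel_config(config)
print("Takes effect:", sorted(effective))
print("Ignored for Excel:", sorted(ignored))
```

A check like this can be useful for warning users that a setting they supplied will have no effect on an Excel upload.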
Options That Work Normally
These configuration options work the same as with other file types, with some Excel-specific behavior noted below.
Segmentation Strategy
Controls how Excel sheets are analyzed and segmented.
- LayoutAnalysis (Recommended): Runs Excel layout analysis to identify tables, charts, and text regions
- Page: Outputs each full Excel sheet as a single Table segment
Segment Processing
Configure how different segment types are processed and formatted.
- Tables: The strategy field (Auto/LLM) is ignored - tables are always extracted natively from Excel
- All Other Segments: Picture, Text, Title, etc. work exactly as with other file types
Chunk Processing
Controls how content is divided into chunks for RAG applications.
- Works the same as other file types
- Important: Chunks break at sheet boundaries (unlike PDFs, where chunks can span pages)
- Each Excel worksheet is treated as a boundary for chunking
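The sheet-boundary rule above can be pictured with a minimal sketch. It assumes segments arrive as hypothetical `(sheet_name, text)` pairs and uses a simple word count as a stand-in for the real chunk-length measure; neither is taken from Chunkr's implementation.

```python
def chunk_by_sheet(segments, max_words):
    """Group (sheet_name, text) segments into chunks that never span sheets."""
    chunks = []
    current, current_sheet, count = [], None, 0
    for sheet, text in segments:
        words = len(text.split())
        # Close the open chunk on a sheet change or when the word budget is exceeded.
        if current and (sheet != current_sheet or count + words > max_words):
            chunks.append((current_sheet, current))
            current, count = [], 0
        current.append(text)
        current_sheet = sheet
        count += words
    if current:
        chunks.append((current_sheet, current))
    return chunks


segments = [
    ("Sales", "Q1 revenue table"),
    ("Sales", "Q2 revenue table"),
    ("Costs", "Operating costs summary"),
]
chunks = chunk_by_sheet(segments, max_words=100)
for sheet, texts in chunks:
    print(sheet, texts)
```

Even though all three segments fit within the word budget, the result is two chunks, because the change from the `Sales` sheet to the `Costs` sheet forces a break.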
LLM Processing
Configure custom models and prompts for content generation.
- Works exactly the same as other file types
- Affects segment processing only
- Can be combined with segment-specific LLM prompts
Error Handling Strategy
Controls how processing errors are handled.
- Fail: Stop processing on any error
- Continue: Continue processing despite non-critical errors
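The difference between the two strategies can be sketched in plain Python. This is an illustration of the general pattern only: the function `process_segments` is hypothetical, and treating `ValueError` as the "non-critical" error class is an assumption made for the example, not a description of how Chunkr classifies errors.

```python
def process_segments(segments, handler, strategy="Fail"):
    """Apply handler to each segment, honoring a Fail or Continue strategy."""
    results, errors = [], []
    for segment in segments:
        try:
            results.append(handler(segment))
        except ValueError as exc:  # assumed stand-in for a non-critical error
            if strategy == "Fail":
                raise  # stop at the first error
            errors.append((segment, str(exc)))  # record it and keep going
    return results, errors


def parse_number(cell):
    # float() raises ValueError for text such as "n/a"
    return float(cell)


results, errors = process_segments(["1.5", "n/a", "3"], parse_number, strategy="Continue")
print(results)
print(len(errors), "segment(s) skipped")
```

With `strategy="Fail"`, the same input raises on the second cell and no results are returned.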
Expiration Time
Sets how long task results are retained before deletion.
Options That Are Ignored
These configuration options have no effect when processing Excel files because Excel uses native processing methods.
OCR Strategy
Ignored for Excel files - Excel files contain native text data, so OCR is never applied regardless of this setting.
The available strategy values (All, Auto) are ignored since Excel files provide native text extraction.
Pipeline Provider (Azure Feature)
Ignored for Excel files - Excel files always use Chunkr’s native processing pipeline.