Layout Analysis
Segmentation Strategy
Controls the segmentation strategy
The chunkr AI API allows you to specify a segmentation_strategy
for each document. This strategy controls how the document is segmented.
We have two strategies:
LayoutAnalysis
: Run our state-of-the-art layout analysis model to identify the layout elements. This is the default strategy.Page
: Each segment is a page.
This is how you can configure the segmentation strategy:
When to use each strategy
For most documents, we recommend using the LayoutAnalysis
strategy. This will give you the best results.
Use Page
for:
- Faster processing speed when you need quick results and layout isn’t critical
- Documents with unusual layouts that confuse the layout analysis model
- If the layout is complex but not very information dense,
Page
+ VLM can generate surprisingly good HTML and markdown (see Segment Processing).