segmentation_strategy for each document. This strategy controls how the document is segmented.
We have two strategies:
- LayoutAnalysis: Run our state-of-the-art layout analysis model to identify the layout elements. This is the default strategy.
- Page: Each segment is a page.
When to use each strategy
For most documents, we recommend using the LayoutAnalysis strategy. It generally gives the best results.
Use Page for:
- Faster processing speed when you need quick results and layout isn’t critical
- Documents with unusual layouts that confuse the layout analysis model
- Layouts that are complex but not very information dense, where Page + VLM can generate surprisingly good HTML and Markdown (see Segment Processing)
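
The guidance above can be sketched as a small decision helper. This is only an illustration: the function names and flag names are hypothetical and not part of any SDK, and the request shape is an assumption. Only the segmentation_strategy field and its two values, LayoutAnalysis and Page, come from the text above.

```python
from typing import Literal

# The two strategy names described in this section.
Strategy = Literal["LayoutAnalysis", "Page"]


def choose_segmentation_strategy(
    prioritize_speed: bool = False,
    unusual_layout: bool = False,
    complex_but_sparse: bool = False,
) -> Strategy:
    """Hypothetical helper that mirrors the recommendations above.

    Returns "Page" when any of the listed conditions applies,
    otherwise falls back to the default, "LayoutAnalysis".
    """
    if prioritize_speed or unusual_layout or complex_but_sparse:
        return "Page"
    return "LayoutAnalysis"


def build_config(strategy: Strategy) -> dict:
    """Hypothetical helper: wrap the choice in a plain dict.

    The surrounding configuration shape is an assumption; only the
    `segmentation_strategy` key is taken from the documentation.
    """
    return {"segmentation_strategy": strategy}


# Example: a scanned form whose layout confuses the layout model.
config = build_config(choose_segmentation_strategy(unusual_layout=True))
```

Keeping the choice in one function makes the default explicit: unless a document matches one of the listed cases, the configuration stays on LayoutAnalysis.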