Authorizations
Body
JSON request to create a parse task
The file to be parsed. Supported inputs:
ch://files/{file_id}: Reference to an existing file. Upload via the Files API.
http(s)://...: Remote URL to fetch.
data:*;base64,... or raw base64 string.
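The three input forms above can be told apart by their prefix. The following is a minimal sketch of helpers that build and classify such strings; the function names are hypothetical and not part of the API, and the default MIME type is only an assumption for illustration.

```python
import base64


def file_reference(file_id: str) -> str:
    """Build a ch://files/{file_id} reference to a previously uploaded file."""
    return f"ch://files/{file_id}"


def inline_base64(data: bytes, mime_type: str = "application/pdf") -> str:
    """Encode raw bytes as a data:*;base64,... string (MIME type is assumed)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def classify_input(value: str) -> str:
    """Guess which of the documented input forms a string uses."""
    if value.startswith("ch://files/"):
        return "file_reference"
    if value.startswith(("http://", "https://")):
        return "remote_url"
    if value.startswith("data:") and ";base64," in value:
        return "data_uri"
    # Anything else is treated as a raw base64 string.
    return "raw_base64"
```

A caller might use `classify_input` to log which form is being sent before submitting the request.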
Controls the settings for chunking and for post-processing each chunk.
Controls how errors are handled during processing:
Fail: Stops processing and fails the task when any error occurs.
Continue: Attempts to continue processing despite non-critical errors (e.g., LLM refusals).
Allowed values: Fail, Continue
Controls the Optical Character Recognition (OCR) strategy.
All: Processes all pages with OCR. (Latency penalty: ~0.5 seconds per page.)
Auto: Selectively applies OCR only to pages with missing or low-quality text. When a text layer is present, the bounding boxes from the text layer are used.
Allowed values: All, Auto
Allowed values: Azure, Chunkr
Configuration for how each document segment is processed and formatted.
Each segment has sensible defaults, but you can override specific settings:
format: Output as Html or Markdown
strategy: Auto (rule-based), LLM (AI-generated), or Ignore (skip)
crop_image: Whether to crop images to segment bounds
extended_context: Use full page as context for LLM processing
description: Generate descriptions for segments
Defaults per segment type: Check the documentation for more details.
Only specify the fields you want to change - everything else uses the defaults.
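As an illustration of overriding only selected settings, here is a small sketch that builds a per-segment override mapping and checks it against the values named above. The setting names and the `format` and `strategy` values come from this page; keying the mapping by segment type name (`Table`, `Formula`) and the helper `check_overrides` are assumptions made for the example, not the documented request schema.

```python
ALLOWED_FORMATS = {"Html", "Markdown"}
ALLOWED_STRATEGIES = {"Auto", "LLM", "Ignore"}
KNOWN_SETTINGS = {
    "format", "strategy", "crop_image", "extended_context", "description",
}


def check_overrides(overrides: dict) -> dict:
    """Validate a {segment_type: {setting: value}} mapping (hypothetical shape)."""
    for segment_type, settings in overrides.items():
        unknown = set(settings) - KNOWN_SETTINGS
        if unknown:
            raise ValueError(f"{segment_type}: unknown settings {sorted(unknown)}")
        if "format" in settings and settings["format"] not in ALLOWED_FORMATS:
            raise ValueError(f"{segment_type}: invalid format {settings['format']!r}")
        if "strategy" in settings and settings["strategy"] not in ALLOWED_STRATEGIES:
            raise ValueError(f"{segment_type}: invalid strategy {settings['strategy']!r}")
    return overrides


# Only the fields being changed are listed; everything else keeps its default.
overrides = check_overrides({
    "Table": {"format": "Html", "strategy": "LLM"},
    "Formula": {"strategy": "Ignore"},
})
```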
Controls the segmentation strategy:
LayoutAnalysis: Analyzes pages for layout elements (e.g., Table, Picture, Formula) using bounding boxes. Provides fine-grained segmentation and better chunking.
Page: Treats each page as a single segment. Faster processing, but without layout element detection and with only simple chunking.
Allowed values: LayoutAnalysis, Page
The number of seconds until the task is deleted. Expired tasks cannot be updated, polled, or accessed via the web interface.
The name of the file to be parsed. If not set, a name will be generated.
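The body fields described above could be assembled along the following lines. This is only a sketch: the enum values are taken from this page, but the JSON key names used here (`file`, `error_handling`, `ocr_strategy`, `segmentation_strategy`, `expires_in`, `file_name`) and the helper `build_parse_request` are assumptions for illustration and should be checked against the actual request schema.

```python
import json

ERROR_HANDLING = {"Fail", "Continue"}
OCR_STRATEGY = {"All", "Auto"}
SEGMENTATION_STRATEGY = {"LayoutAnalysis", "Page"}


def build_parse_request(file, *, error_handling=None, ocr_strategy=None,
                        segmentation_strategy=None, expires_in=None,
                        file_name=None) -> dict:
    """Assemble a request body, omitting anything left at its default."""
    body = {"file": file}  # key names are assumed, not confirmed by this page
    choices = [
        ("error_handling", error_handling, ERROR_HANDLING),
        ("ocr_strategy", ocr_strategy, OCR_STRATEGY),
        ("segmentation_strategy", segmentation_strategy, SEGMENTATION_STRATEGY),
    ]
    for key, value, allowed in choices:
        if value is None:
            continue
        if value not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
        body[key] = value
    if expires_in is not None:
        if expires_in <= 0:
            raise ValueError("expires_in must be a positive number of seconds")
        body["expires_in"] = expires_in
    if file_name is not None:
        body["file_name"] = file_name
    return body


body = build_parse_request(
    "https://example.com/report.pdf",
    ocr_strategy="Auto",
    segmentation_strategy="LayoutAnalysis",
    expires_in=3600,
)
payload = json.dumps(body)  # JSON text to send as the request body
```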
Response
Task created successfully.
True when the task reaches a terminal state, i.e., when its status is Succeeded, Failed, or Cancelled.
The date and time when the task was created and queued.
Information about the input file.
A message describing the task's status or any errors that occurred.
The status of the task.
Allowed values: Starting, Processing, Succeeded, Failed, Cancelled
The unique identifier for the task.
Allowed values: Parse, Extract
Version information for the task.
The date and time when the task will expire.
The date and time when the task was finished.
The presigned URL of the input file.
Deprecated: use file_info.url instead.
The processed results of a document parsing task.
The date and time when the task was started.
The presigned URL of the task.
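Since a task is finished once its status is Succeeded, Failed, or Cancelled, a client will typically poll until one of those is seen. The sketch below shows that check and a simple polling loop. `fetch_status` is a hypothetical caller-supplied function that returns the current status string; no particular endpoint or client library is assumed.

```python
import time
from typing import Callable

ALL_STATUSES = frozenset(
    {"Starting", "Processing", "Succeeded", "Failed", "Cancelled"}
)
TERMINAL_STATUSES = frozenset({"Succeeded", "Failed", "Cancelled"})


def is_terminal(status: str) -> bool:
    """Return True when the status is one of the documented terminal states."""
    if status not in ALL_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return status in TERMINAL_STATUSES


def wait_for_task(fetch_status: Callable[[], str],
                  interval_s: float = 1.0,
                  max_attempts: int = 60) -> str:
    """Call fetch_status until a terminal status is returned, then return it."""
    for _ in range(max_attempts):
        status = fetch_status()
        if is_terminal(status):
            return status
        time.sleep(interval_s)
    raise TimeoutError("task did not reach a terminal state in time")
```

Keeping the polling budget well below the task's expiry time avoids polling a task that has already been deleted.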