Different applications have different needs, so the Chunkr AI API is designed to be flexible and customizable. It supports the following configuration options:
  • chunk_processing: Controls chunking and the post-processing of each chunk.
  • expires_in: The number of seconds until the task is deleted.
  • ocr_strategy: Controls the Optical Character Recognition (OCR) strategy.
  • pipeline: Options for layout analysis and OCR providers.
  • segment_processing: Controls the post-processing of each segment type. Allows you to generate HTML and Markdown and to run custom VLM prompts.
  • segmentation_strategy: Controls the segmentation strategy.
These options can be combined to create a customized processing pipeline. The configuration is passed through the Configuration object when a Task is created. For example, the following runs a custom VLM prompt on each picture in a document:
from chunkr_ai import Chunkr
from chunkr_ai.models import (
    Configuration,
    GenerationConfig,
    GenerationStrategy,
    SegmentFormat,
    SegmentProcessing,
)

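# Create the client.
chunkr = Chunkr()

# Upload the file with a configuration that applies only to picture segments:
# each picture is processed with the LLM strategy using the custom prompt below.
chunkr.upload("path/to/file", Configuration(
    segment_processing=SegmentProcessing(
        picture=GenerationConfig(
            format=SegmentFormat.MARKDOWN,
            strategy=GenerationStrategy.LLM,
            llm="Does this picture have a cat in it? Answer must be true or false."
        )
    ),
))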
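
Because several options are combined in one configuration, a misspelled option name is easy to miss. The following is a minimal sketch of a sanity check, assuming the configuration is held as a plain Python dict before it is turned into a Configuration object. The six option names come from the list above; the helper name unknown_options, the dict layout, and the example values are hypothetical and are not part of the Chunkr SDK.

```python
# Top-level option names, taken from the list of configuration options above.
KNOWN_OPTIONS = frozenset({
    "chunk_processing",
    "expires_in",
    "ocr_strategy",
    "pipeline",
    "segment_processing",
    "segmentation_strategy",
})


def unknown_options(config: dict) -> list[str]:
    """Return the top-level keys of `config` that are not documented options.

    Hypothetical helper for illustration; it only checks key names and does
    not validate the values.
    """
    return sorted(set(config) - KNOWN_OPTIONS)


# Illustrative values only: the nesting mirrors the Python example above,
# not a confirmed request format.
config = {
    "expires_in": 3600,  # seconds until the task is deleted
    "segment_processing": {
        "picture": {"llm": "Does this picture have a cat in it?"},
    },
    "segmentaton_strategy": "placeholder",  # deliberate typo in the key
}

problems = unknown_options(config)
if problems:
    print("Unknown configuration options:", ", ".join(problems))
```

Here the check reports segmentaton_strategy, since that key is not one of the six documented names. A check like this only guards against misspelled keys; the accepted values for each option still need to be taken from the API reference.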