Start the server and visit
http://localhost:8000/docs for an interactive Swagger UI where you can try every endpoint directly in your browser.
Response Format
All chunking endpoints return a list of chunk objects, each with text, start_index, end_index, and token_count fields.
Chunkers
Token Chunker
POST /v1/chunk/token
Splits text into fixed-size token windows. The fastest and most predictable chunker.
Parameters
Text or list of texts to chunk.
Tokenizer to use. Options:
"character", "gpt2", "cl100k_base", or any HuggingFace tokenizer name.
Maximum tokens per chunk.
Token overlap between consecutive chunks.
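As a sketch, a token-chunking request body might look like the following. The JSON field names (text, tokenizer, chunk_size, chunk_overlap) are assumptions inferred from the parameter descriptions above, not a confirmed schema.

```python
import json

# Hypothetical request body for POST /v1/chunk/token.
# Field names are assumed from the parameter list above.
payload = {
    "text": "Chunking splits long documents into retrieval-sized pieces.",
    "tokenizer": "gpt2",   # or "character", "cl100k_base", any HuggingFace tokenizer
    "chunk_size": 512,     # maximum tokens per chunk
    "chunk_overlap": 64,   # tokens shared by consecutive chunks
}

# With a running server, send it with the `requests` package:
#   chunks = requests.post("http://localhost:8000/v1/chunk/token", json=payload).json()
print(json.dumps(payload, indent=2))
```

Keeping the overlap well below the chunk size avoids near-duplicate chunks.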
Sentence Chunker
POST /v1/chunk/sentence
Groups sentences into chunks while respecting a token-size limit. Preserves sentence boundaries — no mid-sentence splits.
Parameters
Text or list of texts to chunk.
Tokenizer to use.
Maximum tokens per chunk.
Token overlap between chunks.
Minimum sentences to include in each chunk.
Minimum characters required to count as a sentence.
Use approximate token counting for faster processing.
Sentence delimiter(s).
Attach the delimiter to the previous ("prev") or next ("next") sentence.
Recursive Chunker
POST /v1/chunk/recursive
Splits text using a hierarchy of separators defined by a named recipe. Great for structured text like Markdown or code. Chunker instances are cached per (recipe, lang, tokenizer) for speed.
Parameters
Text or list of texts to chunk.
Tokenizer to use.
Maximum tokens per chunk.
Named splitting recipe. Options:
"default" (paragraph → sentence → word), "markdown", "python", "js".
Language hint for the recipe.
Minimum characters to include in a chunk.
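A recursive-chunking request on a Markdown document might be sketched as follows; the field names (text, tokenizer, chunk_size, recipe, lang) are assumptions based on the parameter descriptions, not a confirmed schema.

```python
import json

# Hypothetical request body for POST /v1/chunk/recursive.
# Field names are assumed from the parameter list above.
markdown_doc = "# Title\n\nIntro paragraph.\n\n## Section\n\nBody text."
payload = {
    "text": markdown_doc,
    "tokenizer": "gpt2",
    "chunk_size": 256,
    "recipe": "markdown",  # or "default", "python", "js"
    "lang": "en",          # language hint for the recipe
}
print(json.dumps(payload))
```

Because chunker instances are cached per (recipe, lang, tokenizer), repeated requests with these same three values reuse one instance.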
Semantic Chunker
POST /v1/chunk/semantic
Splits where semantic similarity between adjacent sentences drops below a threshold. Produces topically coherent chunks. Requires the semantic extra.
Parameters
Text or list of texts to chunk.
Sentence-embedding model for computing similarity. Any model compatible with
sentence-transformers works.
Cosine-similarity threshold for splitting (0.0–1.0). Lower values produce fewer, larger chunks.
Maximum tokens per chunk.
Number of surrounding sentences to consider when computing similarity.
Minimum sentences per chunk.
Minimum characters per sentence.
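A semantic-chunking request might look like this sketch. The field names (text, embedding_model, threshold, chunk_size, similarity_window) are assumptions inferred from the descriptions above, and the endpoint itself requires the semantic extra to be installed.

```python
import json

# Hypothetical request body for POST /v1/chunk/semantic.
# Field names are assumed from the parameter list above.
payload = {
    "text": "Cats purr. Kittens nap. GDP rose last quarter. Inflation cooled.",
    "embedding_model": "minishlab/potion-base-8M",  # any sentence-transformers-compatible model
    "threshold": 0.7,        # cosine-similarity split point; lower => fewer, larger chunks
    "chunk_size": 512,
    "similarity_window": 1,  # surrounding sentences considered per comparison
}
print(json.dumps(payload))
```

With text like the above, a drop in similarity between the pet sentences and the economics sentences is where a split would be expected.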
Code Chunker
POST /v1/chunk/code
Splits source code at syntactic boundaries using AST parsing. Never breaks inside a function or class. Requires the code extra.
Parameters
Source code or list of source code snippets to chunk.
Tokenizer to use.
Maximum tokens per chunk.
Programming language. Supported:
"python", "javascript", "typescript", "java", "go", "rust", "c", "cpp", and more.
Include AST node metadata (node type, line numbers) in the chunk output.
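A code-chunking request might be sketched as below; the field names (text, language, include_nodes) are assumptions based on the descriptions above, and the endpoint requires the code extra.

```python
import json

# Hypothetical request body for POST /v1/chunk/code.
# Field names are assumed from the parameter list above.
source = "def greet(name):\n    return f'Hello, {name}!'\n"
payload = {
    "text": source,
    "tokenizer": "gpt2",
    "chunk_size": 256,
    "language": "python",   # javascript, typescript, java, go, rust, c, cpp, ...
    "include_nodes": True,  # attach AST node type and line numbers to each chunk
}
print(json.dumps(payload))
```

Because splitting happens at AST boundaries, the greet function above would never be cut in half, however small chunk_size is.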
Refineries
Refineries enrich an existing list of chunks. Pass the output of any chunker endpoint directly into a refinery.
Overlap Refinery
POST /v1/refine/overlap
Appends or prepends overlapping context from neighbouring chunks. Useful when downstream consumers need continuity across chunk boundaries.
Parameters
List of chunk objects from any chunker endpoint. Each must contain
text, start_index, end_index, and token_count.
Tokenizer to use.
Size of the overlap context. A float (0–1) is treated as a fraction of the chunk size; an integer is an absolute token count.
Strategy used to create the overlap window.
"suffix" appends context from the previous chunk; "prefix" prepends context from the next chunk.
Merge the overlap context into the chunk text field.
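An overlap-refinery request might be sketched as follows. The chunk objects carry the four required fields from the description above; the other field names (chunks, tokenizer, context_size, method, merge) are assumptions, not a confirmed schema.

```python
import json

# Hypothetical request body for POST /v1/refine/overlap.
# Top-level field names are assumed from the parameter list above.
chunks = [
    {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
    {"text": "Second chunk.", "start_index": 12, "end_index": 25, "token_count": 3},
]
payload = {
    "chunks": chunks,
    "tokenizer": "gpt2",
    "context_size": 0.25,  # float in (0, 1): fraction of chunk size; an int would mean absolute tokens
    "method": "suffix",    # per the description above: append context from the previous chunk
    "merge": True,         # fold the overlap context into each chunk's text field
}
print(json.dumps(payload))
```

Passing 0.25 here requests an overlap of a quarter of each chunk's size; passing 64 instead would request exactly 64 tokens.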
Embeddings Refinery
POST /v1/refine/embeddings
Computes and attaches embeddings to each chunk via Chonkie’s AutoEmbeddings. Each chunk in the response gains an embedding field containing a list of floats.
Local models (e.g. minishlab/potion-base-8M) run entirely on-device and require no API key. API-based models require the appropriate environment variable for your provider.
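A request using a local model might be sketched as below; the field names (chunks, embedding_model) are assumptions inferred from the surrounding descriptions, not a confirmed schema.

```python
import json

# Hypothetical request body for POST /v1/refine/embeddings with a local
# model2vec model, which needs no API key. Field names are assumed.
payload = {
    "chunks": [
        {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
    ],
    "embedding_model": "minishlab/potion-base-8M",
}
# Each chunk in the response gains an `embedding` field (a list of floats).
print(json.dumps(payload))
```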
Embeddings Providers
| Type | Example Model | Requirement |
|---|---|---|
| Local (model2vec) | minishlab/potion-base-8M, minishlab/potion-retrieval-32M | None |
| OpenAI | text-embedding-3-small, text-embedding-3-large | OPENAI_API_KEY |
| Cohere | embed-english-v3.0, embed-multilingual-v3.0 | COHERE_API_KEY |
| Voyage AI | voyage-large-2, voyage-code-2 | VOYAGE_API_KEY |
Parameters
Batch Processing
Send a list of strings to process multiple documents in one request.
Chaining Chunkers and Refineries
Pipeline example: chunk semantically, then add overlap context.
Error Handling
| Status | Meaning |
|---|---|
| 200 | Success |
| 400 | Invalid request parameters or chunk format |
| 500 | Internal error (missing extras, model loading failure, etc.) |
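A client can map these documented status codes onto exceptions. This sketch assumes nothing about the response body beyond what the table states.

```python
def handle_response(status_code, body):
    """Translate the API's documented status codes into caller-facing results."""
    if status_code == 200:
        return body  # list of chunk objects
    if status_code == 400:
        raise ValueError(f"Invalid request parameters or chunk format: {body!r}")
    if status_code == 500:
        raise RuntimeError(f"Server error (missing extras, model loading failure): {body!r}")
    raise RuntimeError(f"Undocumented status {status_code}: {body!r}")
```

Treating 400 and 500 as distinct exception types lets callers retry or surface configuration problems (such as a missing extra) appropriately.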
