Start the server and visit http://localhost:8000/docs for an interactive Swagger UI where you can try every endpoint directly in your browser.

Response Format

All chunking endpoints return a list of chunk objects:
[
  {
    "text": "chunk content",
    "start_index": 0,
    "end_index": 42,
    "token_count": 8
  }
]
Submit a list of strings instead of a single string to get back a list of lists — one inner list per input document.
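Assuming start_index and end_index are character offsets into the input (as the field names suggest), a chunk's text can be recovered by slicing the original string. A minimal check of that invariant:

```python
def validate_chunk(original_text, chunk):
    # start_index/end_index are treated here as character offsets into
    # the input, so slicing the original should reproduce the chunk text.
    return original_text[chunk["start_index"]:chunk["end_index"]] == chunk["text"]
```

This is a sketch of the assumed offset semantics, not part of the API itself.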

Chunkers

Token Chunker

POST /v1/chunk/token Splits text into fixed-size token windows. The fastest and most predictable chunker.
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here...",
    "chunk_size": 512,
    "chunk_overlap": 50
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use. Options: "character", "gpt2", "cl100k_base", or any HuggingFace tokenizer name.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
chunk_overlap
integer
default:"0"
Token overlap between consecutive chunks.
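The interaction between chunk_size and chunk_overlap can be sketched in plain Python: consecutive windows share chunk_overlap tokens, so each window's start advances by chunk_size minus chunk_overlap. This is an illustration of the windowing idea, not the library's implementation:

```python
def token_windows(tokens, chunk_size=512, chunk_overlap=0):
    # Each window holds up to chunk_size tokens; consecutive windows
    # share chunk_overlap tokens, so starts advance by the stride.
    stride = chunk_size - chunk_overlap
    if stride <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), stride)]
```

For example, 10 tokens with chunk_size=4 and chunk_overlap=1 yield windows starting at positions 0, 3, 6, and 9, each repeating the last token of its predecessor.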

Sentence Chunker

POST /v1/chunk/sentence Groups sentences into chunks while respecting a token-size limit. Preserves sentence boundaries — no mid-sentence splits.
curl -X POST http://localhost:8000/v1/chunk/sentence \
  -H "Content-Type: application/json" \
  -d '{
    "text": "First sentence. Second sentence. Third sentence.",
    "chunk_size": 256,
    "min_sentences_per_chunk": 2
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
chunk_overlap
integer
default:"0"
Token overlap between chunks.
min_sentences_per_chunk
integer
default:"1"
Minimum sentences to include in each chunk.
min_characters_per_sentence
integer
default:"12"
Minimum characters required to count as a sentence.
approximate
boolean
default:"false"
Use approximate token counting for faster processing.
delim
string | string[]
default:"[\"\\n\", \". \", \"! \", \"? \"]"
Sentence delimiter(s).
include_delim
"prev" | "next"
default:"\"prev\""
Attach the delimiter to the previous ("prev") or next ("next") sentence.
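The grouping behaviour behind chunk_size and min_sentences_per_chunk can be sketched as a simple packing loop. This is a simplified illustration (no overlap, pre-tokenized sentence counts), not the chunker's actual code:

```python
def pack_sentences(sentences, token_counts, chunk_size=512, min_sentences=1):
    # Greedily pack whole sentences into chunks; start a new chunk only
    # when adding the next sentence would exceed chunk_size AND the
    # current chunk already holds at least min_sentences sentences.
    chunks, current, current_tokens = [], [], 0
    for sentence, n_tokens in zip(sentences, token_counts):
        if current and current_tokens + n_tokens > chunk_size and len(current) >= min_sentences:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Note how min_sentences can force a chunk past chunk_size: sentence boundaries are never violated, so a chunk grows until it satisfies both constraints.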

Recursive Chunker

POST /v1/chunk/recursive Splits text using a hierarchy of separators defined by a named recipe. Great for structured text like Markdown or code. Chunker instances are cached per (recipe, lang, tokenizer) for speed.
curl -X POST http://localhost:8000/v1/chunk/recursive \
  -H "Content-Type: application/json" \
  -d '{
    "text": "# Heading\n\nParagraph one.\n\nParagraph two.",
    "chunk_size": 256,
    "recipe": "markdown"
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
recipe
string
default:"\"default\""
Named splitting recipe. Options: "default" (paragraph → sentence → word), "markdown", "python", "js".
lang
string
default:"\"en\""
Language hint for the recipe.
min_characters_per_chunk
integer
default:"24"
Minimum characters to include in a chunk.
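The "hierarchy of separators" idea can be sketched recursively: try the coarsest separator first, and only descend to finer ones for pieces that are still too large. This character-based sketch ignores merging and min_characters_per_chunk, so it is an illustration of the strategy rather than the chunker itself:

```python
def recursive_split(text, separators, chunk_size):
    # If the text fits (or we have run out of separators), keep it whole;
    # otherwise split on the coarsest separator and recurse on each piece
    # with the remaining, finer-grained separators.
    if len(text) <= chunk_size or not separators:
        return [text]
    head, rest = separators[0], separators[1:]
    pieces = [p for p in text.split(head) if p]
    out = []
    for piece in pieces:
        out.extend(recursive_split(piece, rest, chunk_size))
    return out
```

A recipe like "default" corresponds to a separator list such as ["\n\n", ". ", " "] (paragraph, then sentence, then word).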

Semantic Chunker

POST /v1/chunk/semantic Splits where semantic similarity between adjacent sentences drops below a threshold. Produces topically coherent chunks. Requires the semantic extra.
curl -X POST http://localhost:8000/v1/chunk/semantic \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Dogs are loyal and friendly pets. Cats are independent animals. Quantum physics studies subatomic particles.",
    "embedding_model": "minishlab/potion-base-8M",
    "threshold": 0.5
  }'
text
string | string[]
required
Text or list of texts to chunk.
embedding_model
string
default:"\"minishlab/potion-base-8M\""
Sentence-embedding model for computing similarity. Any model compatible with sentence-transformers works.
threshold
float
default:"0.5"
Cosine-similarity threshold for splitting (0.0–1.0). Lower values produce larger, fewer chunks.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
similarity_window
integer
default:"3"
Number of surrounding sentences to consider when computing similarity.
min_sentences_per_chunk
integer
default:"1"
Minimum sentences per chunk.
min_characters_per_sentence
integer
default:"12"
Minimum characters per sentence.
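The threshold's role can be sketched as follows: given pairwise similarities between adjacent sentences, every dip below the threshold opens a new chunk. This toy function (assumed names, no embedding model) shows only the boundary logic:

```python
def split_on_similarity(similarities, threshold=0.5):
    # similarities[i] compares sentence i with sentence i+1; a value
    # below the threshold starts a new chunk at sentence i+1.
    boundaries = [0]
    for i, sim in enumerate(similarities):
        if sim < threshold:
            boundaries.append(i + 1)
    return boundaries
```

With similarities [0.9, 0.2, 0.8] and threshold 0.5, the single dip between sentences 1 and 2 produces chunks starting at sentences 0 and 2, which is why a lower threshold yields larger, fewer chunks.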

Code Chunker

POST /v1/chunk/code Splits source code at syntactic boundaries using AST parsing. Never breaks inside a function or class. Requires the code extra.
curl -X POST http://localhost:8000/v1/chunk/code \
  -H "Content-Type: application/json" \
  -d '{
    "text": "def hello():\n    print(\"Hello\")\n\ndef world():\n    print(\"World\")",
    "language": "python",
    "chunk_size": 100
  }'
text
string | string[]
required
Source code or list of source code snippets to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
language
string
default:"\"python\""
Programming language. Supported: "python", "javascript", "typescript", "java", "go", "rust", "c", "cpp", and more.
include_nodes
boolean
default:"false"
Include AST node metadata (node type, line numbers) in the chunk output.
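To see what "splitting at syntactic boundaries" buys you, here is an analogous sketch using Python's standard-library ast module (the actual chunker parses with a language-specific AST, so this is only an illustration): each top-level function or class spans a contiguous line range, and chunk boundaries fall between those spans, never inside them.

```python
import ast

def top_level_spans(source):
    # Return (name, start_line, end_line) for each top-level def/class.
    # Chunk boundaries are placed between spans, never inside one.
    tree = ast.parse(source)
    return [(node.name, node.lineno, node.end_lineno)
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]
```

Running it on the curl example's source yields one span per function, so neither hello nor world can be cut in half.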

Refineries

Refineries enrich an existing list of chunks. Pass the output of any chunker endpoint directly into a refinery.

Overlap Refinery

POST /v1/refine/overlap Appends or prepends overlapping context from neighbouring chunks. Useful when downstream consumers need continuity across chunk boundaries.
curl -X POST http://localhost:8000/v1/refine/overlap \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "context_size": 0.25,
    "method": "suffix"
  }'
chunks
Chunk[]
required
List of chunk objects from any chunker endpoint. Each must contain text, start_index, end_index, and token_count.
tokenizer
string
default:"character"
Tokenizer to use.
context_size
float | integer
default:"0.25"
Size of the overlap context. A float (0–1) is treated as a fraction of the chunk size; an integer is an absolute token count.
mode
"token" | "recursive"
default:"\"token\""
Strategy used to create the overlap window.
method
"suffix" | "prefix"
default:"\"suffix\""
"suffix" appends context from the previous chunk; "prefix" prepends context from the next chunk.
merge
boolean
default:"true"
Merge the overlap context into the chunk text field.
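How a float context_size resolves to a token count can be sketched as below. The exact rounding is an implementation detail of the refinery; this sketch truncates, which is an assumption:

```python
def resolve_context_size(context_size, chunk_token_count):
    # Floats in (0, 1] are a fraction of the chunk's token count;
    # integers are an absolute token count. Rounding here is assumed.
    if isinstance(context_size, float):
        return max(1, int(context_size * chunk_token_count))
    return context_size
```

So with the default of 0.25 and a 512-token chunk, roughly 128 tokens of neighbouring context are carried over, while an integer 50 always means exactly 50 tokens.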

Embeddings Refinery

POST /v1/refine/embeddings Computes and attaches embeddings to each chunk via Chonkie’s AutoEmbeddings. Each chunk in the response gains an embedding field containing a list of floats. Local models (e.g. minishlab/potion-base-8M) run entirely on-device and require no API key. API-based models require the appropriate environment variable for your provider.
# Local model (no API key required)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "embedding_model": "minishlab/potion-base-8M"
  }'

# OpenAI (requires OPENAI_API_KEY)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3}
    ],
    "embedding_model": "text-embedding-3-small"
  }'

Embeddings Providers

Type | Example Models | Requirement
Local (model2vec) | minishlab/potion-base-8M, minishlab/potion-retrieval-32M | None
OpenAI | text-embedding-3-small, text-embedding-3-large | OPENAI_API_KEY
Cohere | embed-english-v3.0, embed-multilingual-v3.0 | COHERE_API_KEY
Voyage AI | voyage-large-2, voyage-code-2 | VOYAGE_API_KEY
chunks
Chunk[]
required
List of chunk objects to embed.
embedding_model
string
default:"\"minishlab/potion-retrieval-32M\""
Embedding model name. Local model2vec models (e.g. minishlab/potion-base-8M) require no API key. For API-based models, set the appropriate environment variable for your provider.
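The response shape can be mimicked locally: each chunk object is returned unchanged except for a new embedding field holding a list of floats. In this sketch, embed() is a stand-in for the model call:

```python
def attach_embeddings(chunks, embed):
    # Mirror the refinery's output: copy each chunk and add an
    # "embedding" field; embed() stands in for the actual model.
    return [{**chunk, "embedding": embed(chunk["text"])} for chunk in chunks]
```

The input chunks are left untouched, which matches the stateless request/response model of the API.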

Batch Processing

Send a list of strings to process multiple documents in one request:
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["First document.", "Second document.", "Third document."],
    "chunk_size": 512
  }'
The response is a list of lists — one inner list of chunks per input document:
[
  [{"text": "First document.", "start_index": 0, "end_index": 15, "token_count": 3}],
  [{"text": "Second document.", "start_index": 0, "end_index": 16, "token_count": 3}],
  [{"text": "Third document.", "start_index": 0, "end_index": 15, "token_count": 3}]
]
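When feeding a batch response into downstream code, it is often convenient to flatten the list of lists while remembering which document each chunk came from. A small helper for that (doc_index is a name introduced here, not an API field):

```python
def flatten_batches(batches):
    # Flatten a batch response (list of chunk lists) into one list,
    # tagging each chunk with the index of its source document.
    return [{**chunk, "doc_index": doc_index}
            for doc_index, chunks in enumerate(batches)
            for chunk in chunks]
```

Remember that start_index/end_index remain relative to each individual input document, not to the batch as a whole.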

Chaining Chunkers and Refineries

Pipeline example — chunk semantically, then add overlap context:
import requests

BASE = "http://localhost:8000"

# Step 1: chunk
chunks = requests.post(f"{BASE}/v1/chunk/semantic", json={
    "text": "Your long document here...",
    "threshold": 0.5,
}).json()

# Step 2: add overlap
enriched = requests.post(f"{BASE}/v1/refine/overlap", json={
    "chunks": chunks,
    "context_size": 0.2,
}).json()

# Step 3: embed (requires OPENAI_API_KEY)
embedded = requests.post(f"{BASE}/v1/refine/embeddings", json={
    "chunks": enriched,
    "embedding_model": "text-embedding-3-small",
}).json()

Error Handling

Status | Meaning
200 | Success
400 | Invalid request parameters or chunk format
500 | Internal error (missing extras, model loading failure, etc.)
Error responses follow FastAPI’s standard format:
{
  "detail": "SemanticChunker requires the 'semantic' extra. Install it with: pip install 'chonkie[semantic]'"
}
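When calling the API from code, the detail string is the useful part of an error body. A small helper for surfacing it (the function name and exception type are choices made here, not part of the API):

```python
def raise_for_detail(status_code, body):
    # FastAPI error bodies carry a human-readable "detail" string;
    # raise it as the error message on any non-200 response.
    if status_code != 200:
        raise RuntimeError(body.get("detail", f"HTTP {status_code}"))
    return body
```

Pair it with requests as raise_for_detail(resp.status_code, resp.json()) so missing-extra messages like the one above reach your logs verbatim.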

Health & Info

# Health check (used by load balancers and container orchestrators)
curl http://localhost:8000/health
# {"status": "ok"}

# API info
curl http://localhost:8000/
# {"name": "Chonkie OSS API", "version": "...", "docs": "/docs", ...}