Start the server and visit http://localhost:8000/docs for an interactive Swagger UI where you can try every endpoint directly in your browser.

Response Format

All chunking endpoints return a list of chunk objects:
[
  {
    "text": "chunk content",
    "start_index": 0,
    "end_index": 42,
    "token_count": 8
  }
]
Submit a list of strings instead of a single string to get back a list of lists — one inner list per input document.
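Assuming start_index and end_index are character offsets into the input (as the field names suggest), a chunk's text can be recovered by slicing the original string. A minimal check of that invariant:

```python
def validate_chunk(original_text, chunk):
    # start_index/end_index are treated here as character offsets into
    # the input, so slicing the original should reproduce the chunk text.
    return original_text[chunk["start_index"]:chunk["end_index"]] == chunk["text"]
```

This is a sketch of the assumed offset semantics, not part of the API itself.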

Chunkers

Token Chunker

POST /v1/chunk/token Splits text into fixed-size token windows. The fastest and most predictable chunker.
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here...",
    "chunk_size": 512,
    "chunk_overlap": 50
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use. Options: "character", "gpt2", "cl100k_base", or any HuggingFace tokenizer name.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
chunk_overlap
integer
default:"0"
Token overlap between consecutive chunks.
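The interaction between chunk_size and chunk_overlap can be sketched in plain Python: consecutive windows share chunk_overlap tokens, so each window's start advances by chunk_size minus chunk_overlap. This is an illustration of the windowing idea, not the library's implementation:

```python
def token_windows(tokens, chunk_size=512, chunk_overlap=0):
    # Each window holds up to chunk_size tokens; consecutive windows
    # share chunk_overlap tokens, so starts advance by the stride.
    stride = chunk_size - chunk_overlap
    if stride <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), stride)]
```

For example, 10 tokens with chunk_size=4 and chunk_overlap=1 yield windows starting at positions 0, 3, 6, and 9, each repeating the last token of its predecessor.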

Sentence Chunker

POST /v1/chunk/sentence Groups sentences into chunks while respecting a token-size limit. Preserves sentence boundaries — no mid-sentence splits.
curl -X POST http://localhost:8000/v1/chunk/sentence \
  -H "Content-Type: application/json" \
  -d '{
    "text": "First sentence. Second sentence. Third sentence.",
    "chunk_size": 256,
    "min_sentences_per_chunk": 2
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
chunk_overlap
integer
default:"0"
Token overlap between chunks.
min_sentences_per_chunk
integer
default:"1"
Minimum sentences to include in each chunk.
min_characters_per_sentence
integer
default:"12"
Minimum characters required to count as a sentence.
approximate
boolean
default:"false"
Use approximate token counting for faster processing.
delim
string | string[]
default:"[\"\\n\", \". \", \"! \", \"? \"]"
Sentence delimiter(s).
include_delim
"prev" | "next"
default:"\"prev\""
Attach the delimiter to the previous ("prev") or next ("next") sentence.
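The grouping behaviour behind chunk_size and min_sentences_per_chunk can be sketched as a simple packing loop. This is a simplified illustration (no overlap, pre-tokenized sentence counts), not the chunker's actual code:

```python
def pack_sentences(sentences, token_counts, chunk_size=512, min_sentences=1):
    # Greedily pack whole sentences into chunks; start a new chunk only
    # when adding the next sentence would exceed chunk_size AND the
    # current chunk already holds at least min_sentences sentences.
    chunks, current, current_tokens = [], [], 0
    for sentence, n_tokens in zip(sentences, token_counts):
        if current and current_tokens + n_tokens > chunk_size and len(current) >= min_sentences:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Note how min_sentences can force a chunk past chunk_size: sentence boundaries are never violated, so a chunk grows until it satisfies both constraints.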

Recursive Chunker

POST /v1/chunk/recursive Splits text using a hierarchy of separators defined by a named recipe. Great for structured text like Markdown or code. Chunker instances are cached per (recipe, lang, tokenizer) for speed.
curl -X POST http://localhost:8000/v1/chunk/recursive \
  -H "Content-Type: application/json" \
  -d '{
    "text": "# Heading\n\nParagraph one.\n\nParagraph two.",
    "chunk_size": 256,
    "recipe": "markdown"
  }'
text
string | string[]
required
Text or list of texts to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
recipe
string
default:"\"default\""
Named splitting recipe. Options: "default" (paragraph → sentence → word), "markdown", "python", "js".
lang
string
default:"\"en\""
Language hint for the recipe.
min_characters_per_chunk
integer
default:"24"
Minimum characters to include in a chunk.
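The "hierarchy of separators" idea can be sketched recursively: try the coarsest separator first, and only descend to finer ones for pieces that are still too large. This character-based sketch ignores merging and min_characters_per_chunk, so it is an illustration of the strategy rather than the chunker itself:

```python
def recursive_split(text, separators, chunk_size):
    # If the text fits (or we have run out of separators), keep it whole;
    # otherwise split on the coarsest separator and recurse on each piece
    # with the remaining, finer-grained separators.
    if len(text) <= chunk_size or not separators:
        return [text]
    head, rest = separators[0], separators[1:]
    pieces = [p for p in text.split(head) if p]
    out = []
    for piece in pieces:
        out.extend(recursive_split(piece, rest, chunk_size))
    return out
```

A recipe like "default" corresponds to a separator list such as ["\n\n", ". ", " "] (paragraph, then sentence, then word).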

Semantic Chunker

POST /v1/chunk/semantic Splits where semantic similarity between adjacent sentences drops below a threshold. Produces topically coherent chunks. Requires the semantic extra.
curl -X POST http://localhost:8000/v1/chunk/semantic \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Dogs are loyal and friendly pets. Cats are independent animals. Quantum physics studies subatomic particles.",
    "embedding_model": "minishlab/potion-base-8M",
    "threshold": 0.5
  }'
text
string | string[]
required
Text or list of texts to chunk.
embedding_model
string
default:"\"minishlab/potion-base-8M\""
Sentence-embedding model for computing similarity. Any model compatible with sentence-transformers works.
threshold
float
default:"0.5"
Cosine-similarity threshold for splitting (0.0–1.0). Lower values produce larger, fewer chunks.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
similarity_window
integer
default:"3"
Number of surrounding sentences to consider when computing similarity.
min_sentences_per_chunk
integer
default:"1"
Minimum sentences per chunk.
min_characters_per_sentence
integer
default:"12"
Minimum characters per sentence.
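The threshold's role can be sketched as follows: given pairwise similarities between adjacent sentences, every dip below the threshold opens a new chunk. This toy function (assumed names, no embedding model) shows only the boundary logic:

```python
def split_on_similarity(similarities, threshold=0.5):
    # similarities[i] compares sentence i with sentence i+1; a value
    # below the threshold starts a new chunk at sentence i+1.
    boundaries = [0]
    for i, sim in enumerate(similarities):
        if sim < threshold:
            boundaries.append(i + 1)
    return boundaries
```

With similarities [0.9, 0.2, 0.8] and threshold 0.5, the single dip between sentences 1 and 2 produces chunks starting at sentences 0 and 2, which is why a lower threshold yields larger, fewer chunks.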

Code Chunker

POST /v1/chunk/code Splits source code at syntactic boundaries using AST parsing. Never breaks inside a function or class. Requires the code extra.
curl -X POST http://localhost:8000/v1/chunk/code \
  -H "Content-Type: application/json" \
  -d '{
    "text": "def hello():\n    print(\"Hello\")\n\ndef world():\n    print(\"World\")",
    "language": "python",
    "chunk_size": 100
  }'
text
string | string[]
required
Source code or list of source code snippets to chunk.
tokenizer
string
default:"character"
Tokenizer to use.
chunk_size
integer
default:"512"
Maximum tokens per chunk.
language
string
default:"\"python\""
Programming language. Supported: "python", "javascript", "typescript", "java", "go", "rust", "c", "cpp", and more.
include_nodes
boolean
default:"false"
Include AST node metadata (node type, line numbers) in the chunk output.
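To see what "splitting at syntactic boundaries" buys you, here is an analogous sketch using Python's standard-library ast module (the actual chunker parses with a language-specific AST, so this is only an illustration): each top-level function or class spans a contiguous line range, and chunk boundaries fall between those spans, never inside them.

```python
import ast

def top_level_spans(source):
    # Return (name, start_line, end_line) for each top-level def/class.
    # Chunk boundaries are placed between spans, never inside one.
    tree = ast.parse(source)
    return [(node.name, node.lineno, node.end_lineno)
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]
```

Running it on the curl example's source yields one span per function, so neither hello nor world can be cut in half.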

Refineries

Refineries enrich an existing list of chunks. Pass the output of any chunker endpoint directly into a refinery.

Overlap Refinery

POST /v1/refine/overlap Appends or prepends overlapping context from neighbouring chunks. Useful when downstream consumers need continuity across chunk boundaries.
curl -X POST http://localhost:8000/v1/refine/overlap \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "context_size": 0.25,
    "method": "suffix"
  }'
chunks
Chunk[]
required
List of chunk objects from any chunker endpoint. Each must contain text, start_index, end_index, and token_count.
tokenizer
string
default:"character"
Tokenizer to use.
context_size
float | integer
default:"0.25"
Size of the overlap context. A float (0–1) is treated as a fraction of the chunk size; an integer is an absolute token count.
mode
"token" | "recursive"
default:"\"token\""
Strategy used to create the overlap window.
method
"suffix" | "prefix"
default:"\"suffix\""
"suffix" appends context from the previous chunk; "prefix" prepends context from the next chunk.
merge
boolean
default:"true"
Merge the overlap context into the chunk text field.
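How a float context_size resolves to a token count can be sketched as below. The exact rounding is an implementation detail of the refinery; this sketch truncates, which is an assumption:

```python
def resolve_context_size(context_size, chunk_token_count):
    # Floats in (0, 1] are a fraction of the chunk's token count;
    # integers are an absolute token count. Rounding here is assumed.
    if isinstance(context_size, float):
        return max(1, int(context_size * chunk_token_count))
    return context_size
```

So with the default of 0.25 and a 512-token chunk, roughly 128 tokens of neighbouring context are carried over, while an integer 50 always means exactly 50 tokens.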

Embeddings Refinery

POST /v1/refine/embeddings Computes and attaches embeddings to each chunk via Chonkie’s AutoEmbeddings. Each chunk in the response gains an embedding field containing a list of floats. Local models (e.g. minishlab/potion-base-8M) run entirely on-device and require no API key. API-based models require the appropriate environment variable for your provider.
# Local model (no API key required)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3},
      {"text": "Second chunk.", "start_index": 13, "end_index": 26, "token_count": 3}
    ],
    "embedding_model": "minishlab/potion-base-8M"
  }'

# OpenAI (requires OPENAI_API_KEY)
curl -X POST http://localhost:8000/v1/refine/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {"text": "First chunk.", "start_index": 0, "end_index": 12, "token_count": 3}
    ],
    "embedding_model": "text-embedding-3-small"
  }'

Embeddings Providers

Type | Example Models | Requirement
Local (model2vec) | minishlab/potion-base-8M, minishlab/potion-retrieval-32M | None
OpenAI | text-embedding-3-small, text-embedding-3-large | OPENAI_API_KEY
Cohere | embed-english-v3.0, embed-multilingual-v3.0 | COHERE_API_KEY
Voyage AI | voyage-large-2, voyage-code-2 | VOYAGE_API_KEY
chunks
Chunk[]
required
List of chunk objects to embed.
embedding_model
string
default:"\"minishlab/potion-retrieval-32M\""
Embedding model name. Local model2vec models (e.g. minishlab/potion-base-8M) require no API key. For API-based models, set the appropriate environment variable for your provider.
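The response shape can be mimicked locally: each chunk object is returned unchanged except for a new embedding field holding a list of floats. In this sketch, embed() is a stand-in for the model call:

```python
def attach_embeddings(chunks, embed):
    # Mirror the refinery's output: copy each chunk and add an
    # "embedding" field; embed() stands in for the actual model.
    return [{**chunk, "embedding": embed(chunk["text"])} for chunk in chunks]
```

The input chunks are left untouched, which matches the stateless request/response model of the API.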

Batch Processing

Send a list of strings to process multiple documents in one request:
curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": ["First document.", "Second document.", "Third document."],
    "chunk_size": 512
  }'
The response is a list of lists — one inner list of chunks per input document:
[
  [{"text": "First document.", "start_index": 0, "end_index": 15, "token_count": 3}],
  [{"text": "Second document.", "start_index": 0, "end_index": 16, "token_count": 3}],
  [{"text": "Third document.", "start_index": 0, "end_index": 15, "token_count": 3}]
]
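When feeding a batch response into downstream code, it is often convenient to flatten the list of lists while remembering which document each chunk came from. A small helper for that (doc_index is a name introduced here, not an API field):

```python
def flatten_batches(batches):
    # Flatten a batch response (list of chunk lists) into one list,
    # tagging each chunk with the index of its source document.
    return [{**chunk, "doc_index": doc_index}
            for doc_index, chunks in enumerate(batches)
            for chunk in chunks]
```

Remember that start_index/end_index remain relative to each individual input document, not to the batch as a whole.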

Chaining Chunkers and Refineries

Pipeline example — chunk semantically, then add overlap context:
import requests

BASE = "http://localhost:8000"

# Step 1: chunk
chunks = requests.post(f"{BASE}/v1/chunk/semantic", json={
    "text": "Your long document here...",
    "threshold": 0.5,
}).json()

# Step 2: add overlap
enriched = requests.post(f"{BASE}/v1/refine/overlap", json={
    "chunks": chunks,
    "context_size": 0.2,
}).json()

# Step 3: embed (requires OPENAI_API_KEY)
embedded = requests.post(f"{BASE}/v1/refine/embeddings", json={
    "chunks": enriched,
    "embedding_model": "text-embedding-3-small",
}).json()

Error Handling

Status | Meaning
200 | Success
400 | Invalid request parameters or chunk format
500 | Internal error (missing extras, model loading failure, etc.)
Error responses follow FastAPI’s standard format:
{
  "detail": "SemanticChunker requires the 'semantic' extra. Install it with: pip install 'chonkie[semantic]'"
}
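When calling the API from code, the detail string is the useful part of an error body. A small helper for surfacing it (the function name and exception type are choices made here, not part of the API):

```python
def raise_for_detail(status_code, body):
    # FastAPI error bodies carry a human-readable "detail" string;
    # raise it as the error message on any non-200 response.
    if status_code != 200:
        raise RuntimeError(body.get("detail", f"HTTP {status_code}"))
    return body
```

Pair it with requests as raise_for_detail(resp.status_code, resp.json()) so missing-extra messages like the one above reach your logs verbatim.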

Health & Info

# Health check (used by load balancers and container orchestrators)
curl http://localhost:8000/health
# {"status": "ok"}

# API info
curl http://localhost:8000/
# {"name": "Chonkie OSS API", "version": "...", "docs": "/docs", ...}