Authorizations
Your API key from the Chonkie Cloud dashboard.
Body
The file to chunk.
SentenceTransformer model identifier to use for embedding.
Maximum number of tokens per chunk.
Pre-defined recursive rules for splitting. Find all recipes on our Hugging Face Hub.
Language of the text, used with recipes. Must match the language of the recipe.
Minimum number of characters per chunk.
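As a rough illustration of how the body parameters above fit together, the sketch below assembles the non-file form fields and an authorization header. This page does not show the field names, the header scheme, or the endpoint URL, so the names `embedding_model`, `chunk_size`, `recipe`, `lang`, and `min_characters_per_chunk`, along with the `Bearer` scheme, are assumptions made for the example. The values passed in are illustrative, not documented defaults. No request is sent.

```python
from typing import Dict


def build_late_chunk_form(
    embedding_model: str,
    chunk_size: int,
    recipe: str,
    lang: str,
    min_characters_per_chunk: int,
) -> Dict[str, str]:
    """Assemble the non-file form fields as strings.

    All field names here are assumed for illustration; check the
    actual API reference for the real parameter names.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive number of tokens")
    if min_characters_per_chunk < 0:
        raise ValueError("min_characters_per_chunk cannot be negative")
    return {
        "embedding_model": embedding_model,  # SentenceTransformer model identifier
        "chunk_size": str(chunk_size),       # maximum tokens per chunk
        "recipe": recipe,                    # pre-defined recursive splitting rules
        "lang": lang,                        # must match the recipe's language
        "min_characters_per_chunk": str(min_characters_per_chunk),
    }


def build_auth_headers(api_key: str) -> Dict[str, str]:
    """Build an auth header. The Bearer scheme is an assumption."""
    return {"Authorization": f"Bearer {api_key}"}


# Illustrative values only; none of these are documented defaults.
form = build_late_chunk_form(
    embedding_model="all-MiniLM-L6-v2",
    chunk_size=512,
    recipe="default",
    lang="en",
    min_characters_per_chunk=24,
)
headers = build_auth_headers("sk-example")
```

The file itself would be attached separately as a multipart upload alongside these fields.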
Response
Successful Response: A list of LateChunk objects.
A list containing LateChunk objects, detailing segments, sentences, and an optional chunk-level embedding derived from the full document.
The actual text content of the chunk.
The starting character index of the chunk within the original input text.
The ending character index (exclusive) of the chunk within the original input text.
The number of tokens in this specific chunk, according to the tokenizer used.
Embedding vector (list of floats) for the entire chunk, derived from the full document embedding.
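The response fields described above can be modelled client-side as in the minimal sketch below. The key names `text`, `start_index`, `end_index`, `token_count`, and `embedding` are assumptions, since this page lists only the descriptions. The sample payload is made up for illustration and is not real API output. Because the end index is exclusive, slicing the original text with the two indices should reproduce the chunk text, which gives a cheap sanity check.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class LateChunk:
    # Field names are assumed; the descriptions come from the response schema.
    text: str                                # text content of the chunk
    start_index: int                         # start offset in the original text
    end_index: int                           # end offset (exclusive)
    token_count: int                         # tokens in this chunk
    embedding: Optional[List[float]] = None  # chunk-level embedding, if present


def parse_chunks(payload: List[Dict[str, Any]]) -> List[LateChunk]:
    """Convert decoded JSON items into LateChunk objects."""
    return [
        LateChunk(
            text=item["text"],
            start_index=item["start_index"],
            end_index=item["end_index"],
            token_count=item["token_count"],
            embedding=item.get("embedding"),
        )
        for item in payload
    ]


def offsets_consistent(chunk: LateChunk, source: str) -> bool:
    """True if the exclusive-end slice of source equals the chunk text."""
    return source[chunk.start_index:chunk.end_index] == chunk.text


# Made-up sample data; token counts and embeddings are placeholders.
first = "Late chunking embeds first. "
second = "Then it splits."
source = first + second
payload = [
    {"text": first, "start_index": 0, "end_index": len(first),
     "token_count": 6, "embedding": [0.1, 0.2, 0.3]},
    {"text": second, "start_index": len(first), "end_index": len(source),
     "token_count": 4},
]
chunks = parse_chunks(payload)
all_consistent = all(offsets_consistent(c, source) for c in chunks)
```

A check like `offsets_consistent` is useful when mapping chunks back to highlighted spans in the source document, since an off-by-one in the end index would otherwise go unnoticed.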