Split text into chunks based on a late-bound token count
The LateChunker builds on the RecursiveChunker and uses document-level embeddings to create more semantically rich chunk representations.
Instead of generating embeddings for each chunk independently, the LateChunker first encodes the entire text into a single embedding.
It then splits the text using recursive rules and derives each chunk’s embedding by averaging relevant parts of the
full document embedding. This allows each chunk to carry broader contextual information,
which can improve retrieval performance in RAG systems.
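The averaging step can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: it assumes the full-document embedding is available as one vector per token, and that each chunk is described by a (start, end) token span. The function names `mean_pool` and `late_chunk_embeddings` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk_embeddings(
    token_embeddings: List[Vector],
    chunk_spans: List[Tuple[int, int]],
) -> List[Vector]:
    """Derive one embedding per chunk from document-level token embeddings.

    Assumption: token_embeddings were produced by encoding the WHOLE text at
    once, so every token vector already reflects document-wide context.
    """
    pooled = []
    for start, end in chunk_spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid token span: ({start}, {end})")
        pooled.append(mean_pool(token_embeddings[start:end]))
    return pooled


# Toy document: 4 tokens with 2-dimensional embeddings, split into 2 chunks.
token_embeddings = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
chunk_spans = [(0, 2), (2, 4)]
chunk_vectors = late_chunk_embeddings(token_embeddings, chunk_spans)
```

Because the token vectors come from a single pass over the whole document, each pooled chunk vector reflects context from outside its own span, which is the point of chunking late rather than embedding each chunk in isolation.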
To use the LateChunker via the API, check out the API reference documentation.
The LateChunker requires the sentence-transformers library to be installed, and currently only supports SentenceTransformer models.
You can install it with:
The LateChunker uses RecursiveRules to determine how to chunk the text.
The rules are a list of RecursiveLevel objects, which define the delimiters and whitespace rules for each level of the recursive tree.
Find more information about the rules in the Additional Information section.
The LateChunker returns chunks as LateChunk objects with optimized storage using slots:
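As a rough illustration of what slot-based storage means, a chunk class might look like the sketch below. The class name `LateChunkSketch` and its field names are assumptions made for this example, not the library's confirmed definition.

```python
from typing import List, Optional


class LateChunkSketch:
    """Hypothetical stand-in for a slotted chunk class (field names assumed)."""

    # __slots__ fixes the attribute set and avoids a per-instance __dict__,
    # which reduces memory use when many chunk objects are created.
    __slots__ = ("text", "start_index", "end_index", "token_count", "embedding")

    def __init__(
        self,
        text: str,
        start_index: int,
        end_index: int,
        token_count: int,
        embedding: Optional[List[float]] = None,
    ) -> None:
        self.text = text
        self.start_index = start_index
        self.end_index = end_index
        self.token_count = token_count
        self.embedding = embedding


chunk = LateChunkSketch("Hello world", 0, 11, 2, [0.1, 0.2])
```

The trade-off is that slotted instances cannot take attributes outside the declared set; assigning `chunk.extra = 1` raises `AttributeError`.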
The LateChunker uses the RecursiveRules class to determine the chunking rules.
The rules are a list of RecursiveLevel objects, which define the delimiters and whitespace rules for each level of the recursive tree.
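To make the idea of a recursive tree of levels concrete, here is a simplified sketch. It is not the library's algorithm: the `Level` dataclass is a hypothetical stand-in for RecursiveLevel, size is measured in characters rather than tokens, and the merging of small pieces that a real chunker would perform is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Level:
    """Hypothetical stand-in for one level of the recursive rules."""
    delimiters: Optional[List[str]] = None
    whitespace: bool = False


def split_at_level(text: str, level: Level) -> List[str]:
    """Split text once, using the delimiters or whitespace rule of one level."""
    if level.whitespace:
        return text.split()
    if not level.delimiters:
        return [text]
    pieces = [text]
    for delim in level.delimiters:
        next_pieces = []
        for piece in pieces:
            parts = piece.split(delim)
            for i, part in enumerate(parts):
                if i < len(parts) - 1:
                    part += delim  # keep the delimiter with the preceding piece
                if part:
                    next_pieces.append(part)
        pieces = next_pieces
    return pieces


def recursive_split(text: str, levels: List[Level], max_chars: int, depth: int = 0) -> List[str]:
    """Descend one level at a time until each piece fits or the levels run out."""
    if len(text) <= max_chars or depth >= len(levels):
        return [text]
    chunks = []
    for piece in split_at_level(text, levels[depth]):
        chunks.extend(recursive_split(piece, levels, max_chars, depth + 1))
    return chunks


levels = [Level(delimiters=["\n\n"]), Level(delimiters=["."]), Level(whitespace=True)]
text = "First paragraph. It has two sentences.\n\nSecond one."
chunks = recursive_split(text, levels, max_chars=25)
```

Each level is tried only on pieces that are still too large, so paragraph breaks are preferred over sentence breaks, and sentence breaks over word breaks.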
RecursiveLevel expects the list of custom delimiters to not include whitespace.
If whitespace is required as a delimiter, set the whitespace parameter of the RecursiveLevel class to True.
Note that if whitespace = True, you cannot pass a list of custom delimiters.
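The two constraints above can be expressed as a small validation sketch. `LevelSketch` is a hypothetical class written for illustration, not the library's RecursiveLevel, and it assumes that "whitespace" here means a bare space used as a delimiter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LevelSketch:
    """Hypothetical level definition enforcing the rules described above."""
    delimiters: Optional[List[str]] = None
    whitespace: bool = False

    def __post_init__(self) -> None:
        # Rule 1: whitespace splitting and custom delimiters are mutually exclusive.
        if self.whitespace and self.delimiters:
            raise ValueError("Pass either custom delimiters or whitespace=True, not both.")
        # Rule 2 (assumed reading): a bare space is not allowed as a custom delimiter.
        if self.delimiters and any(d == " " for d in self.delimiters):
            raise ValueError("Use whitespace=True instead of a space delimiter.")


sentence_level = LevelSketch(delimiters=[".", "!", "?"])
word_level = LevelSketch(whitespace=True)
```

Validating at construction time means an invalid combination fails immediately, rather than producing surprising splits later during chunking.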