Chonkie Documentation

Python

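from chonkie.cloud import SDPMChunker

# Replace {api_key} with your API key from the Chonkie Cloud dashboard.
chunker = SDPMChunker(api_key="{api_key}")

# The response is a list of SemanticChunk objects.
chunks = chunker(text="YOUR_TEXT")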

[
  {
    "text": "<string>",
    "start_index": 123,
    "end_index": 123,
    "token_count": 123,
    "sentences": [
      {
        "text": "<string>",
        "start_index": 123,
        "end_index": 123,
        "token_count": 123,
        "embedding": [
          123
        ]
      }
    ]
  }
]

Authorizations

Authorization (string, header, required): Your API key from the Chonkie Cloud dashboard.
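Because the key is required on every request, it is usually better kept out of source code. The following is a minimal sketch of one way to do that. The helper `load_api_key` and the environment variable name `CHONKIE_API_KEY` are assumptions made for this example, not names confirmed by this page.

```python
import os


def load_api_key(env_var: str = "CHONKIE_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset.

    The default variable name is an assumption chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your Chonkie Cloud API key")
    return key


# Usage (requires the chonkie package and network access):
# from chonkie.cloud import SDPMChunker
# chunker = SDPMChunker(api_key=load_api_key())
```

Failing early with a clear message avoids sending a request that can only be rejected for a missing key.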

Body

multipart/form-data

Response

200 - application/json

Successful response: a list of SemanticChunk objects (SDPM builds on semantic chunking). Each chunk carries its text, start and end indices, and token count, along with its constituent sentences, each of which may include an embedding.
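The response shape described above can be modelled with plain dataclasses when working with the raw JSON. This is a sketch under stated assumptions: the class names `Sentence` and `Chunk`, the function `parse_chunks`, and the sample payload are all illustrative and are not part of the Chonkie client. Only the field names come from the documented schema.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sentence:
    text: str
    start_index: int
    end_index: int
    token_count: int
    embedding: Optional[List[float]] = None  # embeddings are optional


@dataclass
class Chunk:
    text: str
    start_index: int
    end_index: int
    token_count: int
    sentences: List[Sentence] = field(default_factory=list)


def parse_chunks(payload: str) -> List[Chunk]:
    """Parse a JSON array shaped like the documented 200 response."""
    chunks = []
    for item in json.loads(payload):
        sentences = [
            Sentence(
                text=s["text"],
                start_index=s["start_index"],
                end_index=s["end_index"],
                token_count=s["token_count"],
                embedding=s.get("embedding"),  # None when absent
            )
            for s in item.get("sentences", [])
        ]
        chunks.append(
            Chunk(
                text=item["text"],
                start_index=item["start_index"],
                end_index=item["end_index"],
                token_count=item["token_count"],
                sentences=sentences,
            )
        )
    return chunks


# Made-up sample payload for illustration; not real API output.
sample = json.dumps([
    {
        "text": "Hello world. Bye.",
        "start_index": 0,
        "end_index": 17,
        "token_count": 6,
        "sentences": [
            {"text": "Hello world.", "start_index": 0, "end_index": 12,
             "token_count": 3, "embedding": [0.1, 0.2]},
            {"text": " Bye.", "start_index": 12, "end_index": 17,
             "token_count": 3},
        ],
    }
])

chunks = parse_chunks(sample)
total_tokens = sum(c.token_count for c in chunks)
```

Typed records make it easier to catch a missing field early than indexing into nested dictionaries throughout the calling code.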



API Reference

SDPM Chunker
