
1. Install

pip install "chonkie[api,semantic,code,openai]"
The api extra includes FastAPI and uvicorn. Add semantic for the semantic chunker and code for the code chunker. The embeddings refinery works out of the box with local models (e.g. minishlab/potion-base-8M); add the relevant extra (e.g. openai) only if you plan to use API-based embedding providers.

2. Start the Server

chonkie serve
# 🦛 Starting Chonkie API server on http://0.0.0.0:8000
# 📚 API docs available at http://0.0.0.0:8000/docs
# 🔍 Log level: info
#
# Press CTRL+C to stop the server
Visit http://localhost:8000/docs for the interactive Swagger UI, or http://localhost:8000/redoc for ReDoc.

3. Make Your First Request

curl -X POST http://localhost:8000/v1/chunk/token \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Chonkie makes chunking easy. It splits text into manageable pieces for RAG pipelines.",
    "chunk_size": 20
  }'
Response:
[
  {
    "text": "Chonkie makes chunking easy.",
    "start_index": 0,
    "end_index": 28,
    "token_count": 5
  },
  {
    "text": "It splits text into manageable pieces for RAG pipelines.",
    "start_index": 29,
    "end_index": 85,
    "token_count": 10
  }
]
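If you'd rather call the endpoint from Python, here is a minimal sketch using only the standard library. It mirrors the curl request above: the `/v1/chunk/token` path, the `text` and `chunk_size` fields, and the default base URL are taken from that example; adjust `base_url` if your server runs elsewhere.

```python
import json
import urllib.request


def chunk_via_api(text, chunk_size=20, base_url="http://localhost:8000"):
    """POST text to the token chunking endpoint and return the list of chunk dicts.

    Each returned dict has the shape shown in the sample response above:
    text, start_index, end_index, and token_count.
    """
    payload = json.dumps({"text": text, "chunk_size": chunk_size}).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chunk/token",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

From here you can feed the `text` of each chunk straight into an embedding or retrieval step.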

Or Use Docker

docker compose up
The server starts on port 8000. See the Docker guide for the full docker-compose.yml and production setup.

Server Options

| Flag | Default | Description |
|------|---------|-------------|
| `--host` | `0.0.0.0` | Bind address |
| `--port` | `8000` | Port number |
| `--reload` | `false` | Auto-reload on file changes (development only) |
| `--log-level` | `info` | Log verbosity: `debug`, `info`, `warning`, `error` |
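The flags above can be combined. As an illustrative (not required) development setup, assuming the flags behave as described in the table:

```shell
# Bind to localhost only, use a custom port, and enable
# auto-reload with verbose logging for development
chonkie serve --host 127.0.0.1 --port 9000 --reload --log-level debug
```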

Next Steps