Chonkie CLI

Chonkie provides a command-line interface (CLI) for chunking text and running pipelines directly from your terminal.

Installation

The CLI is included with the default chonkie installation:
pip install chonkie

Basic Usage

The CLI provides a single chonkie command with two primary subcommands:
  1. chunk – Quickly chunk text or files.
  2. pipeline – Run full Chonkie pipelines (fetch → chef → chunk → refine → handshake).
To see the available options and usage details, pass the --help flag:
chonkie --help

# Usage: chonkie [OPTIONS] COMMAND [ARGS]...
#
# > 🦛 CHONK your texts with Chonkie
#
# ╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────╮
# │ --install-completion          Install completion for the current shell.                                        │
# │ --show-completion             Show completion for the current shell, to copy it or customize the installation. │
# │ --help                        Show this message and exit.                                                      │
# ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
#
# ╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
# │ chunk      Chunk text using a specified chunker and optionally store it.                                       │
# │ pipeline   Run a processing pipeline on text or files.                                                         │
# ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Chunking Texts or Files

Use the chunk command to quickly chunk text or a single file. Syntax:
chonkie chunk [TEXT_OR_PATH] [OPTIONS]
Options:
  • --chunker: The chunking method to use (default: semantic). Available chunkers include semantic, token, sentence, and recursive.
  • --handshaker: Optional storage backend to export chunks.
Examples:
# Chunk raw text
chonkie chunk "This is a long text that needs chunking..." --chunker token

# Chunk a file
chonkie chunk README.md --chunker sentence

# Chunk and store in a vector DB (e.g., Chroma)
chonkie chunk document.txt --handshaker chroma
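
Because chunk accepts a single text or file argument, chunking several files means invoking it once per file. The loop below is a minimal sketch of that idea. The chunk_each function and the CHONKIE variable are hypothetical helpers introduced here for illustration, not part of Chonkie; the sketch relies only on the chonkie chunk syntax shown above.

```shell
# Hypothetical wrapper, not part of Chonkie: runs `chonkie chunk` once per file.
# CHONKIE names the executable to call (assumption: `chonkie` is on PATH).
CHONKIE="${CHONKIE:-chonkie}"

chunk_each() {
  # Usage: chunk_each CHUNKER FILE [FILE...]
  if [ "$#" -lt 2 ]; then
    echo "usage: chunk_each CHUNKER FILE [FILE...]" >&2
    return 2
  fi
  chunker="$1"
  shift
  for path in "$@"; do
    # Skip anything that is not a regular file instead of passing it on.
    if [ ! -f "$path" ]; then
      echo "skipping (not a file): $path" >&2
      continue
    fi
    "$CHONKIE" chunk "$path" --chunker "$chunker"
  done
}
```

After sourcing the function, a call such as chunk_each sentence README.md document.txt would run one chonkie chunk invocation per file. For whole directories, the pipeline command is likely the better fit.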

Running Pipelines

The pipeline command goes beyond chunk: it can process directories, apply chefs and refiners, and export the results. Syntax:
chonkie pipeline [TEXT_OR_PATH] [OPTIONS]
Core Options:
  • --d: Directory to process (mutually exclusive with text/file argument).
  • --ext: File extensions to include when processing a directory (e.g., .md, .txt). Can be used multiple times.
  • --chunker: Chunking method (default: semantic).
  • --chef: Preprocessor to use (e.g., text, markdown).
  • --refiner: Optional refinement strategy (e.g., overlap).
  • --handshaker: Optional destination storage.
Examples:

1. Process a Directory

Process all markdown and text files in the docs directory:
chonkie pipeline --d docs --ext .md --ext .txt --chunker recursive
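
Since --ext is repeated once per extension, a script that takes a list of extensions has to expand that list into --ext pairs. The following is a small sketch of one way to do that in POSIX shell. The pipeline_dir function and the CHONKIE variable are hypothetical names, not part of Chonkie; only the --d, --ext, and --chunker options described above are assumed.

```shell
# Hypothetical wrapper, not part of Chonkie: builds repeated `--ext` flags.
# CHONKIE names the executable to call (assumption: `chonkie` is on PATH).
CHONKIE="${CHONKIE:-chonkie}"

pipeline_dir() {
  # Usage: pipeline_dir DIR CHUNKER [EXT...]
  if [ "$#" -lt 2 ]; then
    echo "usage: pipeline_dir DIR CHUNKER [EXT...]" >&2
    return 2
  fi
  dir="$1"
  chunker="$2"
  shift 2
  # Rewrite the positional parameters from "EXT EXT ..." into
  # "--ext EXT --ext EXT ...": append a pair, then drop the oldest entry.
  for ext in "$@"; do
    set -- "$@" --ext "$ext"
    shift
  done
  "$CHONKIE" pipeline --d "$dir" "$@" --chunker "$chunker"
}
```

With this sketch, pipeline_dir docs recursive .md .txt expands to the same command as the example above. Keeping the flags in the positional parameters avoids word-splitting problems without relying on Bash arrays.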

2. Process a Single File

Run a pipeline on a single file:
chonkie pipeline README.md --chunker token --chef text

3. Full RAG Pipeline

Run a full RAG pipeline: fetch from a directory → process markdown → chunk recursively → export to ChromaDB.
chonkie pipeline \
  --d ./knowledge_base \
  --ext .md \
  --chef markdown \
  --chunker recursive \
  --handshaker chroma

Tips

  • Use --help on any command to see full options: chonkie pipeline --help.
  • Directory processing recursively walks subdirectories.
  • Output is printed to stdout unless a handshaker is specified.
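
Because chunks go to stdout when no handshaker is given, ordinary shell redirection is enough to keep them in a file. The sketch below wraps that pattern. The chunk_to_file function and the CHONKIE variable are hypothetical names, not part of Chonkie, and the format of the printed output is not specified here, so the file simply receives whatever the command prints.

```shell
# Hypothetical wrapper, not part of Chonkie: saves `chonkie chunk` output.
# CHONKIE names the executable to call (assumption: `chonkie` is on PATH).
CHONKIE="${CHONKIE:-chonkie}"

chunk_to_file() {
  # Usage: chunk_to_file INPUT OUTPUT_FILE [extra `chonkie chunk` options...]
  if [ "$#" -lt 2 ]; then
    echo "usage: chunk_to_file INPUT OUTPUT_FILE [OPTIONS...]" >&2
    return 2
  fi
  input="$1"
  output="$2"
  shift 2
  # Remaining arguments are passed through unchanged; stdout goes to the file.
  "$CHONKIE" chunk "$input" "$@" > "$output"
}
```

For example, chunk_to_file README.md chunks.txt --chunker sentence would write the printed chunks to chunks.txt (an illustrative file name) instead of the terminal.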