The WeaviateHandshake class connects Chonkie’s chunking system to Weaviate, a vector database. It embeds your Chonkie chunks and stores them in Weaviate without leaving the Chonkie SDK.

Installation

Before using the Weaviate handshake, install the required dependencies (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "chonkie[weaviate]"

Basic Usage

Initialization

from chonkie import WeaviateHandshake

# Initialize with default settings (local Weaviate)
handshake = WeaviateHandshake()

# Or connect to a Weaviate server
handshake = WeaviateHandshake(url="http://localhost:8080")

# Or use an existing Weaviate client
import weaviate
client = weaviate.connect_to_local()
handshake = WeaviateHandshake(client=client, collection_name="my_collection")

# For Weaviate Cloud
handshake = WeaviateHandshake(
    url="YOUR_CLOUD_URL",
    api_key="YOUR_API_KEY",
)
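
The connection options above can also be assembled from the environment, so the same script runs against a local instance or Weaviate Cloud without code changes. The following is a minimal sketch, not part of Chonkie: the helper name `handshake_kwargs` and the variable names `WEAVIATE_URL`, `WEAVIATE_API_KEY`, and `WEAVIATE_COLLECTION` are assumptions chosen for illustration. It only builds the keyword arguments; the result is meant to be passed on to `WeaviateHandshake`.

```python
import os
from typing import Any, Dict, Mapping, Optional


def handshake_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for WeaviateHandshake from environment variables.

    The variable names below are hypothetical; pick whatever suits your deployment.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}

    # Only pass url/api_key when they are set, so the handshake's own
    # defaults (a local Weaviate, per the examples above) still apply.
    url = env.get("WEAVIATE_URL")
    if url:
        kwargs["url"] = url
    api_key = env.get("WEAVIATE_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key

    # "random" is the documented default for collection_name.
    kwargs["collection_name"] = env.get("WEAVIATE_COLLECTION", "random")
    return kwargs


# Intended usage (requires chonkie[weaviate] and a reachable Weaviate):
# from chonkie import WeaviateHandshake
# handshake = WeaviateHandshake(**handshake_kwargs())
```

Keeping the API key out of the source file this way avoids committing credentials alongside the code.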

Writing Chunks to Weaviate

from chonkie import WeaviateHandshake, SemanticChunker

# Initialize the handshake
handshake = WeaviateHandshake(collection_name="my_documents")

# Create some chunks
chunker = SemanticChunker()
chunks = chunker.chunk("Chonkie loves to chonk your texts!")

# Write chunks to Weaviate
handshake.write(chunks)
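
When several documents go into the same collection, it can be convenient to chunk them all first and call `write` once. The sketch below shows one way to do that. The function `ingest` is a hypothetical helper, not a Chonkie API; it relies only on the two calls shown above (`chunker.chunk(text)` and `handshake.write(chunks)`) and assumes `chunk` returns a list of chunks.

```python
from typing import Any, Iterable, List


def ingest(documents: Iterable[str], chunker: Any, handshake: Any) -> int:
    """Chunk every document, then write all chunks in a single call.

    `chunker` needs a .chunk(text) method and `handshake` a .write(chunks)
    method, as in the example above. Returns the number of chunks written.
    """
    chunks: List[Any] = []
    for text in documents:
        if not text.strip():
            continue  # skip empty or whitespace-only documents
        chunks.extend(chunker.chunk(text))

    if chunks:  # avoid calling write() with nothing to store
        handshake.write(chunks)
    return len(chunks)


# Intended usage (requires chonkie[weaviate] and a reachable Weaviate):
# from chonkie import SemanticChunker, WeaviateHandshake
# count = ingest(
#     ["First document.", "Second document."],
#     SemanticChunker(),
#     WeaviateHandshake(collection_name="my_documents"),
# )
```

Passing the chunker and handshake in as arguments also makes the helper easy to exercise with stand-in objects, without a running Weaviate instance.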

Parameters

client
Type: Optional[weaviate.Client]. Default: None.
Weaviate client instance. If not provided, a new client will be created based on other parameters.

collection_name
Type: Union[str, Literal['random']]. Default: "random".
Name of the collection to use. If "random", a unique name will be generated.

embedding_model
Type: Union[str, BaseEmbeddings]. Default: "minishlab/potion-retrieval-32M".
Embedding model to use. Can be a model name or a BaseEmbeddings instance.

url
Type: Optional[str]. Default: None.
URL of the Weaviate server. If provided, will connect to this server.

api_key
Type: Optional[str]. Default: None.
API key for Weaviate Cloud authentication.

auth_config
Type: Optional[Dict[str, Any]]. Default: None.
OAuth configuration for authentication (optional).

batch_size
Type: int. Default: 100.
Batch size for batch operations.

batch_dynamic
Type: bool. Default: True.
Whether to use dynamic batching.

batch_timeout_retries
Type: int. Default: 3.
Number of retries for batch timeouts.

additional_headers
Type: Optional[Dict[str, str]]. Default: None.
Additional headers for the Weaviate client.
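
To make the batch-related parameters concrete, the sketch below illustrates the general idea behind a fixed batch size and timeout retries. It is a conceptual illustration only, not the code Chonkie or the Weaviate client actually runs: `batched` and `send_with_retries` are hypothetical names, the use of `TimeoutError` is an assumption, and dynamic batching (where the batch size is adjusted automatically) is not modeled.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def send_with_retries(
    send: Callable[[List[T]], None], batch: List[T], retries: int = 3
) -> int:
    """Call send(batch), retrying after a TimeoutError up to `retries` times.

    Returns the number of attempts used. If the first attempt and all
    retries time out, the last TimeoutError is re-raised.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(batch)
            return attempts
        except TimeoutError:
            if attempts > retries:  # initial attempt plus `retries` retries used up
                raise
```

With the defaults listed above, 250 chunks would be split into batches of 100, 100, and 50, and each batch would be attempted up to four times (one initial attempt plus three retries) under this model.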