Chunking
All chunking strategies that operate on ParsedDocuments are implemented here. A chunking strategy defines how a ParsedDocument is split into chunks so that it can be stored in a vector store or passed to an LLM for processing.
BaseChunker
Bases: ABC
Base class for all Chunker implementations.
Source code in supermat/core/chunking/base.py
create_chunks(processed_document)
abstractmethod
Build a list of ChunkDocuments from the given ParsedDocument. This is the public method that is called for any chunking strategy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
processed_document | ParsedDocumentType | The processed document that needs to be split into chunks. | required |
Returns:
Name | Type | Description |
---|---|---|
DocumentChunksType | DocumentChunksType | The chunks built by the given strategy. |
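As an illustration of the interface described above, here is a minimal sketch of a concrete chunking strategy. It does not import supermat: BaseChunker is re-declared locally as a stand-in, and ParsedDocumentType, DocumentChunksType, ChunkDocument (including its fields), and SectionChunker are all hypothetical simplifications, not the library's actual definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, List

# ASSUMPTION: stand-in types. The real ParsedDocumentType and
# DocumentChunksType are defined in supermat and may look quite different.
ParsedDocumentType = List[Dict[str, Any]]


@dataclass
class ChunkDocument:
    """Hypothetical chunk record; the field names are assumptions."""

    document_id: int
    text: str


DocumentChunksType = List[ChunkDocument]


class BaseChunker(ABC):
    """Local stand-in that mirrors the documented interface."""

    @abstractmethod
    def create_chunks(self, processed_document: ParsedDocumentType) -> DocumentChunksType:
        """Build chunks from the given parsed document."""


class SectionChunker(BaseChunker):
    """Hypothetical strategy: emit one chunk per non-empty section."""

    def create_chunks(self, processed_document: ParsedDocumentType) -> DocumentChunksType:
        chunks: DocumentChunksType = []
        for section in processed_document:
            # Skip sections whose text is missing or only whitespace.
            text = str(section.get("text", "")).strip()
            if text:
                chunks.append(ChunkDocument(document_id=len(chunks), text=text))
        return chunks


if __name__ == "__main__":
    parsed = [{"text": "Intro paragraph."}, {"text": "   "}, {"text": "Methods."}]
    for chunk in SectionChunker().create_chunks(parsed):
        print(chunk)
```

Because create_chunks is abstract, instantiating the base class directly raises a TypeError; only subclasses that implement the method can be used as a strategy.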
Source code in supermat/core/chunking/base.py