AWS::Bedrock::DataSource ChunkingConfiguration
Details about how to chunk the documents in the data source. A chunk refers to an excerpt from a data source that is returned when the knowledge base that it belongs to is queried.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "ChunkingStrategy" : String,
  "FixedSizeChunkingConfiguration" : FixedSizeChunkingConfiguration,
  "HierarchicalChunkingConfiguration" : HierarchicalChunkingConfiguration,
  "SemanticChunkingConfiguration" : SemanticChunkingConfiguration
}
YAML
  ChunkingStrategy: String
  FixedSizeChunkingConfiguration:
    FixedSizeChunkingConfiguration
  HierarchicalChunkingConfiguration:
    HierarchicalChunkingConfiguration
  SemanticChunkingConfiguration:
    SemanticChunkingConfiguration
Properties
ChunkingStrategy
A knowledge base can split your source data into chunks. A chunk refers to an excerpt from a data source that is returned when the knowledge base that it belongs to is queried. You have the following options for chunking your data:
- FIXED_SIZE – Amazon Bedrock splits your source data into chunks of the approximate size that you set in the fixedSizeChunkingConfiguration.
- HIERARCHICAL – Amazon Bedrock splits documents into layers of chunks, where the first layer contains large chunks and the second layer contains smaller chunks derived from the first layer.
- SEMANTIC – Amazon Bedrock splits documents into chunks based on groups of similar content derived with natural language processing.
- NONE – Amazon Bedrock treats each file as one chunk. If you choose this option, you may want to pre-process your documents by splitting them into separate files, so that each file corresponds to one chunk.
Required: Yes
Type: String
Allowed values:
FIXED_SIZE | NONE | HIERARCHICAL | SEMANTIC
Update requires: Replacement
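As a minimal sketch of the NONE option described above, the fragment below sets only ChunkingStrategy and supplies no strategy-specific configuration. The enclosing ChunkingConfiguration key is shown for context; its position in the full AWS::Bedrock::DataSource resource is not covered in this section. This page states only that FixedSizeChunkingConfiguration is excluded with NONE, so omitting the other two configurations here is an assumption.

```yaml
# Illustrative fragment: each file is treated as a single chunk.
ChunkingConfiguration:
  ChunkingStrategy: NONE
  # No strategy-specific configuration is supplied with NONE
  # (assumed for the hierarchical and semantic settings as well).
```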
FixedSizeChunkingConfiguration
Configurations for when you choose fixed-size chunking. If you set the chunkingStrategy to NONE, exclude this field.
Required: No
Type: FixedSizeChunkingConfiguration
Update requires: Replacement
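A hedged example of fixed-size chunking follows. The sub-properties MaxTokens and OverlapPercentage belong to the FixedSizeChunkingConfiguration type, which is documented on its own page rather than in this section, so treat the names as something to verify there. The values are illustrative only, not recommendations.

```yaml
# Illustrative fragment: chunks of roughly 300 tokens with 20% overlap.
ChunkingConfiguration:
  ChunkingStrategy: FIXED_SIZE
  FixedSizeChunkingConfiguration:
    MaxTokens: 300          # approximate chunk size (illustrative value)
    OverlapPercentage: 20   # overlap between adjacent chunks (illustrative value)
```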
HierarchicalChunkingConfiguration
Settings for hierarchical document chunking for a data source. Hierarchical chunking splits documents into layers of chunks where the first layer contains large chunks, and the second layer contains smaller chunks derived from the first layer.
Required: No
Type: HierarchicalChunkingConfiguration
Update requires: Replacement
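The two-layer structure described above might be declared roughly as follows. LevelConfigurations and OverlapTokens are sub-properties of the HierarchicalChunkingConfiguration type, which is documented separately, so check the names against that page. The token counts are illustrative assumptions.

```yaml
# Illustrative fragment: large first-layer chunks, smaller second-layer chunks.
ChunkingConfiguration:
  ChunkingStrategy: HIERARCHICAL
  HierarchicalChunkingConfiguration:
    LevelConfigurations:
      - MaxTokens: 1500   # first layer: large chunks (illustrative value)
      - MaxTokens: 300    # second layer: smaller chunks (illustrative value)
    OverlapTokens: 60     # tokens shared between adjacent chunks (illustrative value)
```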
SemanticChunkingConfiguration
Settings for semantic document chunking for a data source. Semantic chunking splits a document into smaller documents based on groups of similar content derived from the text with natural language processing.
Required: No
Type: SemanticChunkingConfiguration
Update requires: Replacement
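A sketch of a semantic chunking declaration is shown below. MaxTokens, BufferSize, and BreakpointPercentileThreshold are sub-properties of the SemanticChunkingConfiguration type, which has its own reference page, so confirm the names and allowed ranges there. The values are illustrative assumptions.

```yaml
# Illustrative fragment: split on shifts in content similarity.
ChunkingConfiguration:
  ChunkingStrategy: SEMANTIC
  SemanticChunkingConfiguration:
    MaxTokens: 300                      # upper bound on chunk size (illustrative value)
    BufferSize: 1                       # surrounding sentences considered together (illustrative value)
    BreakpointPercentileThreshold: 95   # dissimilarity percentile that triggers a split (illustrative value)
```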