FT.CREATE
Creates an index and initiates a backfill of that index. For details on index construction, see Vector search overview.
Syntax
FT.CREATE <index-name>
ON HASH | JSON
[PREFIX <count> <prefix1> [<prefix2>...]]
SCHEMA
(<field-identifier> [AS <alias>]
  NUMERIC
| TAG [SEPARATOR <sep>] [CASESENSITIVE]
| TEXT
| VECTOR [HNSW|FLAT] <attr_count> [<attribute_name> <attribute_value>]
)+
Schema
Field identifier:
For hash keys, the field identifier is a field name.
For JSON keys, the field identifier is a JSON path.
For more information, see Index field types.
Field types:
TAG: For more information, see Tags.
NUMERIC: Field contains a number.
TEXT: Field contains any blob of data.
VECTOR: Vector field that supports vector search.
Algorithm – can be HNSW (Hierarchical Navigable Small World) or FLAT (brute force).
attr_count – number of attributes passed as algorithm configuration. This count includes both names and values.
{attribute_name} {attribute_value} – algorithm-specific key/value pairs that define the index configuration.
For the FLAT algorithm, attributes are:
Required:
DIM – Number of dimensions in the vector.
DISTANCE_METRIC – Can be one of [L2 | IP | COSINE].
TYPE – Vector type. The only supported type is FLOAT32.
Optional:
INITIAL_CAP – Initial vector capacity in the index, which affects the memory allocation size of the index.
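Because attr_count counts both names and values, it is easy to get wrong when a command is assembled in code. The following is a minimal Python sketch, not part of the command reference, that derives attr_count from a dictionary of attributes. The helper name vector_clause is hypothetical, and the resulting argument list would still need to be sent through whichever client library you use.

```python
def vector_clause(algorithm, attrs):
    """Build the VECTOR part of an FT.CREATE schema as a list of arguments.

    `vector_clause` is a hypothetical helper used only for illustration.
    """
    if algorithm not in ("HNSW", "FLAT"):
        raise ValueError("algorithm must be HNSW or FLAT")
    pairs = []
    for name, value in attrs.items():
        pairs.extend([name, str(value)])
    # attr_count covers names AND values, so it equals the flattened length.
    return ["VECTOR", algorithm, str(len(pairs))] + pairs


flat_args = vector_clause(
    "FLAT", {"DIM": 6, "TYPE": "FLOAT32", "DISTANCE_METRIC": "L2"}
)
print(" ".join(flat_args))
```

With the three required attributes, the flattened list has six entries, which matches the attr_count of 6 used in the examples in this topic.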
For the HNSW algorithm, attributes are:
Required:
TYPE – Vector type. The only supported type is FLOAT32.
DIM – Vector dimension, specified as a positive integer. Maximum: 32768.
DISTANCE_METRIC – Can be one of [L2 | IP | COSINE].
Optional:
INITIAL_CAP – Initial vector capacity in the index affecting memory allocation size of the index. Defaults to 1024.
M – Maximum number of outgoing edges allowed for each node in each layer of the graph. On layer zero, the maximum number of outgoing edges is 2M. Default is 16. Maximum is 512.
EF_CONSTRUCTION – Controls the number of vectors examined during index construction. Higher values improve recall at the expense of longer index creation times. Default value is 200. Maximum value is 4096.
EF_RUNTIME – Controls the number of vectors examined during query operations. Higher values can improve recall at the expense of longer query times. The value of this parameter can be overridden on a per-query basis. Default value is 10. Maximum value is 4096.
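The limits and defaults listed above can be checked on the client side before a command is sent. The following Python sketch is one possible way to do that. The helper name check_hnsw_attrs is hypothetical, and the lower bound of 1 for M, EF_CONSTRUCTION, and EF_RUNTIME is an assumption (this topic states only defaults and maximums). The server remains the authority on what it accepts.

```python
# Defaults and maximums copied from the HNSW attribute list in this topic.
HNSW_DEFAULTS = {"INITIAL_CAP": 1024, "M": 16, "EF_CONSTRUCTION": 200, "EF_RUNTIME": 10}
HNSW_MAXIMUMS = {"DIM": 32768, "M": 512, "EF_CONSTRUCTION": 4096, "EF_RUNTIME": 4096}
HNSW_REQUIRED = ("TYPE", "DIM", "DISTANCE_METRIC")


def check_hnsw_attrs(attrs):
    """Return a list of problems found in a dict of HNSW attributes.

    Hypothetical helper; an empty list means no problem was detected.
    """
    problems = []
    for name in HNSW_REQUIRED:
        if name not in attrs:
            problems.append(f"missing required attribute {name}")
    if "TYPE" in attrs and attrs["TYPE"] != "FLOAT32":
        problems.append("TYPE must be FLOAT32")
    if "DISTANCE_METRIC" in attrs and attrs["DISTANCE_METRIC"] not in ("L2", "IP", "COSINE"):
        problems.append("DISTANCE_METRIC must be L2, IP, or COSINE")
    for name, maximum in HNSW_MAXIMUMS.items():
        if name in attrs:
            value = attrs[name]
            # Assumption: the smallest accepted value is 1.
            if not isinstance(value, int) or value < 1 or value > maximum:
                problems.append(f"{name} must be an integer between 1 and {maximum}")
    return problems


requested = {"TYPE": "FLOAT32", "DIM": 2, "DISTANCE_METRIC": "L2", "M": 1024}
print(check_hnsw_attrs(requested))

# Values the index would use if the optional attributes were omitted.
effective = {**HNSW_DEFAULTS, "TYPE": "FLOAT32", "DIM": 2, "DISTANCE_METRIC": "L2"}
print(effective["EF_RUNTIME"])
```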
Return
Returns a simple string OK message or an error reply.
Examples
Note
The following examples use arguments native to valkey-cli.
Example 1: Create some indexes
Create a 2-dimensional hash index using the HNSW algorithm:
FT.CREATE hash_idx1 ON HASH PREFIX 1 hash: SCHEMA vec AS VEC VECTOR HNSW 6 DIM 2 TYPE FLOAT32 DISTANCE_METRIC L2
OK
Create a 6-dimensional JSON index using the HNSW algorithm:
FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR HNSW 6 DIM 6 TYPE FLOAT32 DISTANCE_METRIC L2
OK
Example 2: Populate some data
The following commands are formatted so they can be executed as arguments to the redis-cli terminal program. Developers using programming-language clients (such as Python, Ruby, or C#) will need to follow their environment's handling rules for strings and binary data.
Create some hash and JSON data:
HSET hash:0 vec "\x00\x00\x00\x00\x00\x00\x00\x00"
HSET hash:1 vec "\x00\x00\x00\x00\x00\x00\x80\xbf"
JSON.SET json:0 . '{"vec":[1,2,3,4,5,6]}'
JSON.SET json:1 . '{"vec":[10,20,30,40,50,60]}'
JSON.SET json:2 . '{"vec":[1.1,1.2,1.3,1.4,1.5,1.6]}'
Note the following:
The keys of the hash and JSON data have the prefixes of their index definitions.
The vectors are at the appropriate paths of the index definitions.
The hash vectors are entered as hex-escaped binary data, while the JSON vectors are entered as arrays of numbers.
The vectors are the appropriate lengths: the two-dimensional hash vector entries have two floats' worth of hex data (8 bytes), and the six-dimensional JSON vector entries have six numbers.
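For readers using a programming-language client, the hex strings in the HSET commands can be produced by packing the numbers as FLOAT32 values. The following Python sketch uses only the standard struct module. It assumes little-endian byte order, which this topic does not state but which is consistent with the example bytes: under that assumption the value for hash:1 decodes to [0.0, -1.0]. The helper names are hypothetical.

```python
import struct


def pack_float32(values):
    """Pack numbers as consecutive little-endian FLOAT32 values (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_float32(blob):
    """Inverse of pack_float32: decode a byte string into a list of floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


hash0 = pack_float32([0.0, 0.0])   # bytes used for hash:0 in the example
hash1 = pack_float32([0.0, -1.0])  # bytes used for hash:1 in the example

print(hash0)
print(hash1)
print(unpack_float32(hash1))
```

The resulting bytes objects would be passed as the field value to the client's HSET call. How binary arguments are passed depends on the client library.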
Example 3: Delete and re-create an index
FT.DROPINDEX json_idx1
OK
FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR FLAT 6 DIM 6 TYPE FLOAT32 DISTANCE_METRIC L2
OK
Note that the new JSON index uses the FLAT algorithm instead of the HNSW algorithm. Also note that it re-indexes the existing JSON data:
FT.SEARCH json_idx1 "*=>[KNN 100 @VEC $query_vec]" PARAMS 2 query_vec "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" DIALECT 2
1) (integer) 3
2) "json:2"
3) 1) "__VEC_score"
   2) "11.11"
   3) "$"
   4) "[{\"vec\":[1.1, 1.2, 1.3, 1.4, 1.5, 1.6]}]"
4) "json:0"
5) 1) "__VEC_score"
   2) "91"
   3) "$"
   4) "[{\"vec\":[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}]"
6) "json:1"
7) 1) "__VEC_score"
   2) "9100"
   3) "$"
   4) "[{\"vec\":[10.0, 20.0, 30.0, 40.0, 50.0, 60.0]}]"
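The __VEC_score values in this reply can be reproduced by hand. The query vector is 24 zero bytes, that is, six FLOAT32 zeros, and the scores match the squared Euclidean distance from that zero vector. That the L2 metric reports the squared distance is an inference from these numbers, not something this topic states. A minimal Python sketch of the arithmetic:

```python
def squared_l2(a, b):
    """Sum of squared component differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [0.0] * 6  # the 24 zero bytes passed as query_vec
docs = {
    "json:0": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "json:1": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "json:2": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6],
}

scores = {key: squared_l2(vec, query) for key, vec in docs.items()}
ranked = sorted(scores, key=scores.get)  # nearest first, as in the reply

for key in ranked:
    print(key, round(scores[key], 2))
```

The ordering (json:2, json:0, json:1) and the rounded values agree with the reply shown above. The server computes on FLOAT32 values, so small floating-point differences from this double-precision calculation are possible.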