Using Dynamic Fields in Amazon CloudSearch
Dynamic fields provide a way to index documents without knowing in advance exactly what fields they contain. For example, consider the case where you want to search a set of products. You might not know the names of all of the possible product attributes across all product categories, but you can structure your data so that all text-based attributes are stored in fields that end in _t, and all integer values are stored in fields that end in _i. With dynamic fields, you can map the attribute fields to the appropriate field type without having to configure a field for every possible attribute.
This reduces the amount of configuration that you need to do up front, and eliminates the
need to modify your domain configuration every time a product with a new attribute is added.
You can also use dynamic fields to essentially ignore new fields by mapping them to a field
that is not searchable or returnable.
Configuring Dynamic Fields in Amazon CloudSearch
You designate a field as a dynamic field by specifying a wildcard (*) as the first, last, or only character in the field name. Dynamic field names must either begin or end with a wildcard (*). Multiple wildcards and wildcards embedded within a string are not supported.
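The wildcard placement rule above can be checked locally before you submit a configuration. The following is a minimal sketch; the function name is hypothetical, and it checks only the wildcard rule, not the other naming requirements for index fields.

```python
def is_valid_dynamic_field_name(name: str) -> bool:
    """Check only the wildcard rule for dynamic field names.

    A dynamic field name must contain exactly one wildcard (*), and that
    wildcard must be the first, last, or only character in the name.
    """
    if name.count("*") != 1:
        # No wildcard (a regular field name) or multiple wildcards.
        return False
    # A single wildcard embedded in the middle of the name is rejected.
    return name.startswith("*") or name.endswith("*")


for candidate in ["*", "*_i", "tmp_*", "a*b", "*_*", "rating"]:
    print(candidate, is_valid_dynamic_field_name(candidate))
```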
A dynamic field's name defines a pattern. The wildcard matches zero or more arbitrary characters. Any unrecognized fields that match that pattern are configured with the dynamic field's indexing options. Regular index fields take precedence over dynamic fields. If a document field name matches both a regular index field and a dynamic field pattern, it is mapped to the regular index field. The options you can configure for dynamic fields are the same as for static fields.
For example, if you establish the naming convention that _i is appended to the name of any new int field, you can define a dynamic field with the pattern *_i that sets the field type to int and configures a set of predefined indexing options for new int fields. When you add a field such as review_rating_i, it's configured according to the *_i options and indexed automatically.
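As an illustration, a dynamic field such as *_i could be defined through the configuration API with the AWS SDK for Python (boto3). This is a sketch under assumptions: the helper names are hypothetical, the domain name is a placeholder you supply, and the enabled options are examples rather than recommended settings.

```python
def build_int_dynamic_field(pattern: str = "*_i") -> dict:
    """Build an IndexField definition for an int dynamic field.

    The option values below are illustrative; choose the ones your
    application needs.
    """
    return {
        "IndexFieldName": pattern,
        "IndexFieldType": "int",
        "IntOptions": {
            "SearchEnabled": True,
            "FacetEnabled": True,
            "ReturnEnabled": True,
            "SortEnabled": True,
        },
    }


def define_field(domain_name: str, index_field: dict) -> dict:
    """Send the definition to the CloudSearch configuration service.

    Requires boto3 and AWS credentials; imported here so that the builder
    above can be used without the SDK installed.
    """
    import boto3

    client = boto3.client("cloudsearch")
    return client.define_index_field(
        DomainName=domain_name, IndexField=index_field
    )


print(build_int_dynamic_field())
```

Calling define_field("my-domain", build_int_dynamic_field()) would submit the definition, where my-domain stands in for your own domain name.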
If a document field matches more than one dynamic field pattern, the longest matching pattern is used. If the patterns are the same length, the dynamic field that occurs first when the field names are sorted alphabetically is used.
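The precedence rules can be modeled in a few lines. The sketch below is a local illustration, not CloudSearch code: the function names are hypothetical, and it assumes that "sorted alphabetically" means plain string ordering.

```python
def pattern_matches(pattern: str, name: str) -> bool:
    """Return True if a dynamic field pattern matches a document field name."""
    if pattern == "*":
        return True
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return False


def resolve_field(name, regular_fields, dynamic_patterns):
    """Return the configured field that a document field maps to, or None."""
    # Regular index fields take precedence over dynamic fields.
    if name in regular_fields:
        return name
    matches = [p for p in dynamic_patterns if pattern_matches(p, name)]
    if not matches:
        return None
    # Longest pattern wins; ties go to the pattern that sorts first.
    return min(matches, key=lambda p: (-len(p), p))


patterns = ["*", "*_i", "*ting_i"]
print(resolve_field("review_rating_i", {"title"}, patterns))
print(resolve_field("title", {"title"}, patterns))
print(resolve_field("color_t", {"title"}, patterns))
```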
You can define * as a dynamic field to match any fields that don't map to an explicitly defined field or a longer dynamic field pattern. This is useful if you want to simply ignore unrecognized fields. For more information, see Ignoring Unrecognized Document Fields.
Dynamic fields count toward the total number of fields defined for a domain. A domain can have a maximum of 1,000 fields, which includes dynamic fields. However, the pattern defined by a single dynamic field typically matches multiple document fields, so the total number of fields in your index can exceed 1,000. When using dynamic fields, keep in mind that significantly increasing the number of fields in your index can impact query performance.
Adding new fields to your domain configuration can affect how fields that were generated dynamically are validated during indexing. If the validation fails, indexing will fail. For example, if you define a dynamic field called *_new and upload documents that contain a field called rating_new, the rating_new field is added to your index. If you then explicitly configure a field called rating_new, that new field configuration will be used to validate the contents of your document's rating_new field when you run indexing. If *_new is configured as a text field and you configure rating_new as an int field, validation will fail if the rating_new fields contain non-integer data.
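Before explicitly configuring a field that was previously generated dynamically, it can help to scan the documents you have uploaded for values that would not validate under the new type. The sketch below is a local pre-check only. It assumes documents shaped like entries in a document batch (an id plus a fields map), treats an int as a 64-bit signed integer, and accepts integers written as decimal strings, which may be stricter or looser than the service's own validation. The function names are hypothetical.

```python
import re

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def is_valid_int(value) -> bool:
    """Return True if value looks like a 64-bit signed integer."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        return False
    if isinstance(value, int):
        return INT64_MIN <= value <= INT64_MAX
    if isinstance(value, str) and re.fullmatch(r"[+-]?[0-9]+", value):
        return INT64_MIN <= int(value) <= INT64_MAX
    return False


def find_invalid_documents(documents, field):
    """Return the ids of documents whose `field` would not validate as int."""
    return [
        doc["id"]
        for doc in documents
        if field in doc["fields"] and not is_valid_int(doc["fields"][field])
    ]


docs = [
    {"id": "a", "fields": {"rating_new": "5"}},
    {"id": "b", "fields": {"rating_new": "great"}},
    {"id": "c", "fields": {"title": "no rating"}},
]
print(find_invalid_documents(docs, "rating_new"))
```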
For more information about configuring index fields, see configure indexing options.
Using a Dynamic Field to Ignore Unrecognized Fields in Amazon CloudSearch
Amazon CloudSearch requires that you configure an index field for every field that occurs in the documents you are indexing. In some cases, however, you want to index a particular set of fields and simply ignore everything else. You can use dynamic fields to ignore all unrecognized fields by defining a literal field called * and disabling all indexing options for the field. Any unrecognized fields will inherit those options and be added to your domain; however, the field contents won't be searchable or returnable, so they'll have minimal impact on the size of your index. (They do, however, count toward the total number of fields configured for the domain.) Similarly, you can selectively ignore fields that match a particular pattern.
To ignore unrecognized fields
1. Configure the fields that you want to index, search, or return in the results.
2. Add a dynamic field that matches any other fields that are found in the documents and disables all indexing options for them:
   a. Specify * as the name of the field, with no prefix or suffix string. (You can also specify a more specific pattern to selectively disable fields.)
   b. Set the field type to literal and disable the search, facet, and return options. Note that the maximum size of a literal field is 4096 Unicode code points.
Because longer dynamic field patterns are matched first, you can still use dynamic fields to configure options for fields that you want to use. Any fields that don't map to a regular index field or a longer dynamic field will match the * pattern.
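The catch-all field described in this procedure could be expressed as the following field definition for the configuration API. This is a sketch with a hypothetical helper name; it also disables sorting, on the assumption that you want every indexing option turned off.

```python
def build_ignore_field(pattern: str = "*") -> dict:
    """Build a literal dynamic field with every indexing option disabled.

    Fields that match `pattern` are accepted but are not searchable,
    facetable, returnable, or sortable.
    """
    return {
        "IndexFieldName": pattern,
        "IndexFieldType": "literal",
        "LiteralOptions": {
            "SearchEnabled": False,
            "FacetEnabled": False,
            "ReturnEnabled": False,
            "SortEnabled": False,
        },
    }


# The resulting dict can be passed as the IndexField argument of the
# boto3 cloudsearch client's define_index_field call.
print(build_ignore_field())
```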
When you create a dynamic field with the name *, it means that your index can potentially contain any valid field name. This also means that you can reference any valid field name in your search requests, whether or not it actually exists in your index.
Searching Dynamic Fields in Amazon CloudSearch
You can reference dynamically generated fields by name in your search requests and expressions, just like any other field. For example, to search the dynamically generated field color_t for the color red, you use the structured query q=color_t:'red'.
If you've defined a catch-all dynamic field (*) to map any fields that don't match regular fields or more specific dynamic field patterns, you can specify any valid field name in your search requests, whether or not the field actually exists in your index.
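A search request against a dynamically generated field can be assembled like any other structured query. The sketch below builds the request URL only and does not send it. The endpoint value is a placeholder, the function name is hypothetical, and the /2013-01-01/search path assumes the 2013-01-01 API version.

```python
from urllib.parse import urlencode


def build_search_url(endpoint: str, field: str, term: str) -> str:
    """Build a structured-query search URL for a single field and term."""
    # Escape backslashes and single quotes inside the quoted term.
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    params = {
        "q": f"{field}:'{escaped}'",
        "q.parser": "structured",
    }
    return f"https://{endpoint}/2013-01-01/search?{urlencode(params)}"


# "search-example.us-east-1.cloudsearch.amazonaws.com" is a placeholder.
print(build_search_url(
    "search-example.us-east-1.cloudsearch.amazonaws.com", "color_t", "red"
))
```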
Wildcards are not supported within field names, so you cannot reference the dynamic field itself. For example, specifying q=*_t:'red' would return an error.
The options a dynamically generated field inherits from the dynamic field configuration control how you can use the field in your search requests, for example, whether you can search it, get facets or highlights, use it for sorting, or return it in results. Note that dynamically generated fields must be searched explicitly. Dynamic fields are NOT included in the fields that are searched by default when you use the simple query parser or do not specify a field when searching with the structured query parser.
You can specify dynamic fields as sources for other fields. A field's source attribute supports wildcards, which enables you to specify a pattern that matches a group of dynamic fields. For example, to search all fields generated from the *_t dynamic field, you could create a field called all_t_fields and set its source attribute to *_t. This copies the contents of all fields whose names end in _t to all_t_fields. Note, however, that searching this field will search all fields that match the pattern, not only dynamically generated fields.
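The effect of a wildcard source can be pictured with a small local model. The sketch below is not CloudSearch code and its function names are hypothetical. It shows that every document field matching the source pattern is copied into the target, whether that field was configured explicitly or generated dynamically.

```python
def pattern_matches(pattern: str, name: str) -> bool:
    """Return True if a leading- or trailing-wildcard pattern matches name."""
    if pattern == "*":
        return True
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return pattern == name


def apply_source(fields: dict, source_pattern: str, target: str) -> dict:
    """Return a copy of fields with matching values collected into target."""
    copied = [
        value
        for name, value in fields.items()
        if name != target and pattern_matches(source_pattern, name)
    ]
    result = dict(fields)
    result[target] = copied
    return result


# title_t could be a regular index field; it is still copied because its
# name matches the *_t source pattern.
document = {"title_t": "Cotton shirt", "color_t": "red", "size_i": 3}
print(apply_source(document, "*_t", "all_t_fields"))
```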
For more information about constructing and submitting search requests, see Searching Your Data with Amazon CloudSearch.