Amazon Elasticsearch Service
Developer Guide (API Version 2015-01-01)

Searching Data in Amazon Elasticsearch Service

As you might expect from a search engine, Elasticsearch offers numerous options for searching your data. This chapter introduces a few common ways of performing searches with Amazon ES. You can use Postman to test the various requests. For code samples that send signed HTTP requests to Amazon ES, see Signing HTTP Requests to Amazon Elasticsearch Service.

Note

All example requests in this chapter work with the Elasticsearch 6.x APIs. Some requests might not work with older Elasticsearch versions.

URI Searches

Universal Resource Identifier (URI) searches are the simplest form of search. In a URI search, you specify the query as an HTTP request parameter:

GET https://search-my-domain.us-west-1.es.amazonaws.com/_search?q=house

A sample response might look like the following:

{ "took": 25, "timed_out": false, "_shards": { "total": 10, "successful": 10, "skipped": 0, "failed": 0 }, "hits": { "total": 85, "max_score": 6.6137657, "hits": [ { "_index": "movies", "_type": "movie", "_id": "tt0077975", "_score": 6.6137657, "_source": { "directors": [ "John Landis" ], "release_date": "1978-07-27T00:00:00Z", "rating": 7.5, "genres": [ "Comedy", "Romance" ], "image_url": "http://ia.media-imdb.com/images/M/MV5BMTY2OTQxNTc1OF5BMl5BanBnXkFtZTYwNjA3NjI5._V1_SX400_.jpg", "plot": "At a 1962 College, Dean Vernon Wormer is determined to expel the entire Delta Tau Chi Fraternity, but those troublemakers have other plans for him.", "title": "Animal House", "rank": 527, "running_time_secs": 6540, "actors": [ "John Belushi", "Karen Allen", "Tom Hulce" ], "year": 1978, "id": "tt0077975" } }, ... ] } }

By default, this query searches all fields of all indices for the term house. To narrow the search, specify an index (movies) and a document field (title) in the URI:

GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house

You can include additional parameters in the request, but the supported parameters provide only a small subset of the Elasticsearch search options. The following request returns 20 results (instead of the default of 10) and sorts by year (rather than by _score):

GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house&size=20&sort=year:desc

Request Body Searches

To perform more complex searches, use the HTTP request body and the Elasticsearch domain-specific language (DSL) for queries. The query DSL lets you specify the full range of Elasticsearch search options. The following match query is similar to the final URI search example:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "size": 20, "sort": { "year": { "order": "desc" } }, "query": { "query_string": { "default_field": "title", "query": "house" } } }

Note

The _search API accepts HTTP GET and POST for request body searches, but not all HTTP clients support adding a request body to a GET request. POST is the more universal choice.

In many cases, you might want to search several fields, but not all fields. Use the multi_match query:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "size": 20, "query": { "multi_match": { "query": "house", "fields": ["title", "plot", "actors", "directors"] } } }

Boosting Fields

You can improve search relevancy by "boosting" certain fields. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. In the following example, a match for john in the title field influences _score twice as much as a match in the plot field and four times as much as a match in the actors or directors fields. The result is that films like John Wick and John Carter are near the top of the search results, and films starring John Travolta are near the bottom.

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "size": 20, "query": { "multi_match": { "query": "john", "fields": ["title^4", "plot^2", "actors", "directors"] } } }

Paginating Search Results

If you need to display a large number of search results, you can implement pagination using the from parameter. The following request returns results 20–39 of the zero-indexed list of search results:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "from": 20, "size": 20, "query": { "multi_match": { "query": "house", "fields": ["title^4", "plot^2", "actors", "directors"] } } }

Search Result Highlighting

The highlight option tells Elasticsearch to return an additional object inside of the hits array if the query matched one or more fields:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "size": 20, "query": { "multi_match": { "query": "house", "fields": ["title^4", "plot^2", "actors", "directors"] } }, "highlight": { "fields": { "plot": {} } } }

If the query matched the content of the plot field, a hit might look like the following:

{ "_index": "movies", "_type": "movie", "_id": "tt0091541", "_score": 11.276199, "_source": { "directors": [ "Richard Benjamin" ], "release_date": "1986-03-26T00:00:00Z", "rating": 6, "genres": [ "Comedy", "Music" ], "image_url": "http://ia.media-imdb.com/images/M/MV5BMTIzODEzODE2OF5BMl5BanBnXkFtZTcwNjQ3ODcyMQ@@._V1_SX400_.jpg", "plot": "A young couple struggles to repair a hopelessly dilapidated house.", "title": "The Money Pit", "rank": 4095, "running_time_secs": 5460, "actors": [ "Tom Hanks", "Shelley Long", "Alexander Godunov" ], "year": 1986, "id": "tt0091541" }, "highlight": { "plot": [ "A young couple struggles to repair a hopelessly dilapidated <em>house</em>." ] } }

By default, Elasticsearch wraps the matching string in <em> tags, provides up to 100 characters of context around the match, and breaks content into sentences by identifying punctuation marks, spaces, tabs, and line breaks. All of these settings are customizable:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search { "size": 20, "query": { "multi_match": { "query": "house", "fields": ["title^4", "plot^2", "actors", "directors"] } }, "highlight": { "fields": { "plot": {} }, "pre_tags": "<strong>", "post_tags": "</strong>", "fragment_size": 200, "boundary_chars": ".,!? " } }

Count API

If you're not interested in the contents of your documents and just want to know the number of matches, you can use the _count API instead of the _search API. The following request uses the query_string query to identify romantic comedies:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_count { "query": { "query_string": { "default_field": "genres", "query": "romance AND comedy" } } }

A sample response might look like the following:

{ "count": 564, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 } }