Menu
Amazon Elasticsearch Service
Developer Guide (API Version 2015-01-01)

Programmatic Indexing

This section includes examples of how to use Elasticsearch clients and standard HTTP requests to index documents.

Python

You can install elasticsearch-py, the Elasticsearch client for Python, using pip. Instead of the client, you might prefer requests. The requests-aws4auth and SDK for Python (Boto 3) packages simplify the authentication process, but are not strictly required. From the terminal, run the following commands:

pip install boto3 pip install elasticsearch pip install requests pip install requests-aws4auth

The following sample code establishes a secure connection to the specified Amazon ES domain and indexes a single document using the _index API. You must provide values for region and host:

from elasticsearch import Elasticsearch, RequestsHttpConnection from requests_aws4auth import AWS4Auth import boto3 host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com region = '' # e.g. us-west-1 service = 'es' credentials = boto3.Session().get_credentials() awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service) es = Elasticsearch( hosts = [{'host': host, 'port': 443}], http_auth = awsauth, use_ssl = True, verify_certs = True, connection_class = RequestsHttpConnection ) document = { "title": "Moneyball", "director": "Bennett Miller", "year": "2011" } es.index(index="movies", doc_type="movie", id="5", body=document) print(es.get(index="movies", doc_type="movie", id="5"))

If you don't want to use the elasticsearch.py client, you can just make standard HTTP requests. This sample creates a new index with seven shards and two replicas:

from requests_aws4auth import AWS4Auth import boto3 import requests host = '' # The domain with https:// and trailing slash. For example, https://my-test-domain.us-east-1.es.amazonaws.com/ path = 'my-index' # the Elasticsearch API endpoint region = '' # For example, us-west-1 service = 'es' credentials = boto3.Session().get_credentials() awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service) url = host + path # The JSON body to accompany the request (if necessary) payload = { "settings" : { "number_of_shards" : 7, "number_of_replicas" : 2 } } r = requests.put(url, auth=awsauth, json=payload) # requests.get, post, and delete have similar syntax print(r.text)

This next example uses the Beautiful Soup library to help build a bulk file from a local directory of HTML files. Using the same client as the first example, you can send the file to the _bulk API for indexing. You could use this code as the basis for adding search functionality to a website:

from bs4 import BeautifulSoup from elasticsearch import Elasticsearch, RequestsHttpConnection from requests_aws4auth import AWS4Auth import boto3 import glob import json bulk_file = '' id = 1 # This loop iterates through all HTML files in the current directory and # indexes two things: the contents of the first h1 tag and all other text. for html_file in glob.glob('*.htm'): with open(html_file) as f: soup = BeautifulSoup(f, 'html.parser') title = soup.h1.string body = soup.get_text(" ", strip=True) # If get_text() is too noisy, you can do further processing on the string. index = { 'title': title, 'body': body, 'link': html_file } # If running this script on a website, you probably need to prepend the URL and path to html_file. # The action_and_metadata portion of the bulk file bulk_file += '{ "index" : { "_index" : "site", "_type" : "page", "_id" : "' + str(id) + '" } }\n' # The optional_document portion of the bulk file bulk_file += json.dumps(index) + '\n' id += 1 host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com region = '' # e.g. us-west-1 service = 'es' credentials = boto3.Session().get_credentials() awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service) es = Elasticsearch( hosts = [{'host': host, 'port': 443}], http_auth = awsauth, use_ssl = True, verify_certs = True, connection_class = RequestsHttpConnection ) es.bulk(bulk_file) print(es.search(q='some test query'))

Java

This first example uses the Elasticsearch low-level Java REST Client, which you must configure as a dependency. The request is unauthenticated and relies on an IP-based access policy. You must provide a value for host:

import java.io.IOException; import java.util.Collections; import org.elasticsearch.client.Response; import org.elasticsearch.client.RestClient; import org.elasticsearch.client.http.entity.ContentType; import org.elasticsearch.client.http.HttpEntity; import org.elasticsearch.client.http.HttpHost; import org.elasticsearch.client.http.nio.entity.NStringEntity; public class JavaRestClientExample { public static void main(String[] args) throws IOException { String host = ""; // For example, my-test-domain.us-east-1.es.amazonaws.com String index = "movies"; String type = "movie"; String id = "6"; String json = "{" + "\"title\":\"Walk the Line\"," + "\"director\":\"James Mangold\"," + "\"year\":\"2005\"" + "}"; RestClient client = RestClient.builder(new HttpHost(host, 443, "https")).build(); HttpEntity entity = new NStringEntity(json, ContentType.APPLICATION_JSON); Response response = client.performRequest("PUT", "/" + index + "/" + type + "/" + id, Collections.<String, String>emptyMap(), entity); System.out.println(response.toString()); } }

The easiest way of sending a signed request is to use the AWS Request Signing Interceptor. The repository contains some samples to help you get started. The following example uses the Elasticsearch low-level Java REST client to perform two unrelated actions: registering a snapshot repository and indexing a document.

import org.apache.http.HttpEntity; import org.apache.http.HttpHost; import org.apache.http.HttpRequestInterceptor; import org.apache.http.entity.ContentType; import org.apache.http.nio.entity.NStringEntity; import org.elasticsearch.client.Response; import org.elasticsearch.client.RestClient; import com.amazonaws.auth.AWS4Signer; import com.amazonaws.auth.AWSCredentialsProvider; import com.amazonaws.auth.DefaultAWSCredentialsProviderChain; import com.amazonaws.http.AWSRequestSigningApacheInterceptor; import java.io.IOException; import java.util.Collections; import java.util.Map; public class AmazonElasticsearchServiceSample { private static String serviceName = "es"; private static String region = "us-west-1"; private static String aesEndpoint = "https://domain.us-west-1.es.amazonaws.com"; private static String payload = "{ \"type\": \"s3\", \"settings\": { \"bucket\": \"your-bucket\", \"region\": \"us-west-1\", \"role_arn\": \"arn:aws:iam::123456789012:role/TheServiceRole\" } }"; private static String snapshotPath = "/_snapshot/my-snapshot-repo"; private static String sampleDocument = "{" + "\"title\":\"Walk the Line\"," + "\"director\":\"James Mangold\"," + "\"year\":\"2005\"}"; private static String indexingPath = "/my-index/my-type"; static final AWSCredentialsProvider credentialsProvider = new DefaultAWSCredentialsProviderChain(); public static void main(String[] args) throws IOException { RestClient esClient = esClient(serviceName, region); // Register a snapshot repository HttpEntity entity = new NStringEntity(payload, ContentType.APPLICATION_JSON); Map<String, String> params = Collections.emptyMap(); Response response = esClient.performRequest("PUT", snapshotPath, params, entity); System.out.println(response.toString()); // Index a document entity = new NStringEntity(sampleDocument, ContentType.APPLICATION_JSON); String id = "1"; response = esClient.performRequest("PUT", indexingPath + "/" + id, params, entity); System.out.println(response.toString()); } // Adds the interceptor to the ES REST client public static RestClient esClient(String serviceName, String region) { AWS4Signer signer = new AWS4Signer(); signer.setServiceName(serviceName); signer.setRegionName(region); HttpRequestInterceptor interceptor = new AWSRequestSigningApacheInterceptor(serviceName, signer, credentialsProvider); return RestClient.builder(HttpHost.create(aesEndpoint)).setHttpClientConfigCallback(hacb -> hacb.addInterceptorLast(interceptor)).build(); } }

If you prefer the high-level REST client, try the following code, which also uses the AWS Request Signing Interceptor:

import org.apache.http.HttpHost; import org.apache.http.HttpRequestInterceptor; import org.elasticsearch.action.index.IndexRequest; import org.elasticsearch.action.index.IndexResponse; import org.elasticsearch.client.RestClient; import org.elasticsearch.client.RestHighLevelClient; import com.amazonaws.auth.AWS4Signer; import com.amazonaws.auth.AWSCredentialsProvider; import com.amazonaws.auth.DefaultAWSCredentialsProviderChain; import com.amazonaws.http.AWSRequestSigningApacheInterceptor; import java.io.IOException; import java.util.HashMap; import java.util.Map; public class AmazonElasticsearchServiceSample { private static String serviceName = "es"; private static String region = "us-west-1"; private static String aesEndpoint = ""; // e.g. https://search-mydomain.us-west-1.es.amazonaws.com private static String index = "my-index"; private static String type = "my-type"; private static String id = "1"; static final AWSCredentialsProvider credentialsProvider = new DefaultAWSCredentialsProviderChain(); public static void main(String[] args) throws IOException { RestHighLevelClient esClient = esClient(serviceName, region); // Create the document as a hash map Map<String, Object> document = new HashMap<>(); document.put("title", "Walk the Line"); document.put("director", "James Mangold"); document.put("year", "2005"); // Form the indexing request, send it, and print the response IndexRequest request = new IndexRequest(index, type, id).source(document); IndexResponse response = esClient.index(request); System.out.println(response.toString()); } // Adds the interceptor to the ES REST client public static RestHighLevelClient esClient(String serviceName, String region) { AWS4Signer signer = new AWS4Signer(); signer.setServiceName(serviceName); signer.setRegionName(region); HttpRequestInterceptor interceptor = new AWSRequestSigningApacheInterceptor(serviceName, signer, credentialsProvider); return new RestHighLevelClient(RestClient.builder(HttpHost.create(aesEndpoint)).setHttpClientConfigCallback(hacb -> hacb.addInterceptorLast(interceptor))); } }

Ruby

This first example uses the Elasticsearch Ruby client and Faraday middleware to perform the request signing. From the terminal, run the following commands:

gem install elasticsearch gem install faraday_middleware-aws-sigv4

This sample code creates a new Elasticsearch client, configures Faraday middleware to sign requests, and indexes a single document. You must provide values for host and region.

require 'elasticsearch' require 'faraday_middleware/aws_sigv4' host = '' # e.g. https://my-domain.region.es.com index = 'ruby-index' type = 'ruby-type' id = '1' document = { year: 2007, title: '5 Centimeters per Second', info: { plot: 'Told in three interconnected segments, we follow a young man named Takaki through his life.', rating: 7.7 } } region = '' # e.g. us-west-1 service = 'es' client = Elasticsearch::Client.new(url: host) do |f| f.request :aws_sigv4, service: service, region: region, access_key_id: ENV['AWS_ACCESS_KEY_ID'], secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'], session_token: ENV['AWS_SESSION_TOKEN'] # optional end puts client.index index: index, type: type, id: id, body: document

If your credentials don't work, export them at the terminal using the following commands:

export AWS_ACCESS_KEY_ID="your-access-key" export AWS_SECRET_ACCESS_KEY="your-secret-key" export AWS_SESSION_TOKEN=""your-session-token"

This next example uses the AWS SDK for Ruby and standard Ruby libraries to send a signed HTTP request. Like the first example, it indexes a single document. You must provide values for host and region.

require 'aws-sdk-elasticsearchservice' host = '' # e.g. https://my-domain.region.es.com index = 'ruby-index' type = 'ruby-type' id = '2' document = { year: 2007, title: '5 Centimeters per Second', info: { plot: 'Told in three interconnected segments, we follow a young man named Takaki through his life.', rating: 7.7 } } service = 'es' region = '' # e.g. us-west-1 signer = Aws::Sigv4::Signer.new( service: service, region: region, access_key_id: ENV['AWS_ACCESS_KEY_ID'], secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'], session_token: ENV['AWS_SESSION_TOKEN'] ) signature = signer.sign_request( http_method: 'PUT', url: host + '/' + index + '/' + type + '/' + id, body: document.to_json ) uri = URI(host + '/' + index + '/' + type + '/' + id) Net::HTTP.start(uri.host, uri.port, :use_ssl => true) do |http| request = Net::HTTP::Put.new uri request.body = document.to_json request['Host'] = signature.headers['host'] request['X-Amz-Date'] = signature.headers['x-amz-date'] request['X-Amz-Security-Token'] = signature.headers['x-amz-security-token'] request['X-Amz-Content-Sha256']= signature.headers['x-amz-content-sha256'] request['Authorization'] = signature.headers['authorization'] request['Content-Type'] = 'application/json' response = http.request request puts response.body end

Node

This example uses the SDK for JavaScript in Node.js. From the terminal, run the following commands:

npm install aws-sdk

This sample code indexes a single document. You must provide values for region and domain.

var AWS = require('aws-sdk'); var region = ''; // e.g. us-west-1 var domain = ''; // e.g. search-domain.region.es.amazonaws.com var index = 'node-test'; var type = 'node-type'; var id = '1'; var json = { "title": "Moneyball", "director": "Bennett Miller", "year": "2011" } indexDocument(json); function indexDocument(document) { var endpoint = new AWS.Endpoint(domain); var request = new AWS.HttpRequest(endpoint, region); request.method = 'PUT'; request.path += index + '/' + type + '/' + id; request.body = JSON.stringify(document); request.headers['host'] = domain; request.headers['Content-Type'] = 'application/json'; var credentials = new AWS.EnvironmentCredentials('AWS'); var signer = new AWS.Signers.V4(request, 'es'); signer.addAuthorization(credentials, new Date()); var client = new AWS.HttpClient(); client.handleRequest(request, null, function(response) { console.log(response.statusCode + ' ' + response.statusMessage); var responseBody = ''; response.on('data', function (chunk) { responseBody += chunk; }); response.on('end', function (chunk) { console.log('Response body: ' + responseBody); }); }, function(error) { console.log('Error: ' + error); }); }

If your credentials don't work, export them at the terminal using the following commands:

export AWS_ACCESS_KEY_ID="your-access-key" export AWS_SECRET_ACCESS_KEY="your-secret-key" export AWS_SESSION_TOKEN=""your-session-token"

On this page: