To make your data searchable, you must describe it according to the Search Data Format (SDF) and upload the resulting SDF batches to a search domain. Amazon CloudSearch can then generate a search index from your SDF data according to the index fields and text options that you have configured for the domain. As your data changes, you submit SDF updates to add, change, or delete documents from your index. Amazon CloudSearch applies data updates continuously, so your changes become searchable in near real-time.
Amazon CloudSearch ensures that the most recent changes are applied to your domain using the document version numbers specified in the SDF add and delete operations. The operation with the greatest version number always takes precedence. To be applied, the version number in the add or delete operation must be greater than the document's current version number in the index. If the version number in an add or delete operation is less than the document's current version number, the operation is ignored. If an operation specifies the same document version that already exists in the index, the result is undefined—there's no guarantee which one will take precedence.
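The precedence rule above can be sketched as a small simulation. This is only an illustration of the rule as stated, not how the service is implemented: the in-memory `index` dictionary and the `apply_operation` function are hypothetical names, and treating a document that is not yet in the index as accepting any version is an assumption.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a search index: document id -> current version.
Index = Dict[str, int]


def apply_operation(index: Index, doc_id: str, version: int) -> str:
    """Apply the version rule to one add or delete operation.

    Returns "applied", "ignored", or "undefined".
    """
    current: Optional[int] = index.get(doc_id)
    # Assumption: a document with no recorded version accepts any version.
    if current is None or version > current:
        index[doc_id] = version
        return "applied"
    if version < current:
        return "ignored"
    # Same version as the index: the outcome is undefined, so this sketch
    # reports that instead of picking a winner.
    return "undefined"


index: Index = {}
results = [apply_operation(index, "doc1", v) for v in (2, 1, 3, 3)]
print(results)  # ['applied', 'ignored', 'applied', 'undefined']
```

In practice this suggests always incrementing the version number when you resubmit a document, so that an update never reuses the version already in the index.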
To upload successfully, your SDF data must be valid JSON or XML and conform to the SDF data conventions. For information about creating SDF batches, see Preparing Your Data for Amazon CloudSearch.
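As a rough illustration, a JSON batch with one add and one delete operation could be assembled and checked as follows. The property names (`type`, `id`, `version`, `lang`, `fields`) are an assumption about the JSON SDF layout and should be verified against Preparing Your Data for Amazon CloudSearch; the helper names, document IDs, and the `title` field are hypothetical.

```python
import json
from typing import Any, Dict, List


def make_add(doc_id: str, version: int, fields: Dict[str, Any],
             lang: str = "en") -> Dict[str, Any]:
    """Build one add operation (property names are assumed, see lead-in)."""
    return {"type": "add", "id": doc_id, "version": version,
            "lang": lang, "fields": fields}


def make_delete(doc_id: str, version: int) -> Dict[str, Any]:
    """Build one delete operation; it carries a version but no fields."""
    return {"type": "delete", "id": doc_id, "version": version}


batch: List[Dict[str, Any]] = [
    make_add("doc1", 1, {"title": "Example title"}),  # hypothetical field
    make_delete("doc2", 2),
]

# Serialize the batch, then parse it back to confirm it is valid JSON
# before attempting an upload.
payload = json.dumps(batch)
parsed = json.loads(payload)
```

The round trip only proves the payload is well-formed JSON. It does not check the SDF conventions themselves, such as allowed document ID characters or field names that match the domain configuration.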
For information about configuring index fields for a domain, see Configuring Index Fields for an Amazon CloudSearch Domain.
You are billed for the total number of document batches uploaded to your search domain, including batches that contain delete operations. For more information about Amazon CloudSearch pricing, see aws.amazon.com/cloudsearch/pricing/.
You use the cs-post-sdf command to send SDF data to your search domain. The SDF batches can be local or stored in Amazon S3.
For information about installing and setting up the Amazon CloudSearch command line tools, see Amazon CloudSearch Command Line Tool Reference.
To send data to a domain for indexing
If you haven't already, prepare your data according to the SDF schema. For more information about generating SDF, see Preparing Your Data for Amazon CloudSearch.
Run the cs-post-sdf command to upload your SDF data to your domain. You must include at least one --source option that gives the location of the SDF data you want to upload.
cs-post-sdf -d mydomain --source data1.sdf
Processing: data1.sdf
Detected source format for data1.sdf as json
Status: success
Added: 5208
Deleted: 0
In the Amazon CloudSearch console, you can upload data to your domain from the domain dashboard. The console can automatically convert the following types of files to SDF during the upload process:
Comma Separated Value (.csv)
Adobe Portable Document Format (.pdf)
HTML (.htm, .html)
Microsoft Excel (.xls, .xlsx)
Microsoft PowerPoint (.ppt, .pptx)
Microsoft Word (.doc, .docx)
Text Documents (.txt)
JSON Documents (.json)
XML Documents (.xml)
CSV files are parsed row-by-row and a separate document is generated for each row. All other types of files are treated as a single document. For more information about automatically generating SDF, see Preparing Your Data for Amazon CloudSearch.
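The row-by-row behavior for CSV files can be pictured with a short sketch. This is not the console's conversion code, which is not described here. It assumes the first row is a header naming the fields, and it leaves out document IDs and version numbers because the source does not say how the console assigns them. The function name and sample data are hypothetical.

```python
import csv
import io
from typing import Dict, List


def csv_to_documents(text: str) -> List[Dict[str, str]]:
    """Return one field dictionary per CSV data row.

    Assumes the first row is a header that supplies the field names.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


sample = "title,year\nAlpha,2001\nBeta,2002\n"  # hypothetical data
docs = csv_to_documents(sample)
print(len(docs))  # 2
```

A two-row file therefore yields two documents, whereas a PDF or Word file of any length would yield one.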
You can also upload SDF batches through the Amazon CloudSearch console.
To send data to a domain for indexing
Go to the Amazon CloudSearch console at https://console.aws.amazon.com/cloudsearch/home.
In the Navigation panel, click the name of the domain.
At the top of the domain dashboard, click Upload Documents.
Select the location of the data you want to upload to your domain:
File(s) on my local disk
Object(s) from Amazon S3
If you upload data in a format other than SDF, it will automatically be converted to SDF during the upload process.
If you are uploading local files, click Browse to choose the file(s) to upload.
If you are uploading objects from Amazon S3, select the bucket you want to upload from. To upload the entire contents of the bucket, leave the Prefix field empty and click Add. To upload selected objects, enter a filter in the Prefix field and click Add. (You can add multiple prefixes.)
If you are uploading predefined sample data, choose the data set that you want to use.
Once you've selected the data you want to upload, click Continue.
On the Review Documents step, review the documents to be uploaded and click Upload Documents to continue.
On the Document Summary step, if SDF batches have been automatically generated from your data, you can click Download the generated SDF files to get them. Click Finish to return to the domain dashboard.
You use the documents/batch API to post SDF data to your domain to add, update, or remove documents. For example:
curl -X POST --upload-file data1.sdf doc.movies-123456789012.us-east-1.cloudsearch.amazonaws.com/2011-02-01/documents/batch --header "Content-Type:application/json"
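The same request can be prepared from a script. The following is a minimal sketch using Python's standard library; `build_batch_request` is a hypothetical helper, the `http://` scheme is an assumption (the curl example gives no scheme), and the body is a placeholder rather than real SDF data. The request is only constructed here, not sent.

```python
import urllib.request


def build_batch_request(url: str, payload: bytes) -> urllib.request.Request:
    """Prepare a POST to a documents/batch URL with a JSON body."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Endpoint taken from the curl example; the scheme is an assumption.
url = ("http://doc.movies-123456789012.us-east-1.cloudsearch.amazonaws.com"
       "/2011-02-01/documents/batch")

# Placeholder body; in practice this would be the bytes of data1.sdf.
request = build_batch_request(url, b"[]")

# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the method, headers, and body before any data leaves your machine.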