Amazon CloudSearch
Developer Guide (API Version 2013-01-01)

Step 2: Upload Data to Amazon CloudSearch for Indexing

You upload the data you want to search to your domain so that Amazon CloudSearch can build and deploy a searchable index. To be indexed by Amazon CloudSearch, the data must be formatted in either JSON or XML. The Amazon CloudSearch console can automatically convert the following file types to the required JSON or XML format:

  • Comma Separated Value (.csv)

  • Adobe Portable Document Format (.pdf)

  • HTML (.htm, .html)

  • Microsoft Excel (.xls, .xlsx)

  • Microsoft PowerPoint (.ppt, .pptx)

  • Microsoft Word (.doc, .docx)

  • Text Documents (.txt)

When you upload a CSV file, Amazon CloudSearch parses each row separately. The first row defines the document fields, and each subsequent row becomes a separate document. For all other file types, Amazon CloudSearch creates a single document and maps the contents of the file to a single text field. If metadata is available for the file, it is mapped to corresponding document fields; the fields generated from the metadata vary depending on the file type.
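As a rough illustration of the CSV mapping described above, the following Python sketch turns CSV text into a list of add operations in the JSON document batch format (each with type, id, and fields). The function name csv_to_batch, the id_prefix parameter, and the row-number ID scheme are hypothetical and chosen only for this example; the console's own conversion may assign IDs differently. The sketch also assumes the header row already contains valid field names.

```python
import csv
import io
import json


def csv_to_batch(csv_text, id_prefix="doc"):
    """Convert CSV text into a list of "add" operations (illustrative only)."""
    # The first row supplies the field names for every document.
    reader = csv.DictReader(io.StringIO(csv_text))
    batch = []
    for row_number, row in enumerate(reader, start=1):
        # Skip empty cells so they are not sent as empty field values.
        fields = {name: value for name, value in row.items() if name and value}
        batch.append({
            "type": "add",
            "id": f"{id_prefix}_{row_number}",  # hypothetical ID scheme
            "fields": fields,
        })
    return batch


# Two data rows produce two separate documents.
sample = "title,year\nStar Wars,1977\nAlien,1979\n"
batch = csv_to_batch(sample)
print(json.dumps(batch, indent=2))
```

All values come through as strings here; converting them to numbers or arrays to match the index field types is left to the caller.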

The sample IMDB movies data is already formatted in JSON.

This tutorial shows how to submit data through the Amazon CloudSearch console, but you can also convert and upload documents with the command line tools, or upload documents directly through the documents/batch resource. (To upload more than 5 MB of data, you must use the command line tools or the API.)
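For readers who want to try the documents/batch resource, here is a minimal Python sketch that prepares an HTTP POST for a JSON batch. The helper name build_batch_request and the endpoint value are hypothetical placeholders; substitute the document endpoint shown on your own domain dashboard. The sketch only builds the request and does not send it, and it assumes the domain's access policies allow the caller to submit documents.

```python
import json
import urllib.request

API_VERSION = "2013-01-01"


def build_batch_request(doc_endpoint, batch):
    """Build (but do not send) a POST to the documents/batch resource."""
    body = json.dumps(batch).encode("utf-8")
    url = f"https://{doc_endpoint}/{API_VERSION}/documents/batch"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and document, for illustration only.
endpoint = "doc-movies-example.us-east-1.cloudsearch.amazonaws.com"
docs = [{"type": "add", "id": "example_1", "fields": {"title": "Example"}}]
request = build_batch_request(endpoint, docs)
print(request.get_method(), request.full_url)

# To actually submit the batch, pass the request to urllib.request.urlopen().
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything reaches the domain.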

To upload the sample data to your movies domain

  1. Go to the Amazon CloudSearch console at https://console.aws.amazon.com/cloudsearch/home.

  2. In the Navigation panel, click the name of your movies domain to view the domain dashboard.

  3. At the top of the domain dashboard, click the Upload Documents button.

    Note

    The Upload Documents button is available once the domain status is ACTIVE.

  4. On the DOCUMENT SOURCE step, select Predefined data, choose IMDB movies (demo), and click Continue.

  5. On the REVIEW DOCUMENTS step, review the upload summary and click Upload Documents to send the data to your domain for indexing.

    Note

    If you'd like to see how the data is formatted, click Download the generated document batch. For more information about preparing your own data, see Preparing Your Data.

  6. On the DOCUMENT SUMMARY step, click Finish to return to the domain dashboard.

That's it! You now have a fully functional Amazon CloudSearch domain. Updates are applied continuously in the order they are received, so you can start searching your domain right away.