Deleting a data source - Amazon Kendra

Deleting a data source

You delete a data source when you want to remove the information contained in the data source from your Amazon Kendra index. For example, delete a data source when:

  • A data source is incorrectly configured. Delete the data source, wait for the data source to finish deleting, and then recreate it.

  • You migrated documents from one data source to another. Delete the original data source and recreate it in the new location.

  • You have reached the limit of data sources for an index. Delete one of the existing data sources and add a new one. For more information about the number of data sources that you can create, see Quotas.

To delete a data source, use the console, the AWS Command Line Interface (AWS CLI), the DeleteDataSource API, or an AWS CloudFormation script. Deleting a data source removes all of the information about the data source from the index. If you only want to stop syncing the data source, change the synchronization schedule for the data source to "run on demand".

Deleting a data source is an asynchronous operation. When you start deleting a data source, the data source status changes to DELETING. It remains in the DELETING state until the information related to the data source is removed. After the data source is deleted, it no longer appears in the results of a call to the ListDataSources API. If you call the DescribeDataSource API with the deleted data source's identifier, you receive a ResourceNotFoundException exception.
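The asynchronous flow described above can be sketched in Python as follows. This is a minimal illustration, not an official sample: it assumes `client` is a boto3 Amazon Kendra client (for example, `boto3.client("kendra")`), and the function name, polling interval, and timeout are hypothetical choices made for this example.

```python
import time


def delete_data_source_and_wait(client, index_id, data_source_id,
                                poll_seconds=30, timeout_seconds=7200,
                                sleep=time.sleep):
    """Start deleting a data source, then poll until it no longer exists.

    `client` is assumed to be a boto3 Amazon Kendra client. Returns True once
    DescribeDataSource reports that the data source is gone, or False if the
    (illustrative) timeout elapses first.
    """
    # DeleteDataSource returns immediately; the deletion runs asynchronously.
    client.delete_data_source(Id=data_source_id, IndexId=index_id)

    waited = 0
    while waited <= timeout_seconds:
        try:
            response = client.describe_data_source(
                Id=data_source_id, IndexId=index_id
            )
        except client.exceptions.ResourceNotFoundException:
            # The data source has been fully removed.
            return True
        # While the deletion is in progress, the status is expected to be DELETING.
        print(f"Data source {data_source_id} status: {response['Status']}")
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

Because deletion can take an hour or more for large data sources, a generous timeout and a polling interval of tens of seconds are reasonable starting points; adjust both to suit your workload.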

Note

Deleting an entire data source, or re-syncing your index after deleting specific documents from a data source, can take an hour or more, depending on the number of documents you want to delete.

To delete a data source (console)
  1. Sign in to the AWS Management Console and open the Amazon Kendra console at https://console.aws.amazon.com/kendra/.

  2. In the navigation pane, choose Indexes, and then choose the index that contains the data source to delete.

  3. In the navigation pane, choose Data sources.

  4. Choose the data source to remove.

  5. Choose Delete to delete the data source.

To delete a data source (CLI)
  • In the AWS Command Line Interface, use the following command. The command is formatted for Linux and macOS. If you are using Windows, replace the Unix line continuation character (\) with a caret (^).

    aws kendra delete-data-source \
        --id data-source-id \
        --index-id index-id

When you delete a data source, Amazon Kendra removes all of the stored information about the data source. Amazon Kendra removes all of the document data stored in the index, and all run histories and metrics associated with the data source. Deleting a data source does not remove the original documents from your storage.

Documents in the data source may be included in the document count returned by the DescribeIndex API while Amazon Kendra deletes a data source. Documents from the data source may appear in search results while Amazon Kendra deletes the data source.

Amazon Kendra releases the resources for a data source as soon as you call the DeleteDataSource API or choose to delete the data source in the console. If you are deleting the data source to reduce the number of data sources below your limit, you can create a new data source right away.

If you are deleting a data source and then creating another data source for the same document data, wait for the first data source to be deleted before you sync the new data source.

You can delete a data source that is in the process of syncing with Amazon Kendra. The sync is stopped and the data source is removed. If you attempt to start a sync while the data source is being deleted, you receive a ConflictException.
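As a rough illustration of handling that exception, the sketch below attempts to start a sync and returns `None` when Amazon Kendra reports a conflict. It assumes `client` is a boto3 Amazon Kendra client; the function name `try_start_sync` is hypothetical. Note that a conflict can have other causes besides an in-progress deletion, so the error message is printed rather than interpreted.

```python
def try_start_sync(client, index_id, data_source_id):
    """Start a sync job; return its execution ID, or None on a conflict.

    `client` is assumed to be a boto3 Amazon Kendra client.
    """
    try:
        response = client.start_data_source_sync_job(
            Id=data_source_id, IndexId=index_id
        )
    except client.exceptions.ConflictException as err:
        # Raised, for example, when the data source is being deleted.
        print(f"Sync not started for {data_source_id}: {err}")
        return None
    return response["ExecutionId"]
```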

You can't delete a data source if the associated index is in the DELETING state. Deleting an index deletes all of the data sources for the index. You can start deleting an index while a data source for that index is in the DELETING state.
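One way to respect this rule in a script is to check the index status before requesting the deletion. The following is a minimal sketch, assuming `client` is a boto3 Amazon Kendra client; the function name is hypothetical, and the check is advisory only because the index status can change between the two calls.

```python
def delete_data_source_if_index_allows(client, index_id, data_source_id):
    """Delete a data source unless its index is already being deleted.

    `client` is assumed to be a boto3 Amazon Kendra client. Returns True if
    the delete request was sent, or False if it was skipped.
    """
    status = client.describe_index(Id=index_id)["Status"]
    if status == "DELETING":
        # Deleting the index removes all of its data sources anyway.
        print(f"Index {index_id} is being deleted; skipping {data_source_id}.")
        return False
    client.delete_data_source(Id=data_source_id, IndexId=index_id)
    return True
```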

If you have two data sources pointing to the same documents, such as two data sources pointing to the same Amazon S3 bucket, documents in the index might be inconsistent when one of the data sources is deleted. When two data sources reference the same documents, only one copy of the document data is stored in the index. Removing one data source removes the index data for the documents. The other data source is not aware that the documents have been removed, so Amazon Kendra won't correctly re-index the documents the next time it syncs. When you have two data sources pointing to the same document location, you should delete both data sources and then recreate one.
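Before deleting a data source, it can help to check whether another data source in the same index points to the same location. The sketch below covers only the Amazon S3 case and only data sources that expose an `S3Configuration` block; it assumes `client` is a boto3 Amazon Kendra client, and the function name is hypothetical. It compares bucket names only, so two data sources that use different prefixes in one bucket are still reported and need a manual check.

```python
from collections import defaultdict


def find_shared_s3_buckets(client, index_id):
    """Map each S3 bucket name to the data source IDs that reference it.

    `client` is assumed to be a boto3 Amazon Kendra client. Only buckets
    referenced by more than one data source are returned.
    """
    buckets = defaultdict(list)
    kwargs = {"IndexId": index_id}
    while True:
        page = client.list_data_sources(**kwargs)
        for summary in page.get("SummaryItems", []):
            if summary.get("Type") != "S3":
                continue
            detail = client.describe_data_source(
                Id=summary["Id"], IndexId=index_id
            )
            s3_config = detail.get("Configuration", {}).get("S3Configuration")
            if s3_config:
                buckets[s3_config["BucketName"]].append(summary["Id"])
        token = page.get("NextToken")
        if not token:
            break
        # Follow pagination until every data source has been listed.
        kwargs["NextToken"] = token
    return {name: ids for name, ids in buckets.items() if len(ids) > 1}
```

If the result is not empty, the guidance above applies: delete every data source listed for the shared bucket, wait for the deletions to finish, and then recreate a single data source.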