What is Amazon Kinesis Firehose?
Amazon Kinesis Firehose is a fully managed service for delivering real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, or Amazon Elasticsearch Service (Amazon ES). Firehose is part of the Amazon Kinesis streaming data platform, along with Amazon Kinesis Streams and Amazon Kinesis Analytics. With Firehose, you do not need to write applications or manage resources. You configure your data producers to send data to Firehose and it automatically delivers the data to the destination that you specified. You can also configure Firehose to transform your data before data delivery.
As you get started with Firehose, you'll benefit from understanding the following concepts:
- Firehose delivery stream
The underlying entity of Firehose; you use Firehose by creating a Firehose delivery stream and then sending data to it. For more information, see Creating an Amazon Kinesis Firehose Delivery Stream and Sending Data to an Amazon Kinesis Firehose Delivery Stream.
- record
The data of interest that your data producer sends to a Firehose delivery stream. A record can be as large as 1000 KB.
- data producer
Producers send records to Firehose delivery streams. For example, a web server sending log data to a Firehose delivery stream is a data producer. For more information, see Sending Data to an Amazon Kinesis Firehose Delivery Stream.
- buffer size and buffer interval
Firehose buffers incoming streaming data until it reaches a certain size or until a certain period of time has elapsed, whichever happens first, and then delivers it to the destination. Buffer size is specified in MB and buffer interval in seconds.
For Amazon S3 destinations, streaming data is delivered to your S3 bucket. If data transformation is enabled, you can optionally back up source data to another Amazon S3 bucket.
For Amazon Redshift destinations, streaming data is delivered to your S3 bucket first. Firehose then issues an Amazon Redshift COPY command to load data from your S3 bucket to your Amazon Redshift cluster. If data transformation is enabled, you can optionally back up source data to another Amazon S3 bucket.
For Amazon ES destinations, streaming data is delivered to your Amazon ES cluster, and can optionally be backed up to your S3 bucket concurrently.
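For the Amazon S3 case, the flow described above could be set up and exercised from Python roughly as follows. This is a minimal sketch that assumes the AWS SDK for Python (boto3) and valid AWS credentials; the helper function names, the stream name, the ARNs, and the buffering values are hypothetical placeholders, and 1 KB is assumed to mean 1024 bytes. The request builders are kept separate from the SDK calls so they can be inspected without contacting AWS.

```python
import json

# Record limit from the text above; treating 1 KB as 1024 bytes is an assumption.
MAX_RECORD_BYTES = 1000 * 1024


def s3_stream_request(stream_name, role_arn, bucket_arn,
                      size_mb=5, interval_seconds=300):
    """Build keyword arguments for boto3's Firehose create_delivery_stream call."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # producers write to the stream directly
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,        # IAM role that Firehose assumes to write to S3
            "BucketARN": bucket_arn,    # destination bucket
            "BufferingHints": {
                "SizeInMBs": size_mb,
                "IntervalInSeconds": interval_seconds,
            },
        },
    }


def record_request(stream_name, event):
    """Build keyword arguments for boto3's Firehose put_record call."""
    # A trailing newline keeps records separable once they are concatenated in S3.
    data = (json.dumps(event) + "\n").encode("utf-8")
    if len(data) > MAX_RECORD_BYTES:
        raise ValueError("record exceeds the 1000 KB limit")
    return {"DeliveryStreamName": stream_name, "Record": {"Data": data}}


def create_stream(stream_name, role_arn, bucket_arn):
    """Create the delivery stream. Creation is asynchronous on the AWS side."""
    import boto3  # imported lazily so the builders above work without the SDK
    client = boto3.client("firehose")
    return client.create_delivery_stream(
        **s3_stream_request(stream_name, role_arn, bucket_arn))


def send_event(stream_name, event):
    """Act as a data producer: send one record to an existing, active stream."""
    import boto3
    client = boto3.client("firehose")
    return client.put_record(**record_request(stream_name, event))


# Inspect the requests without calling AWS (placeholder names and ARNs).
print(s3_stream_request("example-stream",
                        "arn:aws:iam::123456789012:role/example-firehose-role",
                        "arn:aws:s3:::example-bucket"))
print(record_request("example-stream", {"path": "/index.html", "status": 200}))
```

Because stream creation is asynchronous, a producer would call `send_event` only after the delivery stream has become active.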