There are more AWS SDK examples available in the AWS Doc SDK Examples repository on GitHub.

Translations are generated by machine translation. In the event of a conflict between the translation and the original English version, the English version prevails.

Firehose examples using the SDK for Python (Boto3)

The following code examples show you how to perform actions and implement common scenarios by using the AWS SDK for Python (Boto3) with Firehose.

Actions are code excerpts from larger programs and must be run in context. While actions show you how to call individual service functions, you can see actions in context in their related scenarios.

Scenarios are code examples that show you how to accomplish a specific task by calling multiple functions within the same service or combined with other AWS services.

Each example includes a link to the complete source code, where you can find instructions on how to set up and run the code in context.

Actions

The following code example shows how to use PutRecord.

SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

class FirehoseClient:
    """
    AWS Firehose client to send records and monitor metrics.

    Attributes:
        config (object): Configuration object with delivery stream name and region.
        delivery_stream_name (str): Name of the Firehose delivery stream.
        region (str): AWS region for Firehose and CloudWatch clients.
        firehose (boto3.client): Boto3 Firehose client.
        cloudwatch (boto3.client): Boto3 CloudWatch client.
    """

    def __init__(self, config):
        """
        Initialize the FirehoseClient.

        Args:
            config (object): Configuration object with delivery stream name and region.
        """
        self.config = config
        self.delivery_stream_name = config.delivery_stream_name
        self.region = config.region
        self.firehose = boto3.client("firehose", region_name=self.region)
        self.cloudwatch = boto3.client("cloudwatch", region_name=self.region)

    @backoff.on_exception(
        backoff.expo, Exception, max_tries=5, jitter=backoff.full_jitter
    )
    def put_record(self, record: dict):
        """
        Put individual records to Firehose with backoff and retry.

        Args:
            record (dict): The data record to be sent to Firehose.

        This method attempts to send an individual record to the Firehose delivery stream.
        It retries with exponential backoff in case of exceptions.
        """
        try:
            entry = self._create_record_entry(record)
            response = self.firehose.put_record(
                DeliveryStreamName=self.delivery_stream_name, Record=entry
            )
            self._log_response(response, entry)
        except Exception:
            logger.info(f"Fail record: {record}.")
            raise
  • For API details, see PutRecord in the AWS SDK for Python (Boto3) API Reference.
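
The excerpt above wraps PutRecord in retries and logging. If you only need the shape of the underlying request and response, the following is a minimal sketch, assuming a delivery stream named "my-delivery-stream" already exists in us-east-1 (the stream name and Region are placeholders, not part of the example above).

import json

import boto3

# Minimal sketch: send a single record with PutRecord.
# "my-delivery-stream" and the Region are placeholder assumptions.
firehose = boto3.client("firehose", region_name="us-east-1")

record = {"event": "page_view", "user_id": 42}
response = firehose.put_record(
    DeliveryStreamName="my-delivery-stream",
    Record={"Data": json.dumps(record) + "\n"},  # Firehose does not add record delimiters
)

# PutRecord returns the ID that Firehose assigned to the record.
print(response["RecordId"])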

The following code example shows how to use PutRecordBatch.

SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

class FirehoseClient:
    """
    AWS Firehose client to send records and monitor metrics.

    Attributes:
        config (object): Configuration object with delivery stream name and region.
        delivery_stream_name (str): Name of the Firehose delivery stream.
        region (str): AWS region for Firehose and CloudWatch clients.
        firehose (boto3.client): Boto3 Firehose client.
        cloudwatch (boto3.client): Boto3 CloudWatch client.
    """

    def __init__(self, config):
        """
        Initialize the FirehoseClient.

        Args:
            config (object): Configuration object with delivery stream name and region.
        """
        self.config = config
        self.delivery_stream_name = config.delivery_stream_name
        self.region = config.region
        self.firehose = boto3.client("firehose", region_name=self.region)
        self.cloudwatch = boto3.client("cloudwatch", region_name=self.region)

    @backoff.on_exception(
        backoff.expo, Exception, max_tries=5, jitter=backoff.full_jitter
    )
    def put_record_batch(self, data: list, batch_size: int = 500):
        """
        Put records in batches to Firehose with backoff and retry.

        Args:
            data (list): List of data records to be sent to Firehose.
            batch_size (int): Number of records to send in each batch. Default is 500.

        This method attempts to send records in batches to the Firehose delivery stream.
        It retries with exponential backoff in case of exceptions.
        """
        for i in range(0, len(data), batch_size):
            batch = data[i : i + batch_size]
            record_dicts = [{"Data": json.dumps(record)} for record in batch]
            try:
                response = self.firehose.put_record_batch(
                    DeliveryStreamName=self.delivery_stream_name, Records=record_dicts
                )
                self._log_batch_response(response, len(batch))
            except Exception as e:
                logger.info(f"Failed to send batch of {len(batch)} records. Error: {e}")
  • For API details, see PutRecordBatch in the AWS SDK for Python (Boto3) API Reference.
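
Note that PutRecordBatch reports partial failures in its response rather than raising an exception: the response includes FailedPutCount and a per-record RequestResponses list. The following is a minimal sketch of inspecting that response, again assuming a placeholder stream name and Region.

import json

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")  # placeholder Region

records = [{"Data": json.dumps({"event": "page_view", "seq": i}) + "\n"} for i in range(10)]
response = firehose.put_record_batch(
    DeliveryStreamName="my-delivery-stream",  # placeholder stream name
    Records=records,
)

# A batch can partially fail; entries with an ErrorCode were not delivered and should be retried.
if response["FailedPutCount"] > 0:
    failed = [
        records[i]
        for i, result in enumerate(response["RequestResponses"])
        if "ErrorCode" in result
    ]
    print(f"{len(failed)} of {len(records)} records failed; retry them with backoff.")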

Scenarios

The following code example shows how to use Firehose to process individual and batch records.

SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run in the AWS Code Examples Repository.

This script puts individual and batch records to Firehose.

import json
import logging
import random
from datetime import datetime, timedelta

import backoff
import boto3

from config import get_config


def load_sample_data(path: str) -> dict:
    """
    Load sample data from a JSON file.

    Args:
        path (str): The file path to the JSON file containing sample data.

    Returns:
        dict: The loaded sample data as a dictionary.
    """
    with open(path, "r") as f:
        return json.load(f)


# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class FirehoseClient:
    """
    AWS Firehose client to send records and monitor metrics.

    Attributes:
        config (object): Configuration object with delivery stream name and region.
        delivery_stream_name (str): Name of the Firehose delivery stream.
        region (str): AWS region for Firehose and CloudWatch clients.
        firehose (boto3.client): Boto3 Firehose client.
        cloudwatch (boto3.client): Boto3 CloudWatch client.
    """

    def __init__(self, config):
        """
        Initialize the FirehoseClient.

        Args:
            config (object): Configuration object with delivery stream name and region.
        """
        self.config = config
        self.delivery_stream_name = config.delivery_stream_name
        self.region = config.region
        self.firehose = boto3.client("firehose", region_name=self.region)
        self.cloudwatch = boto3.client("cloudwatch", region_name=self.region)

    @backoff.on_exception(
        backoff.expo, Exception, max_tries=5, jitter=backoff.full_jitter
    )
    def put_record(self, record: dict):
        """
        Put individual records to Firehose with backoff and retry.

        Args:
            record (dict): The data record to be sent to Firehose.

        This method attempts to send an individual record to the Firehose delivery stream.
        It retries with exponential backoff in case of exceptions.
        """
        try:
            entry = self._create_record_entry(record)
            response = self.firehose.put_record(
                DeliveryStreamName=self.delivery_stream_name, Record=entry
            )
            self._log_response(response, entry)
        except Exception:
            logger.info(f"Fail record: {record}.")
            raise

    @backoff.on_exception(
        backoff.expo, Exception, max_tries=5, jitter=backoff.full_jitter
    )
    def put_record_batch(self, data: list, batch_size: int = 500):
        """
        Put records in batches to Firehose with backoff and retry.

        Args:
            data (list): List of data records to be sent to Firehose.
            batch_size (int): Number of records to send in each batch. Default is 500.

        This method attempts to send records in batches to the Firehose delivery stream.
        It retries with exponential backoff in case of exceptions.
        """
        for i in range(0, len(data), batch_size):
            batch = data[i : i + batch_size]
            record_dicts = [{"Data": json.dumps(record)} for record in batch]
            try:
                response = self.firehose.put_record_batch(
                    DeliveryStreamName=self.delivery_stream_name, Records=record_dicts
                )
                self._log_batch_response(response, len(batch))
            except Exception as e:
                logger.info(f"Failed to send batch of {len(batch)} records. Error: {e}")

    def get_metric_statistics(
        self,
        metric_name: str,
        start_time: datetime,
        end_time: datetime,
        period: int,
        statistics: list = ["Sum"],
    ) -> list:
        """
        Retrieve metric statistics from CloudWatch.

        Args:
            metric_name (str): The name of the metric.
            start_time (datetime): The start time for the metric statistics.
            end_time (datetime): The end time for the metric statistics.
            period (int): The granularity, in seconds, of the returned data points.
            statistics (list): A list of statistics to retrieve. Default is ['Sum'].

        Returns:
            list: List of datapoints containing the metric statistics.
        """
        response = self.cloudwatch.get_metric_statistics(
            Namespace="AWS/Firehose",
            MetricName=metric_name,
            Dimensions=[
                {"Name": "DeliveryStreamName", "Value": self.delivery_stream_name},
            ],
            StartTime=start_time,
            EndTime=end_time,
            Period=period,
            Statistics=statistics,
        )
        return response["Datapoints"]

    def monitor_metrics(self):
        """
        Monitor Firehose metrics for the last 5 minutes.

        This method retrieves and logs the 'IncomingBytes', 'IncomingRecords', and
        'FailedPutCount' metrics from CloudWatch for the last 5 minutes.
        """
        end_time = datetime.utcnow()
        start_time = end_time - timedelta(minutes=10)
        period = int((end_time - start_time).total_seconds())

        metrics = {
            "IncomingBytes": self.get_metric_statistics(
                "IncomingBytes", start_time, end_time, period
            ),
            "IncomingRecords": self.get_metric_statistics(
                "IncomingRecords", start_time, end_time, period
            ),
            "FailedPutCount": self.get_metric_statistics(
                "FailedPutCount", start_time, end_time, period
            ),
        }

        for metric, datapoints in metrics.items():
            if datapoints:
                total_sum = sum(datapoint["Sum"] for datapoint in datapoints)
                if metric == "IncomingBytes":
                    logger.info(
                        f"{metric}: {round(total_sum)} ({total_sum / (1024 * 1024):.2f} MB)"
                    )
                else:
                    logger.info(f"{metric}: {round(total_sum)}")
            else:
                logger.info(f"No data found for {metric} over the last 5 minutes")

    def _create_record_entry(self, record: dict) -> dict:
        """
        Create a record entry for Firehose.

        Args:
            record (dict): The data record to be sent.

        Returns:
            dict: The record entry formatted for Firehose.

        Raises:
            Exception: If a simulated network error occurs.
        """
        if random.random() < 0.2:
            raise Exception("Simulated network error")
        elif random.random() < 0.1:
            return {"Data": '{"malformed": "data"'}
        else:
            return {"Data": json.dumps(record)}

    def _log_response(self, response: dict, entry: dict):
        """
        Log the response from Firehose.

        Args:
            response (dict): The response from the Firehose put_record API call.
            entry (dict): The record entry that was sent.
        """
        if response["ResponseMetadata"]["HTTPStatusCode"] == 200:
            logger.info(f"Sent record: {entry}")
        else:
            logger.info(f"Fail record: {entry}")

    def _log_batch_response(self, response: dict, batch_size: int):
        """
        Log the batch response from Firehose.

        Args:
            response (dict): The response from the Firehose put_record_batch API call.
            batch_size (int): The number of records in the batch.
        """
        if response.get("FailedPutCount", 0) > 0:
            logger.info(
                f'Failed to send {response["FailedPutCount"]} records in batch of {batch_size}'
            )
        else:
            logger.info(f"Successfully sent batch of {batch_size} records")


if __name__ == "__main__":
    config = get_config()
    data = load_sample_data(config.sample_data_file)
    client = FirehoseClient(config)

    # Process the first 100 sample network records
    for record in data[:100]:
        try:
            client.put_record(record)
        except Exception as e:
            logger.info(f"Put record failed after retries and backoff: {e}")
    client.monitor_metrics()

    # Process remaining records using the batch method
    try:
        client.put_record_batch(data[100:])
    except Exception as e:
        logger.info(f"Put record batch failed after retries and backoff: {e}")
    client.monitor_metrics()

This file contains the configuration for the preceding script.

class Config:
    def __init__(self):
        self.delivery_stream_name = "ENTER YOUR DELIVERY STREAM NAME HERE"
        self.region = "us-east-1"
        self.sample_data_file = (
            "../../../../../scenarios/features/firehose/resources/sample_records.json"
        )


def get_config():
    return Config()
  • For API details, see the following topics in the AWS SDK for Python (Boto3) API Reference: PutRecord and PutRecordBatch.
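
Before running the scenario, it can help to confirm that the stream named in Config exists and is active. The following is a minimal sketch, assuming the Config class shown above and credentials that are allowed to call DescribeDeliveryStream.

import boto3

from config import get_config  # the Config class shown above

config = get_config()
firehose = boto3.client("firehose", region_name=config.region)

# DescribeDeliveryStream reports the stream status (for example, ACTIVE or CREATING).
description = firehose.describe_delivery_stream(
    DeliveryStreamName=config.delivery_stream_name
)
status = description["DeliveryStreamDescription"]["DeliveryStreamStatus"]
print(f"{config.delivery_stream_name}: {status}")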
