Start asynchronous analysis of a document using Amazon Textract and an AWS SDK - Amazon Textract



The following code examples show how to start asynchronous analysis of a document using Amazon Textract.

Java
SDK for Java 2.x

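    import java.util.ArrayList;
    import java.util.List;
    import software.amazon.awssdk.services.textract.TextractClient;
    import software.amazon.awssdk.services.textract.model.DocumentLocation;
    import software.amazon.awssdk.services.textract.model.FeatureType;
    import software.amazon.awssdk.services.textract.model.GetDocumentAnalysisRequest;
    import software.amazon.awssdk.services.textract.model.GetDocumentAnalysisResponse;
    import software.amazon.awssdk.services.textract.model.S3Object;
    import software.amazon.awssdk.services.textract.model.StartDocumentAnalysisRequest;
    import software.amazon.awssdk.services.textract.model.StartDocumentAnalysisResponse;
    import software.amazon.awssdk.services.textract.model.TextractException;

    public static String startDocAnalysisS3(TextractClient textractClient, String bucketName, String docName) {
        try {
            // Ask Textract to detect tables and forms in addition to plain text.
            List<FeatureType> myList = new ArrayList<>();
            myList.add(FeatureType.TABLES);
            myList.add(FeatureType.FORMS);

            // The document to analyze must already be stored in Amazon S3.
            S3Object s3Object = S3Object.builder()
                    .bucket(bucketName)
                    .name(docName)
                    .build();

            DocumentLocation location = DocumentLocation.builder()
                    .s3Object(s3Object)
                    .build();

            StartDocumentAnalysisRequest documentAnalysisRequest = StartDocumentAnalysisRequest.builder()
                    .documentLocation(location)
                    .featureTypes(myList)
                    .build();

            StartDocumentAnalysisResponse response = textractClient.startDocumentAnalysis(documentAnalysisRequest);

            // Get the job ID
            String jobId = response.jobId();
            return jobId;

        } catch (TextractException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
        return "";
    }

    private static String getJobResults(TextractClient textractClient, String jobId) {
        boolean finished = false;
        int index = 0;
        String status = "";

        try {
            while (!finished) {
                GetDocumentAnalysisRequest analysisRequest = GetDocumentAnalysisRequest.builder()
                        .jobId(jobId)
                        .maxResults(1000)
                        .build();

                GetDocumentAnalysisResponse response = textractClient.getDocumentAnalysis(analysisRequest);
                status = response.jobStatus().toString();

                // Stop polling as soon as the job leaves IN_PROGRESS. Waiting only
                // for SUCCEEDED would loop forever if the job fails.
                if (!status.equals("IN_PROGRESS")) {
                    finished = true;
                } else {
                    System.out.println(index + " status is: " + status);
                    Thread.sleep(1000);
                }
                index++;
            }
            return status;

        } catch (InterruptedException e) {
            System.out.println(e.getMessage());
            System.exit(1);
        }
        return "";
    }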
  • Find instructions and more code on GitHub.

  • For API details, see StartDocumentAnalysis in the AWS SDK for Java 2.x API Reference.

Python
SDK for Python (Boto3)

Start an asynchronous job to analyze a document.

    import logging

    from botocore.exceptions import ClientError

    logger = logging.getLogger(__name__)


    class TextractWrapper:
        """Encapsulates Textract functions."""

        def __init__(self, textract_client, s3_resource, sqs_resource):
            """
            :param textract_client: A Boto3 Textract client.
            :param s3_resource: A Boto3 Amazon S3 resource.
            :param sqs_resource: A Boto3 Amazon SQS resource.
            """
            self.textract_client = textract_client
            self.s3_resource = s3_resource
            self.sqs_resource = sqs_resource

        def start_analysis_job(
            self, bucket_name, document_file_name, feature_types, sns_topic_arn, sns_role_arn
        ):
            """
            Starts an asynchronous job to detect text and additional elements, such as
            forms or tables, in an image stored in an Amazon S3 bucket. Textract publishes
            a notification to the specified Amazon SNS topic when the job completes.
            The image must be in PNG, JPG, or PDF format.

            :param bucket_name: The name of the Amazon S3 bucket that contains the image.
            :param document_file_name: The name of the document image stored in Amazon S3.
            :param feature_types: The types of additional document features to detect.
            :param sns_topic_arn: The Amazon Resource Name (ARN) of an Amazon SNS topic
                                  where job completion notification is published.
            :param sns_role_arn: The ARN of an AWS Identity and Access Management (IAM)
                                 role that can be assumed by Textract and grants permission
                                 to publish to the Amazon SNS topic.
            :return: The ID of the job.
            """
            try:
                response = self.textract_client.start_document_analysis(
                    DocumentLocation={
                        "S3Object": {"Bucket": bucket_name, "Name": document_file_name}
                    },
                    NotificationChannel={
                        "SNSTopicArn": sns_topic_arn,
                        "RoleArn": sns_role_arn,
                    },
                    FeatureTypes=feature_types,
                )
                job_id = response["JobId"]
                logger.info(
                    "Started text analysis job %s on %s.", job_id, document_file_name
                )
            except ClientError:
                logger.exception("Couldn't analyze text in %s.", document_file_name)
                raise
            else:
                return job_id
  • Find instructions and more code on GitHub.

  • For API details, see StartDocumentAnalysis in the AWS SDK for Python (Boto3) API Reference.
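
The Python example starts the job and returns its ID but does not show how the results might be retrieved afterwards. The following is a minimal sketch of one way to do that by polling, assuming a client object that exposes the Boto3 `get_document_analysis` operation. The function name `wait_for_analysis` and the parameters `delay_seconds` and `max_attempts` are hypothetical and chosen only for illustration; they are not part of the original sample.

```python
import time


def wait_for_analysis(textract_client, job_id, delay_seconds=5.0, max_attempts=60):
    """
    Polls GetDocumentAnalysis until the job leaves IN_PROGRESS, then follows
    NextToken to collect every page of blocks.

    NOTE: wait_for_analysis, delay_seconds, and max_attempts are hypothetical
    names used for this sketch only.

    :param textract_client: A client with a Boto3-style get_document_analysis method.
    :param job_id: The job ID returned by start_document_analysis.
    :return: A tuple of (final job status, list of all returned blocks).
    """
    for _ in range(max_attempts):
        response = textract_client.get_document_analysis(JobId=job_id)
        status = response["JobStatus"]
        if status != "IN_PROGRESS":
            break
        time.sleep(delay_seconds)
    else:
        # The loop ran out of attempts without the job finishing.
        raise TimeoutError(
            f"Job {job_id} is still in progress after {max_attempts} polls."
        )

    if status == "FAILED":
        message = response.get("StatusMessage", "no status message")
        raise RuntimeError(f"Job {job_id} failed: {message}")

    # SUCCEEDED or PARTIAL_SUCCESS: results may span several pages.
    blocks = list(response.get("Blocks", []))
    next_token = response.get("NextToken")
    while next_token:
        response = textract_client.get_document_analysis(
            JobId=job_id, NextToken=next_token
        )
        blocks.extend(response.get("Blocks", []))
        next_token = response.get("NextToken")
    return status, blocks
```

With a real client this could be called as `wait_for_analysis(boto3.client("textract"), job_id)`. Polling is the simpler option for short scripts; because the example above passes a `NotificationChannel`, a production workflow would more likely wait for the Amazon SNS completion notification and only then fetch the results.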

For a complete list of AWS SDK developer guides and code examples, see Using Amazon Textract with an AWS SDK. This topic also includes information about getting started and details about previous SDK versions.