搭StartCrawler配使用 AWS SDK或 CLI - AWS 連接詞

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

StartCrawler配使用 AWS SDK或 CLI

下列程式碼範例會示範如何使用StartCrawler

動作範例是大型程式的程式碼摘錄,必須在內容中執行。您可以在下列程式碼範例的內容中看到此動作:

.NET
AWS SDK for .NET
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

/// <summary> /// Start an AWS Glue crawler. /// </summary> /// <param name="crawlerName">The name of the crawler.</param> /// <returns>A Boolean value indicating the success of the action.</returns> public async Task<bool> StartCrawlerAsync(string crawlerName) { var crawlerRequest = new StartCrawlerRequest { Name = crawlerName, }; var response = await _amazonGlue.StartCrawlerAsync(crawlerRequest); return response.HttpStatusCode == System.Net.HttpStatusCode.OK; }
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for .NET API參考

C++
SDK對於 C ++
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

Aws::Client::ClientConfiguration clientConfig; // Optional: Set to the AWS Region in which the bucket was created (overrides config file). // clientConfig.region = "us-east-1"; Aws::Glue::GlueClient client(clientConfig); Aws::Glue::Model::StartCrawlerRequest request; request.SetName(CRAWLER_NAME); Aws::Glue::Model::StartCrawlerOutcome outcome = client.StartCrawler(request); if (outcome.IsSuccess() || (Aws::Glue::GlueErrors::CRAWLER_RUNNING == outcome.GetError().GetErrorType())) { if (!outcome.IsSuccess()) { std::cout << "Crawler was already started." << std::endl; } else { std::cout << "Successfully started crawler." << std::endl; } std::cout << "This may take a while to run." << std::endl; Aws::Glue::Model::CrawlerState crawlerState = Aws::Glue::Model::CrawlerState::NOT_SET; int iterations = 0; while (Aws::Glue::Model::CrawlerState::READY != crawlerState) { std::this_thread::sleep_for(std::chrono::seconds(1)); ++iterations; if ((iterations % 10) == 0) { // Log status every 10 seconds. std::cout << "Crawler status " << Aws::Glue::Model::CrawlerStateMapper::GetNameForCrawlerState( crawlerState) << ". After " << iterations << " seconds elapsed." << std::endl; } Aws::Glue::Model::GetCrawlerRequest getCrawlerRequest; getCrawlerRequest.SetName(CRAWLER_NAME); Aws::Glue::Model::GetCrawlerOutcome getCrawlerOutcome = client.GetCrawler( getCrawlerRequest); if (getCrawlerOutcome.IsSuccess()) { crawlerState = getCrawlerOutcome.GetResult().GetCrawler().GetState(); } else { std::cerr << "Error getting crawler. " << getCrawlerOutcome.GetError().GetMessage() << std::endl; break; } } if (Aws::Glue::Model::CrawlerState::READY == crawlerState) { std::cout << "Crawler finished running after " << iterations << " seconds." << std::endl; } } else { std::cerr << "Error starting a crawler. " << outcome.GetError().GetMessage() << std::endl; deleteAssets(CRAWLER_NAME, CRAWLER_DATABASE_NAME, "", bucketName, clientConfig); return false; }
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for C++ API參考

CLI
AWS CLI

啟動爬蟲程式

以下 start-crawler 範例會啟動爬蟲程式。

aws glue start-crawler --name my-crawler

輸出:

None

如需詳細資訊,請參閱 < 定義搜尋器AWS Glue 開發人員指南

  • 有API關詳細資訊,請參閱 StartCrawlerAWS CLI 指令參考

Java
SDK對於爪哇 2.x
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.glue.GlueClient; import software.amazon.awssdk.services.glue.model.GlueException; import software.amazon.awssdk.services.glue.model.StartCrawlerRequest; /** * Before running this Java V2 code example, set up your development * environment, including your credentials. * * For more information, see the following documentation topic: * * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html */ public class StartCrawler { public static void main(String[] args) { final String usage = """ Usage: <crawlerName> Where: crawlerName - The name of the crawler.\s """; if (args.length != 1) { System.out.println(usage); System.exit(1); } String crawlerName = args[0]; Region region = Region.US_EAST_1; GlueClient glueClient = GlueClient.builder() .region(region) .build(); startSpecificCrawler(glueClient, crawlerName); glueClient.close(); } public static void startSpecificCrawler(GlueClient glueClient, String crawlerName) { try { StartCrawlerRequest crawlerRequest = StartCrawlerRequest.builder() .name(crawlerName) .build(); glueClient.startCrawler(crawlerRequest); } catch (GlueException e) { System.err.println(e.awsErrorDetails().errorMessage()); System.exit(1); } } }
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for Java 2.x API參考

JavaScript
SDK對於 JavaScript (3)
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

const startCrawler = (name) => { const client = new GlueClient({}); const command = new StartCrawlerCommand({ Name: name, }); return client.send(command); };
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for JavaScript API參考

Kotlin
SDK對於科特林
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

suspend fun startSpecificCrawler(crawlerName: String?) { val request = StartCrawlerRequest { name = crawlerName } GlueClient { region = "us-west-2" }.use { glueClient -> glueClient.startCrawler(request) println("$crawlerName was successfully started.") } }
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK為科特林API參考。

PHP
適用於 PHP 的 SDK
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

$crawlerName = "example-crawler-test-" . $uniqid; $databaseName = "doc-example-database-$uniqid"; $glueService->startCrawler($crawlerName); public function startCrawler($crawlerName): Result { return $this->glueClient->startCrawler([ 'Name' => $crawlerName, ]); }
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for PHP API參考

Python
SDK對於 Python(肉毒桿菌 3)
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

class GlueWrapper: """Encapsulates AWS Glue actions.""" def __init__(self, glue_client): """ :param glue_client: A Boto3 Glue client. """ self.glue_client = glue_client def start_crawler(self, name): """ Starts a crawler. The crawler crawls its configured target and creates metadata that describes the data it finds in the target data source. :param name: The name of the crawler to start. """ try: self.glue_client.start_crawler(Name=name) except ClientError as err: logger.error( "Couldn't start crawler %s. Here's why: %s: %s", name, err.response["Error"]["Code"], err.response["Error"]["Message"], ) raise
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK對於 Python(肉毒桿 3)API參考。

Ruby
SDK對於紅寶石
注意

還有更多關於 GitHub。尋找完整範例,並瞭解如何在 AWS 代碼示例存儲庫

# The `GlueWrapper` class serves as a wrapper around the AWS Glue API, providing a simplified interface for common operations. # It encapsulates the functionality of the AWS SDK for Glue and provides methods for interacting with Glue crawlers, databases, tables, jobs, and S3 resources. # The class initializes with a Glue client and a logger, allowing it to make API calls and log any errors or informational messages. class GlueWrapper def initialize(glue_client, logger) @glue_client = glue_client @logger = logger end # Starts a crawler with the specified name. # # @param name [String] The name of the crawler to start. # @return [void] def start_crawler(name) @glue_client.start_crawler(name: name) rescue Aws::Glue::Errors::ServiceError => e @logger.error("Glue could not start crawler #{name}: \n#{e.message}") raise end
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDK for Ruby API參考

Rust
SDK對於銹
注意

還有更多關於 GitHub。尋找完整的範例,並瞭解如何設定和執行 AWS 代碼示例存儲庫

let start_crawler = glue.start_crawler().name(self.crawler()).send().await; match start_crawler { Ok(_) => Ok(()), Err(err) => { let glue_err: aws_sdk_glue::Error = err.into(); match glue_err { aws_sdk_glue::Error::CrawlerRunningException(_) => Ok(()), _ => Err(GlueMvpError::GlueSdk(glue_err)), } } }?;
  • 有API關詳細資訊,請參閱 StartCrawlerAWS SDKAPI供鐵鏽參考

有關的完整列表 AWS SDK開發人員指南和代碼示例,請參閱使用此服務 AWS SDK。本主題也包含有關入門的資訊以及舊SDK版的詳細資訊。