
Authenticating with the Amazon Redshift integration for Apache Spark

Using AWS Secrets Manager to retrieve credentials and connect to Amazon Redshift

The following code sample shows how you can use AWS Secrets Manager to retrieve credentials and connect to an Amazon Redshift cluster with PySpark, the Python interface for Apache Spark.


import json

import boto3
from pyspark import SparkContext
from pyspark.sql import SQLContext

# Reuse the existing SparkContext, or create one if none is running
sc = SparkContext.getOrCreate()
sql_context = SQLContext(sc)

# Replace the 'string' placeholders with the values for your secret.
# VersionId and VersionStage are optional.
secretsmanager_client = boto3.client('secretsmanager')
secret_manager_response = secretsmanager_client.get_secret_value(
    SecretId='string',
    VersionId='string',
    VersionStage='string'
)

# Assumption: the secret is a JSON string with "username" and "password" keys
secret = json.loads(secret_manager_response['SecretString'])
username = secret['username']
password = secret['password']
url = "jdbc:redshift://redshifthost:5439/database?user=" + username + "&password=" + password

# Read data from a table
df = sql_context.read \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", url) \
    .option("dbtable", "my_table") \
    .option("tempdir", "s3://path/for/temp/data") \
    .load()
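
The step that turns a Secrets Manager response into a JDBC URL can be exercised without AWS access. The following is a minimal sketch of that step, not part of the connector or of boto3. It assumes the secret stores a JSON document with `username` and `password` keys; a custom secret may use different keys. The helper names `parse_redshift_secret` and `build_jdbc_url`, and the sample credentials, are hypothetical.

```python
import json


def parse_redshift_secret(secret_manager_response):
    """Return (username, password) from a get_secret_value response.

    Assumes SecretString holds JSON with "username" and "password" keys.
    """
    secret = json.loads(secret_manager_response["SecretString"])
    return secret["username"], secret["password"]


def build_jdbc_url(host, port, database, username, password):
    """Build a JDBC URL with the same shape as the sample above."""
    return (
        "jdbc:redshift://" + host + ":" + str(port) + "/" + database
        + "?user=" + username + "&password=" + password
    )


# Stand-in for the dict returned by get_secret_value (hypothetical values)
example_response = {
    "SecretString": json.dumps(
        {"username": "awsuser", "password": "example-password"}
    )
}

username, password = parse_redshift_secret(example_response)
url = build_jdbc_url("redshifthost", 5439, "database", username, password)
print(url)
```

Because the credentials are concatenated into the query string, a password containing characters such as `&` or `?` would make the URL ambiguous. In that case, check the connector documentation for a way to pass the user name and password separately from the URL.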

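Using IAM to retrieve credentials and connect to Amazon Redshift

You can use the JDBC version 2 driver provided by Amazon Redshift to connect to Amazon Redshift with the Spark connector. To use AWS Identity and Access Management (IAM), configure your JDBC URL to use IAM authentication. To connect to a Redshift cluster from Amazon EMR, give your IAM role permission to retrieve temporary IAM credentials. Assign the following permissions to your IAM role so that it can retrieve credentials and run Amazon S3 operations.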

Redshift:GetClusterCredentials (for provisioned Amazon Redshift clusters)
Redshift:DescribeClusters (for provisioned Amazon Redshift clusters)
Redshift:GetWorkgroup (for Amazon Redshift Serverless workgroups)
Redshift:GetCredentials (for Amazon Redshift Serverless workgroups)
s3:GetBucket
s3:GetBucketLocation
s3:GetObject
s3:PutObject
s3:GetBucketLifecycleConfiguration

For more information about GetClusterCredentials, see Resource policies for GetClusterCredentials.
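
As an illustration, the permissions listed above could be assembled into an IAM policy document along the following lines. This is a sketch under stated assumptions, not an official policy. The action names are copied verbatim from the list and are not validated here, so confirm the exact names and service prefixes against the IAM action reference before use. The function name `build_permissions_policy` is hypothetical, and `"Resource": "*"` is a placeholder that should be narrowed to specific clusters, workgroups, and buckets.

```python
import json

# Action names copied verbatim from the list above (not validated here)
PROVISIONED_ACTIONS = [
    "Redshift:GetClusterCredentials",
    "Redshift:DescribeClusters",
]
SERVERLESS_ACTIONS = [
    "Redshift:GetWorkgroup",
    "Redshift:GetCredentials",
]
S3_ACTIONS = [
    "s3:GetBucket",
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:PutObject",
    "s3:GetBucketLifecycleConfiguration",
]


def build_permissions_policy(serverless=False):
    """Return a policy dict for a provisioned cluster or a serverless workgroup."""
    redshift_actions = SERVERLESS_ACTIONS if serverless else PROVISIONED_ACTIONS
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": redshift_actions + S3_ACTIONS,
                # Placeholder: restrict to specific ARNs in real use
                "Resource": "*",
            }
        ],
    }


policy = build_permissions_policy(serverless=False)
print(json.dumps(policy, indent=4))
```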

You must also make sure that Amazon Redshift can assume the IAM role during COPY and UNLOAD operations. The following trust policy allows this.


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
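
A quick way to check that a role's trust policy has the shape shown above is to inspect the document in code. The following is a minimal sketch; the function name `allows_redshift_assume_role` is hypothetical, and the check only covers the `Effect`, `Principal.Service`, and `Action` fields. It does not evaluate conditions or explicit denies the way IAM does.

```python
def allows_redshift_assume_role(trust_policy):
    """Return True if any Allow statement lets redshift.amazonaws.com assume the role."""
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue

        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            # For example "Principal": "*", which this sketch does not handle
            continue

        # Both fields may be a single string or a list of strings
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]

        if "redshift.amazonaws.com" in services and "sts:AssumeRole" in actions:
            return True
    return False


# The trust policy from this section, written as a Python dict
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "redshift.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(allows_redshift_assume_role(trust_policy))
```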

The following example uses IAM authentication between Spark and Amazon Redshift.


from pyspark import SparkContext
from pyspark.sql import SQLContext

# Reuse the existing SparkContext, or create one if none is running
sc = SparkContext.getOrCreate()
sql_context = SQLContext(sc)

# The "iam" subprotocol tells the JDBC driver to use IAM authentication
url = "jdbc:redshift:iam://redshift-host:redshift-port/db-name"
# Role that Amazon Redshift assumes for COPY and UNLOAD operations
iam_role_arn = "arn:aws:iam::account-id:role/role-name"

# Read data from a table
df = sql_context.read \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", url) \
    .option("aws_iam_role", iam_role_arn) \
    .option("dbtable", "my_table") \
    .option("tempdir", "s3a://path/for/temp/data") \
    .load()
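
The IAM form of the JDBC URL carries no user name or password and uses the `jdbc:redshift:iam://` prefix. The following is a small sketch of a helper that builds such a URL from its parts. The function name `build_iam_jdbc_url` and the endpoint in the usage line are hypothetical, and the validation is deliberately basic.

```python
def build_iam_jdbc_url(host, port, database):
    """Return a JDBC URL that asks the Redshift driver to use IAM authentication."""
    if not host or not database:
        raise ValueError("host and database must be non-empty")
    if not 1 <= int(port) <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return "jdbc:redshift:iam://{}:{}/{}".format(host, int(port), database)


# Hypothetical endpoint, for illustration only
url = build_iam_jdbc_url(
    "example-cluster.abc123.us-east-1.redshift.amazonaws.com", 5439, "dev"
)
print(url)
```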
