Creating an Amazon ML Amazon Redshift IAM Role - Amazon Machine Learning

We are no longer updating the Amazon Machine Learning service or accepting new users for it. This documentation is available for existing users, but we are no longer updating it. For more information, see What is Amazon Machine Learning.

Creating an Amazon ML Amazon Redshift IAM Role

Before you can create a datasource with Amazon Redshift data, you must set up IAM permissions that allow Amazon ML to export data from Amazon Redshift.

Amazon ML needs inbound access to your Amazon Redshift cluster to establish a connection. Amazon Redshift cluster security groups govern inbound access to Amazon Redshift clusters. (For details, see Amazon Redshift Cluster Security Groups in the Amazon Redshift Cluster Management Guide.) To let Amazon ML create these security groups and grant itself inbound access, you need to create an IAM role with Amazon Redshift cluster permissions and provide the name of that role to Amazon ML. Amazon ML uses this role to configure access to your cluster from the list of IP addresses (CIDRs) associated with Amazon ML. Amazon ML can use the same role later to automatically update that list of CIDRs, with no action required on your part.
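Conceptually, a security group ingress rule admits a connection only when the source address falls inside one of the authorized CIDR ranges. The following Python sketch illustrates that membership check using only the standard library. The function name `is_allowed` and the ranges shown are hypothetical: the CIDRs are reserved documentation blocks, not the actual addresses associated with Amazon ML.

```python
import ipaddress

def is_allowed(source_ip, allowed_cidrs):
    """Return True if source_ip falls inside any of the authorized CIDR ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)

# Hypothetical ranges taken from reserved documentation blocks (RFC 5737).
ALLOWED = ["192.0.2.0/24", "198.51.100.0/25"]

print(is_allowed("192.0.2.17", ALLOWED))   # True: inside 192.0.2.0/24
print(is_allowed("203.0.113.5", ALLOWED))  # False: matches no range
```

This is only a model of the rule evaluation. The security group itself is created and maintained by Amazon ML through the role described in this section.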

When Amazon ML executes the Amazon Redshift query to retrieve your data, it places the results in an intermediate Amazon S3 location. By configuring your IAM role with permissions to create and retrieve Amazon S3 objects and modify bucket policies, you can eliminate the need to configure and manage these permissions every time you create a datasource. To do this, you need to grant the following permissions to Amazon ML:

  • s3:PutObject: Grants Amazon ML the permissions to write the data from your Amazon Redshift query to the Amazon S3 location

  • s3:ListBucket and s3:GetObject: Grant Amazon ML the permissions to access and read the results from the Amazon S3 bucket to create a datasource

  • s3:PutObjectAcl: Enables Amazon ML to give the bucket owner full control of the data stored in Amazon S3
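As a rough illustration, the four permissions above could be expressed as IAM policy statements scoped to a single staging location. The Python sketch below only builds and prints the JSON document; it does not call AWS. The bucket name `example-staging-bucket`, the prefix `redshift-export/`, and the helper name `s3_staging_statements` are hypothetical and are not taken from the Amazon ML documentation.

```python
import json

def s3_staging_statements(bucket, prefix):
    """Build IAM statements for the S3 actions listed above.

    Object-level actions apply to object ARNs, while s3:ListBucket applies
    to the bucket ARN itself, so the two are kept in separate statements.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:PutObjectAcl"],
            "Resource": [f"{bucket_arn}/{prefix}*"],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [bucket_arn],
        },
    ]

# Hypothetical staging location, used for illustration only.
policy = {
    "Version": "2012-10-17",
    "Statement": s3_staging_statements("example-staging-bucket", "redshift-export/"),
}
print(json.dumps(policy, indent=2))
```

Scoping the statements to one bucket and prefix is a design choice of this sketch; a policy can instead use a broader resource if that suits your setup.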

This section walks you through creating three IAM policies. The first is a permissions policy for the role that Amazon ML uses to access Amazon Redshift and your Amazon S3 location. The second is a trust policy that allows Amazon ML to assume that role. The third is a permissions policy that allows an IAM user to pass the role to Amazon ML; you must attach this third policy to your IAM user.

Creating the Amazon ML IAM Role for Amazon Redshift

To provide Amazon ML access to Amazon Redshift and Amazon S3, create an IAM role and a trust policy that allows Amazon ML to use the role.

An IAM role has two parts: a permissions policy (or policies) that states the permissions given to the role, and a trust policy that states who can assume the role. You can use a managed policy that sets the required permissions and trust policies for you, or you can create and configure the Amazon ML role for Amazon Redshift yourself.

To use the managed Amazon ML Amazon Redshift policy to set permissions and trust policies

  1. Sign in to the AWS Management Console and open the IAM console.

  2. In the navigation pane, choose Roles.

  3. In the navigation bar, choose Create New Role.

  4. For Role Name, type the name for your Amazon ML Amazon Redshift role, and then choose Next Step.

  5. For Select Role Type, choose AWS Service Roles.

  6. Scroll through the AWS Service Roles until you find Amazon Machine Learning Role for Redshift Data Source. Next to Amazon Machine Learning Role for Redshift Data Source, choose Select.

  7. On the Attach Policy page, choose AmazonMachineLearningRoleforRedshiftDataSource, and then choose Next Step.

  8. Choose Create Role.

This is the Amazon Resource Name (ARN) for this policy:


To create the Amazon ML role

  1. Use the following permissions policy to create a role for Amazon ML in the IAM console. The policy gives the role permission to create and modify Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Redshift security groups, and grants read and write access to the Amazon S3 location where data exported from Amazon Redshift is stored.

    {
      "Version": "2012-10-17",
      "Statement": [{
        "Action": [
          "ec2:DescribeSecurityGroups",
          "ec2:AuthorizeSecurityGroupIngress",
          "ec2:CreateSecurityGroup",
          "ec2:DescribeInternetGateways",
          "ec2:RevokeSecurityGroupIngress",
          "s3:GetObject",
          "s3:GetBucketLocation",
          "s3:GetBucketPolicy",
          "s3:PutBucketPolicy",
          "s3:PutObject",
          "redshift:CreateClusterSecurityGroup",
          "redshift:AuthorizeClusterSecurityGroupIngress",
          "redshift:RevokeClusterSecurityGroupIngress",
          "redshift:DescribeClusterSecurityGroups",
          "redshift:DescribeClusters",
          "redshift:ModifyCluster"
        ],
        "Effect": "Allow",
        "Resource": [ "*" ]
      }]
    }

    For more information about creating a role for Amazon ML, see Creating a Role to Delegate Permissions to an AWS Service in the IAM User Guide.

  2. After you create a role for Amazon ML, create a trust policy for the role, and attach the trust policy to the role. The trust policy controls who is allowed to assume the role. The following trust policy allows Amazon ML to assume the role that is defined in the preceding example.

    {
      "Version": "2008-10-17",
      "Statement": [{
        "Sid": "",
        "Effect": "Allow",
        "Principal": { "Service": "" },
        "Action": "sts:AssumeRole"
      }]
    }

    For more information about adding a trust policy to a role, see Modifying a Role in the IAM User Guide.

  3. To allow your IAM user to pass the role that you just created to Amazon ML, you must attach a permissions policy with the iam:PassRole permission to your IAM user. If necessary, you can restrict your IAM user to passing specific roles by replacing the * in the Resource field with the ARN of the role that you want to pass. To pass a role to Amazon ML, attach the following policy to your IAM user.

    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": [ "iam:PassRole" ],
        "Resource": [ "*" ]
      }]
    }
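The restriction described in step 3, replacing the * in the Resource field with a specific role ARN, can be sketched as a small Python helper that builds the policy document. The helper name `pass_role_policy`, the account ID `123456789012`, and the role name `ExampleAmazonMLRedshiftRole` are hypothetical placeholders, not values from this guide.

```python
import json

def pass_role_policy(role_arn="*"):
    """Build an iam:PassRole policy.

    With the default "*", the user can pass any role. Supplying a role ARN
    limits the user to passing only that role.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["iam:PassRole"],
            "Resource": [role_arn],
        }],
    }

# Hypothetical account ID and role name, used for illustration only.
ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleAmazonMLRedshiftRole"
print(json.dumps(pass_role_policy(ROLE_ARN), indent=2))
```

Restricting the resource to one role ARN is generally the narrower choice, because the user cannot then pass unrelated roles to the service.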

After you have created and configured the role, the trust policy, and your IAM user, you can use the Amazon ML console to create a datasource from Amazon Redshift data.
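If you prefer to script the manual steps rather than use the IAM console, the sequence might look like the following Python sketch. It assumes an IAM client object with the `create_role`, `put_role_policy`, and `put_user_policy` methods provided by the AWS SDK for Python (boto3). The function names, the generated policy names, and all argument values are hypothetical. The trust policy in this guide leaves the service principal blank, so the sketch takes it as a required argument instead of assuming a value. Note that the IAM API expects the trust policy when the role is created, so the trust policy is supplied first here.

```python
import json

def trust_policy(service_principal):
    """Trust policy that lets the given service principal assume the role."""
    return {
        "Version": "2008-10-17",
        "Statement": [{
            "Sid": "",
            "Effect": "Allow",
            "Principal": {"Service": service_principal},
            "Action": "sts:AssumeRole",
        }],
    }

def create_role_and_grant_pass(iam, role_name, user_name,
                               permissions_policy, service_principal):
    """Create the role, attach its permissions, and let the user pass it.

    `iam` is assumed to be an IAM client such as boto3.client("iam").
    Returns the ARN of the new role.
    """
    # Create the role together with its trust policy.
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy(service_principal)),
    )
    role_arn = role["Role"]["Arn"]

    # Attach the permissions policy to the role as an inline policy.
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=f"{role_name}-permissions",  # hypothetical naming scheme
        PolicyDocument=json.dumps(permissions_policy),
    )

    # Allow the IAM user to pass this specific role.
    iam.put_user_policy(
        UserName=user_name,
        PolicyName=f"{role_name}-pass-role",  # hypothetical naming scheme
        PolicyDocument=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": [role_arn],
            }],
        }),
    )
    return role_arn
```

To try it, you could pass `boto3.client("iam")` as `iam`, the permissions policy from step 1 as a Python dictionary, and the Amazon ML service principal for `service_principal`. The call requires credentials that are allowed to create roles and attach user policies.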