Monitor individual user resource access from SageMaker Studio Classic with sourceIdentity

With Amazon SageMaker Studio Classic, you can monitor user resource access. To view resource access activity, you can configure AWS CloudTrail to monitor and record user activities by following the steps in Log Amazon SageMaker API Calls with AWS CloudTrail.

However, the AWS CloudTrail logs for resource access list only the Studio Classic execution IAM role as the identifier. This level of logging is enough to audit user activity when each user profile has a distinct execution role. When a single execution IAM role is shared between several user profiles, the logs cannot tell you which specific user accessed the AWS resources.

When user profiles share an execution role, you can use the sourceIdentity configuration to propagate the Studio Classic user profile name, so that the AWS CloudTrail logs show which specific user performed an action. For more information about source identity, see Monitor and control actions taken with assumed roles. To turn sourceIdentity on or off for your CloudTrail logs, see Turn on sourceIdentity in CloudTrail logs for SageMaker Studio Classic.
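
As an illustration of how the propagated value might be used during an audit, the following is a minimal Python sketch that groups CloudTrail records by sourceIdentity. It assumes the records have already been exported and parsed into dictionaries, and that the value appears under userIdentity.sessionContext.sourceIdentity, as it does for assumed-role sessions. The function names, the account ID, the role name, and the sample records are hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional

# Placeholder label for records that carry no sourceIdentity (hypothetical).
NO_SOURCE_IDENTITY = "<no sourceIdentity>"


def get_source_identity(record: Dict[str, Any]) -> Optional[str]:
    """Return the sourceIdentity of one CloudTrail record, or None if absent."""
    user_identity = record.get("userIdentity") or {}
    session_context = user_identity.get("sessionContext") or {}
    return session_context.get("sourceIdentity")


def group_events_by_user(records: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each sourceIdentity (user profile name) to the event names it produced."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        user = get_source_identity(record) or NO_SOURCE_IDENTITY
        grouped[user].append(record.get("eventName", "<unknown>"))
    return dict(grouped)


# Hypothetical, heavily trimmed records: every call uses the same shared
# execution role, so only sourceIdentity distinguishes the users.
SHARED_ROLE_ARN = "arn:aws:sts::111122223333:assumed-role/SharedStudioRole/SageMaker"

sample_records = [
    {
        "eventName": "GetObject",
        "userIdentity": {
            "arn": SHARED_ROLE_ARN,
            "sessionContext": {"sourceIdentity": "user-profile-a"},
        },
    },
    {
        "eventName": "PutObject",
        "userIdentity": {
            "arn": SHARED_ROLE_ARN,
            "sessionContext": {"sourceIdentity": "user-profile-b"},
        },
    },
    {
        # For example, a call from a session that did not carry sourceIdentity.
        "eventName": "ListBuckets",
        "userIdentity": {"arn": SHARED_ROLE_ARN, "sessionContext": {}},
    },
]

events_by_user = group_events_by_user(sample_records)
for user, event_names in sorted(events_by_user.items()):
    print(user, event_names)
```

Records that fall into the placeholder group are a useful signal in their own right: they may correspond to the cases described in the considerations that follow, where sourceIdentity is not propagated.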

Considerations when using sourceIdentity

When you make AWS API calls from Studio Classic notebooks, SageMaker Canvas, or Amazon SageMaker Data Wrangler, the sourceIdentity is recorded in CloudTrail only if those calls are made using the Studio Classic execution role session or a role chained from that session.

When these API calls invoke other services to perform additional operations, sourceIdentity logging depends on the specific implementation of the invoked services.

  • Amazon SageMaker Processing: When you create a processing job, the job creation APIs cannot ingest the sourceIdentity that exists in the session. As a result, any AWS API calls made from these jobs do not record sourceIdentity in the CloudTrail logs.

  • Amazon SageMaker Training: When you create a training job, the job creation APIs are able to ingest the sourceIdentity that exists in the session. As a result, any AWS API calls made from these jobs record sourceIdentity in the CloudTrail logs.

  • Amazon SageMaker Pipelines: When you create jobs using automated CI/CD pipelines, sourceIdentity propagates downstream and can be viewed in the CloudTrail logs.

  • Amazon EMR: When connecting to Amazon EMR from Studio Classic using runtime roles, administrators must explicitly set the PropagateSourceIdentity field. This ensures that Amazon EMR applies the sourceIdentity from the calling credentials to a job or query session. The sourceIdentity is then recorded in CloudTrail logs.
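
For the Amazon EMR case, the following Python sketch shows one way an administrator might build a security configuration document with PropagateSourceIdentity set. Only the PropagateSourceIdentity field name comes from this page. The surrounding JSON layout (AuthorizationConfiguration, IAMConfiguration, EnableApplicationScopedIAMRole, ApplicationScopedIAMRoleConfiguration) is an assumption based on the Amazon EMR runtime role security configuration format and should be verified against the Amazon EMR documentation. The function name is hypothetical.

```python
import json
from typing import Any, Dict


def build_runtime_role_security_configuration(
    propagate_source_identity: bool = True,
) -> Dict[str, Any]:
    """Build an EMR security configuration that enables runtime roles.

    Assumption: the key layout below follows the Amazon EMR runtime role
    security configuration format; confirm it in the EMR documentation.
    """
    return {
        "AuthorizationConfiguration": {
            "IAMConfiguration": {
                # Runtime roles must be enabled for the nested setting to apply.
                "EnableApplicationScopedIAMRole": True,
                "ApplicationScopedIAMRoleConfiguration": {
                    # Apply the caller's sourceIdentity to the job or query session.
                    "PropagateSourceIdentity": propagate_source_identity,
                },
            }
        }
    }


security_configuration = build_runtime_role_security_configuration()
security_configuration_json = json.dumps(security_configuration, indent=2)
print(security_configuration_json)
```

The resulting JSON string could then be supplied when creating the security configuration, for example through the SecurityConfiguration parameter of the boto3 EMR client's create_security_configuration call, and the security configuration attached to the cluster that Studio Classic connects to.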

Note

The following exceptions apply when using sourceIdentity.

  • SageMaker Studio Classic shared spaces do not support sourceIdentity passthrough. AWS API calls made from SageMaker shared spaces do not record sourceIdentity in CloudTrail logs.

  • If AWS API calls are made from sessions that are created by users or other services and the sessions are not based on the Studio Classic execution role session, then the sourceIdentity is not recorded in CloudTrail logs.