Security in SageMaker smart sifting

Focus mode

Security in SageMaker smart sifting - Amazon SageMaker AI

Because the SageMaker smart sifting library runs processes of removing less valuable training samples, it requires full access to training datasets as they are produced by the data loader. This access is not different than the access already provided to PyTorch in normal training scenario.

SageMaker smart sifting has built-in logging with security implications. By default, SageMaker smart sifting logs are only application-level logs containing metrics, latencies, and user errors or warnings. Users can, however, choose to enable verbose logs, which log full batch data to show which samples were removed from a given batch. These logs are emitted using Python loggers and are not uploaded or stored anywhere by the library. In the case of automatic log uploading to CloudWatch or similar services, please note that using verbose logs may result in sensitive training data being uploaded off of the training instance.

Beyond the aforementioned logging, SageMaker smart sifting does not have any network functionality nor does it interact with the local file system. User data is stored as in-memory objects for the entirety of the time it is used by the library.

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Troubleshooting

SageMaker smart sifting Python SDK reference

Select your cookie preferences

Customize cookie preferences

Essential

Performance

Functional

Advertising

Unable to save cookie preferences

Security in SageMaker smart sifting

Related resources

Did this page help you?

Related resources

Next topic:

Previous topic:

Need help?