MLCOST-08: Enable feature reusability

Reduce duplication and the rerunning of feature engineering code across teams and projects by using a feature store. The store should have online and offline storage, and data encryption capabilities. An online store with low-latency retrieval capabilities is ideal for real-time inference. An offline store maintains a history of feature values and is suited for training and batch scoring.

Implementation plan

Use Amazon SageMaker AI Feature Store - Amazon SageMaker AI Feature Store is a fully managed, purpose-built repository to store, update, retrieve, and share ML features. Feature Store makes it easy for data scientists, machine learning engineers, and general practitioners to create, share, and manage features for ML development. The online store is used for low-latency, real-time inference use cases. The offline store is used for training and batch inference. Feature Store reduces the repetitive data processing and curation work required to convert raw data into features for training an ML algorithm.
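As a minimal sketch of the store and retrieve steps, the helpers below convert a plain dictionary of feature values to and from the name/value list format that the Feature Store runtime API exchanges. The helper names (`to_record`, `from_record`) and the sample feature names are hypothetical and chosen only for illustration. The boto3 calls appear only in comments so that the block runs without AWS credentials.

```python
from typing import Any, Dict, List


def to_record(features: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert a feature dict into the list-of-pairs payload used by PutRecord.

    Features whose value is None are skipped, because every entry in the
    payload carries its value as a string.
    """
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in features.items()
        if value is not None
    ]


def from_record(record: List[Dict[str, str]]) -> Dict[str, str]:
    """Convert the 'Record' list of a GetRecord response back into a dict."""
    return {item["FeatureName"]: item["ValueAsString"] for item in record}


# Hypothetical feature values for one customer.
row = {
    "customer_id": "c-001",
    "event_time": "2024-01-01T00:00:00Z",
    "avg_basket_value": 42.5,
    "loyalty_tier": None,  # missing value, omitted from the payload
}
payload = to_record(row)

# With AWS credentials and an existing feature group (name assumed here):
#   import boto3
#   runtime = boto3.client("sagemaker-featurestore-runtime")
#   runtime.put_record(FeatureGroupName="customers", Record=payload)
#   response = runtime.get_record(
#       FeatureGroupName="customers",
#       RecordIdentifierValueAsString="c-001",
#   )
#   features = from_record(response["Record"])
```

Because the features are written once and read by name, other teams can reuse them without rerunning the original feature engineering code.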

You can use Feature Store in the following modes:

Online - Features are read with low latency (milliseconds) and used for high-throughput predictions.
Offline - Large streams of data are fed to an offline store, which is used for training and batch inference. This mode requires a feature group to be stored in an offline store. The offline store uses your S3 bucket for storage, and its data can also be fetched using Amazon Athena queries.
Online and offline - This includes both online and offline modes.
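The three modes map onto which store configurations are supplied when a feature group is created. The sketch below builds the request for the SageMaker `CreateFeatureGroup` API for each mode, with optional KMS encryption. The function name `build_feature_group_request`, the mode strings, the feature names, and the ARNs and bucket in the usage lines are all assumptions made for illustration. The request is only constructed, not sent, so the block runs without AWS access.

```python
from typing import Any, Dict, Optional

VALID_MODES = {"online", "offline", "online_and_offline"}


def build_feature_group_request(
    name: str,
    mode: str,
    role_arn: str,
    s3_uri: Optional[str] = None,
    kms_key_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker create_feature_group call."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    request: Dict[str, Any] = {
        "FeatureGroupName": name,
        # Hypothetical schema: one identifier, one event time, one feature.
        "RecordIdentifierFeatureName": "customer_id",
        "EventTimeFeatureName": "event_time",
        "FeatureDefinitions": [
            {"FeatureName": "customer_id", "FeatureType": "String"},
            {"FeatureName": "event_time", "FeatureType": "String"},
            {"FeatureName": "avg_basket_value", "FeatureType": "Fractional"},
        ],
        "RoleArn": role_arn,
    }

    if mode in ("online", "online_and_offline"):
        online: Dict[str, Any] = {"EnableOnlineStore": True}
        if kms_key_id:
            online["SecurityConfig"] = {"KmsKeyId": kms_key_id}
        request["OnlineStoreConfig"] = online

    if mode in ("offline", "online_and_offline"):
        if not s3_uri:
            raise ValueError("the offline store needs an S3 URI")
        s3_config: Dict[str, Any] = {"S3Uri": s3_uri}
        if kms_key_id:
            s3_config["KmsKeyId"] = kms_key_id
        request["OfflineStoreConfig"] = {"S3StorageConfig": s3_config}

    return request


# Placeholder ARN and bucket; replace with real values before use.
ROLE = "arn:aws:iam::111122223333:role/ExampleFeatureStoreRole"
online_only = build_feature_group_request("customers", "online", ROLE)
both = build_feature_group_request(
    "customers",
    "online_and_offline",
    ROLE,
    s3_uri="s3://example-bucket/feature-store",
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/example",
)

# To create the group with AWS credentials:
#   import boto3
#   boto3.client("sagemaker").create_feature_group(**both)
```

Choosing online-only avoids offline storage when no training history is needed, while offline-only avoids the cost of a low-latency store for workloads that only train or score in batches.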

Documents

Create, Store, and Share Features with Amazon SageMaker AI Feature Store

Videos

Amazon SageMaker AI Feature Store Deep Dive Demo
