We are no longer updating the Amazon Machine Learning service or accepting new users for it. This documentation is available for existing users, but we are no longer updating it. For more information, see What is Amazon Machine Learning.
Requesting Real-time Predictions
A real-time prediction is a synchronous call to Amazon Machine Learning (Amazon ML).
The prediction is made when Amazon ML gets the request, and the response is returned immediately.
Real-time predictions are commonly used to enable predictive capabilities within interactive
web, mobile, or desktop applications. You can query an ML model created with Amazon ML
for predictions in real time by using the low-latency Predict API. The Predict operation accepts a single input observation in the request payload and returns the prediction synchronously in the response. This sets it apart from the batch
prediction API, which is invoked with the ID of an Amazon ML datasource object that points to the
location of the input observations, and asynchronously returns a URI to a file that contains
predictions for all these observations. Amazon ML responds to most real-time prediction requests
within 100 milliseconds.
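The single-observation request/response pattern described above can be sketched with boto3, the AWS SDK for Python. The model ID, endpoint URL, and record keys below are placeholders, not values from this guide; the live call requires AWS credentials.

```python
def build_predict_request(ml_model_id, predict_endpoint, record):
    """Assemble the payload for a single Predict call: one observation per request."""
    return {
        "MLModelId": ml_model_id,
        # Record values must be strings, whatever their type in the training data
        "Record": {key: str(value) for key, value in record.items()},
        "PredictEndpoint": predict_endpoint,
    }

def predict_one(ml_model_id, predict_endpoint, record):
    """Send one observation to the real-time Predict API.

    Requires the boto3 package and configured AWS credentials; substitute
    your own model ID and endpoint URL.
    """
    import boto3  # AWS SDK for Python
    client = boto3.client("machinelearning")
    response = client.predict(**build_predict_request(ml_model_id, predict_endpoint, record))
    return response["Prediction"]

payload = build_predict_request(
    "ml-exampleModelId",     # hypothetical model ID
    "https://endpointUrl",   # placeholder endpoint URL
    {"key1": "value1", "key2": 42},
)
print(payload["Record"])  # {'key1': 'value1', 'key2': '42'}
```

Because Predict is synchronous, the response arrives in the same call; there is no datasource object or output URI to manage, unlike batch prediction.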
You can try real-time predictions without incurring charges in the Amazon ML console. If you
then decide to use real-time predictions, you must first create an endpoint for
real-time prediction generation. You can do this in the Amazon ML console or by using the CreateRealtimeEndpoint API. After you have an endpoint, use the real-time prediction API to generate real-time predictions.
Note
After you create a real-time endpoint for your model, you will start incurring a capacity reservation charge that is based on the model's size. For more information, see Pricing. To stop the charge when you no longer need the endpoint, delete it in the console or with the DeleteRealtimeEndpoint operation.
For examples of Predict requests and responses, see Predict in the Amazon Machine Learning API Reference.
To see an example of the exact response format that uses your model, see Trying Real-Time Predictions.
Topics
- Trying Real-Time Predictions
- Creating a Real-Time Endpoint
- Locating the Real-time Prediction Endpoint (Console)
- Locating the Real-time Prediction Endpoint (API)
- Creating a Real-time Prediction Request
- Deleting a Real-Time Endpoint

Trying Real-Time Predictions
To help you decide whether to enable real-time prediction, Amazon ML allows you to try generating predictions on single data records without incurring the additional charges associated with setting up a real-time prediction endpoint. To try real-time prediction, you must have an ML model. To create real-time predictions on a larger scale, use the Predict API in the Amazon Machine Learning API Reference.
To try real-time predictions
- Sign in to the AWS Management Console and open the Amazon Machine Learning console at https://console.aws.amazon.com/machinelearning/.
- In the navigation bar, in the Amazon Machine Learning drop down, choose ML models.
- Choose the model that you want to use to try real-time predictions, such as the Subscription propensity model from the tutorial.
- On the ML model report page, under Predictions, choose Summary, and then choose Try real-time predictions.
Amazon ML shows a list of the variables that made up the data records that Amazon ML used to train your model.
- You can proceed by entering data in each of the fields in the form or by pasting a single data record, in CSV format, into the text box.
To use the form, for each Value field, enter the data that you want to use to test your real-time predictions. If the data record that you are entering does not contain values for one or more data attributes, leave those entry fields blank.
To provide a data record, choose Paste a record. Paste a single CSV-formatted row of data into the text field, and choose Submit. Amazon ML auto-populates the Value fields for you.
Note
The data in the data record must have the same number of columns as the training data, and be arranged in the same order. The only exception is that you should omit the target value. If you include a target value, Amazon ML ignores it.
- At the bottom of the page, choose Create prediction. Amazon ML returns the prediction immediately.
In the Prediction results pane, you see the prediction object that the Predict API call returns, along with the ML model type, the name of the target variable, and the predicted class or value. For information about interpreting the results, see Interpreting the Contents of Batch Prediction Files for a Binary Classification ML Model.
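The paste-a-record step above pairs a CSV row with the training schema's attribute names. A small sketch of that mapping, using a hypothetical schema (the attribute names below are illustrative, not from this guide):

```python
import csv
import io

def csv_row_to_record(csv_line, attribute_names, target_name=None):
    """Map one CSV-formatted row onto attribute names, as the console form does.

    Blank fields are omitted (unknown values), and the target variable, if
    present, is dropped -- Amazon ML ignores the target in prediction requests.
    """
    values = next(csv.reader(io.StringIO(csv_line)))
    record = {}
    for name, value in zip(attribute_names, values):
        if name == target_name or value == "":
            continue
        record[name] = value
    return record

# Hypothetical schema for a subscription-propensity data set;
# "balance" is blank and "subscribed" is the target, so both are omitted.
record = csv_row_to_record(
    "35,engineer,,yes",
    ["age", "job", "balance", "subscribed"],
    target_name="subscribed",
)
print(record)  # {'age': '35', 'job': 'engineer'}
```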
Creating a Real-Time Endpoint
To generate real-time predictions, you need to create a real-time endpoint.
To create a real-time endpoint, you must already have an ML model for which you want to generate real-time predictions. You can create a real-time endpoint by using the Amazon ML console or by calling the CreateRealtimeEndpoint API. For more information on using the CreateRealtimeEndpoint API, see https://docs.aws.amazon.com/machine-learning/latest/APIReference/API_CreateRealtimeEndpoint.html in the Amazon Machine Learning API Reference.
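A minimal sketch of the API route, using boto3 (the live call requires AWS credentials and a real model ID; the ID below is a placeholder):

```python
def endpoint_is_ready(endpoint_info):
    """True once the endpoint can serve Predict requests (status READY)."""
    return endpoint_info.get("EndpointStatus") == "READY"

def create_endpoint(ml_model_id):
    """Request a real-time endpoint for a model (requires boto3 and AWS credentials).

    Returns the RealtimeEndpointInfo structure; the endpoint typically starts
    in UPDATING status and becomes READY later.
    """
    import boto3  # AWS SDK for Python
    client = boto3.client("machinelearning")
    response = client.create_realtime_endpoint(MLModelId=ml_model_id)
    return response["RealtimeEndpointInfo"]

# A freshly requested endpoint is not ready yet:
print(endpoint_is_ready({"EndpointStatus": "UPDATING"}))  # False
```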
To create a real-time endpoint
- Sign in to the AWS Management Console and open the Amazon Machine Learning console at https://console.aws.amazon.com/machinelearning/.
- In the navigation bar, in the Amazon Machine Learning drop down, choose ML models.
- Choose the model for which you want to generate real-time predictions.
- On the ML model summary page, under Predictions, choose Create real-time endpoint.
A dialog box that explains how real-time predictions are priced appears.
- Choose Create. The real-time endpoint request is sent to Amazon ML and entered into a queue. The status of the real-time endpoint is Updating.
- When the real-time endpoint is ready, the status changes to Ready, and Amazon ML displays the endpoint URL. Use the endpoint URL to create real-time prediction requests with the Predict API. For more information about using the Predict API, see https://docs.aws.amazon.com/machine-learning/latest/APIReference/API_Predict.html in the Amazon Machine Learning API Reference.
Locating the Real-time Prediction Endpoint (Console)
To use the Amazon ML console to find the endpoint URL for an ML model, navigate to the model's ML model summary page.
To locate a real-time endpoint URL
- Sign in to the AWS Management Console and open the Amazon Machine Learning console at https://console.aws.amazon.com/machinelearning/.
- In the navigation bar, in the Amazon Machine Learning drop down, choose ML models.
- Choose the model for which you want to generate real-time predictions.
- On the ML model summary page, scroll down until you see the Predictions section.
- The endpoint URL for the model is listed in Real-time prediction. Use this URL as the PredictEndpoint value in your real-time prediction calls. For information on how to use the endpoint to generate predictions, see https://docs.aws.amazon.com/machine-learning/latest/APIReference/API_Predict.html in the Amazon Machine Learning API Reference.
Locating the Real-time Prediction Endpoint (API)
When you create a real-time endpoint by using the CreateRealtimeEndpoint operation, the URL and status of the endpoint are returned to you in the response. If you created the real-time endpoint by using the console, or if you want to retrieve the URL and status of an endpoint that you created earlier, call the GetMLModel operation with the ID of the model that you want to query for real-time predictions. The endpoint information is contained in the EndpointInfo section of the response. For a model that has a real-time endpoint associated with it, the EndpointInfo might look like this:
"EndpointInfo":{
    "CreatedAt": 1427864874.227,
    "EndpointStatus": "READY",
    "EndpointUrl": "https://endpointUrl",
    "PeakRequestsPerSecond": 200
}
A model without a real-time endpoint would return the following:
"EndpointInfo":{
    "EndpointStatus": "NONE",
    "PeakRequestsPerSecond": 0
}
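Reading the endpoint URL out of a GetMLModel response is a one-step lookup. A sketch, using dictionaries shaped like the two EndpointInfo examples above (the live call would come from boto3's get_ml_model):

```python
def realtime_endpoint_url(get_ml_model_response):
    """Return the real-time endpoint URL from a GetMLModel response,
    or None when the model has no READY endpoint."""
    info = get_ml_model_response.get("EndpointInfo", {})
    if info.get("EndpointStatus") == "READY":
        return info.get("EndpointUrl")
    return None

# Response shapes mirror the two EndpointInfo samples in this section:
with_endpoint = {"EndpointInfo": {"CreatedAt": 1427864874.227,
                                  "EndpointStatus": "READY",
                                  "EndpointUrl": "https://endpointUrl",
                                  "PeakRequestsPerSecond": 200}}
no_endpoint = {"EndpointInfo": {"EndpointStatus": "NONE",
                                "PeakRequestsPerSecond": 0}}
print(realtime_endpoint_url(with_endpoint))  # https://endpointUrl
print(realtime_endpoint_url(no_endpoint))    # None
```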
Creating a Real-time Prediction Request
A sample Predict request payload might look like this:
{
    "MLModelId": "model-id",
    "Record":{
        "key1": "value1",
        "key2": "value2"
    },
    "PredictEndpoint": "https://endpointUrl"
}
The PredictEndpoint field must correspond to the EndpointUrl field of the EndpointInfo structure. Amazon ML uses this field to route the request to the appropriate servers in the real-time prediction fleet.
The MLModelId is the identifier of a previously trained model with a real-time endpoint.
A Record is a map of variable names to variable values. Each pair represents an observation. The Record map contains the inputs to your Amazon ML model. It is analogous to a single row of data in your training data set, without the target variable. Regardless of the type of values in the training data, Record contains a string-to-string mapping.
Note
You can omit variables for which you do not have a value, although this might reduce the accuracy of your prediction. The more variables you can include, the more accurate your prediction is likely to be.
The format of the response returned by Predict requests depends on the type of model that is being queried for prediction. In all cases, the details field contains information about the prediction request, notably including the PredictiveModelType field with the model type.
The following example shows a response for a binary model:
{
    "Prediction":{
        "details":{
            "PredictiveModelType": "BINARY"
        },
        "predictedLabel": "0",
        "predictedScores":{
            "0": 0.47380468249320984
        }
    }
}
Notice the predictedLabel field that contains the predicted label, in this case 0. Amazon ML computes the predicted label by comparing the prediction score against the classification cut-off:
- You can obtain the classification cut-off that is currently associated with an ML model by inspecting the ScoreThreshold field in the response of the GetMLModel operation, or by viewing the model information in the Amazon ML console. If you do not set a score threshold, Amazon ML uses the default value of 0.5.
- You can obtain the exact prediction score for a binary classification model by inspecting the predictedScores map. Within this map, the predicted label is paired with the exact prediction score.
For more information about binary predictions, see Interpreting the Predictions.
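The score-versus-cut-off comparison above can be sketched as a small function. (Assumption: this sketch treats a score equal to the threshold as the positive class; check your model's actual ScoreThreshold via GetMLModel.)

```python
def predicted_label_from_score(score, score_threshold=0.5):
    """Derive the binary predictedLabel from the raw prediction score:
    scores at or above the cut-off map to "1", scores below it to "0".
    0.5 is the Amazon ML default when no threshold has been set."""
    return "1" if score >= score_threshold else "0"

# Matches the sample response above: 0.4738... is below the default 0.5 cut-off.
print(predicted_label_from_score(0.47380468249320984))  # 0
print(predicted_label_from_score(0.9))                  # 1
```

Lowering the threshold trades false negatives for false positives: with a cut-off of 0.4, the same 0.4738 score would yield the label "1".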
The following example shows a response for a regression model. Notice that the predicted numeric value is found in the predictedValue field:
{
    "Prediction":{
        "details":{
            "PredictiveModelType": "REGRESSION"
        },
        "predictedValue": 15.508452415466309
    }
}
The following example shows a response for a multiclass model:
{
    "Prediction":{
        "details":{
            "PredictiveModelType": "MULTICLASS"
        },
        "predictedLabel": "red",
        "predictedScores":{
            "red": 0.12923571467399597,
            "green": 0.08416014909744263,
            "orange": 0.22713537514209747,
            "blue": 0.1438363939523697,
            "pink": 0.184102863073349,
            "violet": 0.12816807627677917,
            "brown": 0.10336143523454666
        }
    }
}
Similar to binary classification models, the predicted label/class is found in the predictedLabel field. You can further understand how strongly the prediction is related to each class by looking at the predictedScores map. The higher the score of a class within this map, the more strongly the prediction is related to the class, with the highest value ultimately being selected as the predictedLabel.
For more information about multiclass predictions, see Multiclass Model Insights.
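Selecting the highest-scoring class from a predictedScores map is a simple argmax. A sketch, using hypothetical scores for a three-class model (not the sample response above):

```python
def top_class(predicted_scores):
    """Return (label, score) for the highest-scoring class in a
    multiclass predictedScores map -- the class reported as predictedLabel."""
    label = max(predicted_scores, key=predicted_scores.get)
    return label, predicted_scores[label]

# Hypothetical scores for an illustrative three-class model:
print(top_class({"red": 0.61, "green": 0.27, "blue": 0.12}))  # ('red', 0.61)
```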
Deleting a Real-Time Endpoint
When you've completed your real-time predictions, delete the real-time endpoint to avoid incurring additional charges. Charges stop accruing as soon as you delete your endpoint.
To delete a real-time endpoint
- Sign in to the AWS Management Console and open the Amazon Machine Learning console at https://console.aws.amazon.com/machinelearning/.
- In the navigation bar, in the Amazon Machine Learning drop down, choose ML models.
- Choose the model that no longer requires real-time predictions.
- On the ML model report page, under Predictions, choose Summary.
- Choose Delete real-time endpoint.
- In the Delete real-time endpoint dialog box, choose Delete.
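The same deletion can be done programmatically with the DeleteRealtimeEndpoint operation. A hedged sketch using boto3 (the live call requires AWS credentials; the model ID is a placeholder):

```python
def endpoint_absent(endpoint_info):
    """True when a model has no real-time endpoint (EndpointStatus NONE),
    meaning the capacity reservation charge is no longer accruing."""
    return endpoint_info.get("EndpointStatus", "NONE") == "NONE"

def delete_endpoint(ml_model_id):
    """Delete a model's real-time endpoint to stop the capacity reservation
    charge (requires boto3 and AWS credentials)."""
    import boto3  # AWS SDK for Python
    client = boto3.client("machinelearning")
    response = client.delete_realtime_endpoint(MLModelId=ml_model_id)
    return response["RealtimeEndpointInfo"]

# A model whose EndpointInfo reports NONE has no endpoint and incurs no charge:
print(endpoint_absent({"EndpointStatus": "NONE", "PeakRequestsPerSecond": 0}))  # True
```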