
Evaluate Your AWS DeepRacer Models in Simulation

After your model is trained, use the AWS DeepRacer console to evaluate its performance. In AWS DeepRacer, the performance metric is the time required to complete a track while following the actions inferred by the trained model.

To evaluate a trained model using the AWS DeepRacer console, follow the steps below.

  1. On your model's details page, under the Evaluation section, choose Start evaluation.

    You can start an evaluation after your model is in a Ready state. A model is ready when training is complete. If training did not run to completion, the model can still be in a Ready state if it has been trained up to the point of failure.

    
    Image: AWS DeepRacer start evaluation after training completed.
  2. Under Select environment to start evaluation, choose an evaluation track.

    
    Image: AWS DeepRacer select a track for evaluation.

    Typically, you want to choose a track that is the same as, or similar to, the one you used to train the model. You can choose any track to evaluate your model; however, you can expect the best performance on the track most similar to the one used in training.

  3. For Number of trials, leave the default value (3 trials) as-is to specify the stop condition. You can specify 3 to 5 trial runs for each evaluation.

  4. Choose Start evaluation to create and initialize the evaluation job.

    This initialization process takes about 3 minutes to complete.

    
    Image: AWS DeepRacer evaluation initializing.
  5. While the evaluation is in progress, you can stop it at any time.

    
    Image: AWS DeepRacer evaluation in progress.

    To stop an evaluation job, choose Stop evaluation in the upper-right corner of the Evaluation pane, and then confirm the action.

  6. After the evaluation completes successfully, inspect the Evaluation results to see how your model performs.

    
    Image: AWS DeepRacer evaluation performance completed.

    The Time values listed under Evaluation results show how long each trial took, from the start position to the finish line or to the point where the vehicle went off track. The Trial results (% completed) value shows the percentage of the track that was completed. A value of 100% means the trial completed successfully.

    Anything less than 100 percent means the trial failed, and a model with a trial result of less than 100 percent is not ready for racing. You can anticipate such a failure when the total reward during training doesn't appear to have converged. In this case, you can continue to improve the model by cloning it, changing the reward function, tuning the hyperparameters, and then iterating the process until the total reward converges and the performance metrics improve. For more information on how to improve the training, see Train and Evaluate AWS DeepRacer Models. For one way to recover the trial times and completion percentages from the simulation logs, see the sketch following these steps.
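
The console reports these times and completion percentages directly. If you also download the simulation logs (for example, from Amazon CloudWatch), you can recover the same metrics yourself. The following Python sketch is a minimal example, not part of the console workflow: it assumes a SIM_TRACE_LOG line layout commonly seen in DeepRacer simulation logs, and the field positions (episode, progress, timestamp) and the log file name are assumptions you should verify against your own logs.

    from collections import defaultdict

    # Assumed field positions within a SIM_TRACE_LOG line; the exact layout
    # can differ between AWS DeepRacer versions, so verify against your logs.
    EPISODE, PROGRESS, TIMESTAMP = 0, 11, 14

    def summarize_trials(log_path):
        """Print per-trial elapsed time and completion percentage."""
        start, latest = {}, {}
        progress = defaultdict(float)
        with open(log_path) as f:
            for line in f:
                if "SIM_TRACE_LOG:" not in line:
                    continue
                fields = line.split("SIM_TRACE_LOG:")[1].strip().split(",")
                ep = int(fields[EPISODE])
                ts = float(fields[TIMESTAMP])
                start.setdefault(ep, ts)   # first timestamp seen for this trial
                latest[ep] = ts            # most recent timestamp for this trial
                progress[ep] = max(progress[ep], float(fields[PROGRESS]))
        for ep in sorted(start):
            status = "completed" if progress[ep] >= 100.0 else "failed (off track)"
            print(f"Trial {ep}: {latest[ep] - start[ep]:.2f} s, "
                  f"{progress[ep]:.1f}% of track, {status}")

    summarize_trials("robomaker-evaluation.log")  # hypothetical file name

A trial whose maximum progress never reaches 100 percent corresponds to a failed trial in the Evaluation results table.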

To transfer your fully trained model to your AWS DeepRacer vehicle for driving in a physical environment, you must first download the model artifacts. To do so, choose Download model on the model's details page. For more information about testing an AWS DeepRacer model with a physical agent, see Operate Your AWS DeepRacer Vehicle.
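
If you prefer a script to the Download model button, the same artifacts can be fetched from the Amazon S3 bucket in which the service stores them. The following sketch uses boto3; the bucket name and object key shown are placeholders (assumptions), so look up the actual location in your own S3 console.

    import boto3

    # Placeholder bucket and key: substitute the S3 location where the AWS
    # DeepRacer console stores your model artifacts.
    BUCKET = "aws-deepracer-example-bucket"        # assumption
    KEY = "models/my-model/model.tar.gz"           # assumption

    s3 = boto3.client("s3")
    s3.download_file(BUCKET, KEY, "model.tar.gz")  # saves the archive locally
    print("Downloaded model.tar.gz for transfer to the vehicle.")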

If you've trained your model on a track that is identical or similar to the one used in a racing event of the AWS DeepRacer League, you can submit the model to the event. To do this, choose DeepRacer League from the primary navigation pane of the AWS DeepRacer console. For more information, see Rank AWS DeepRacer Models in Leaderboard.