AWS DeepRacer
Developer Guide

This is prerelease documentation for a service in preview release. It is subject to change.

Train Your First AWS DeepRacer Model for Autonomous Racing

Using the AWS DeepRacer console, you can follow built-in templates to train and evaluate an AWS DeepRacer model.

To train a reinforcement learning model for autonomous racing using the AWS DeepRacer console

  1. Sign in to the AWS DeepRacer console (https://console.aws.amazon.com/deepracer).

  2. On the AWS DeepRacer home page, choose Create model.

    If you aren't on the home page, choose Reinforcement learning on the primary navigation pane and then choose Create model.

  3. Under Model details on the Create model page, do the following:

    1. Type a name for the model to be trained in the Model name input field. Use this name to identify the model in the list of AWS DeepRacer models you've created or on the leaderboards that display the model's evaluation metrics.

      
      Image: AWS DeepRacer model being created.

    2. Optionally, provide a brief description of the model in the Model description - optional input field. The description, for example, can provide a summary of the model features and limitations.

    3. For Permissions and storage, choose the Create resources button to create the required IAM roles and an S3 bucket, if they don't already exist.

      When the resources have been created, you're notified, as shown in the following screenshot:

      
      Image: AWS DeepRacer required resources created.

      The S3 bucket stores the trained model artifacts, and the IAM roles contain the IAM policies that grant AWS DeepRacer permission to call other AWS services on your behalf. For more information about the required IAM roles and policies, see Identity and Access Management for AWS DeepRacer.

  4. Under Environment simulation, choose an available track as the virtual environment your agent interacts with to train a reinforcement learning model through trial and error.

  5. Under Reward function, choose the Basic function or Advanced function and use the predefined code without modification.

    To modify or replace the predefined reward function code, choose the Insert code button to insert the code into the code editor and change it.
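
    For orientation, the following is a minimal sketch of a center-line reward function in the spirit of the predefined templates. It is illustrative only and is not necessarily the exact code the console inserts; the input keys track_width and distance_from_center come from the reward function's params dictionary.

      def reward_function(params):
          # Illustrative center-line reward: the closer the agent stays to the
          # center of the track, the higher the reward.
          track_width = params['track_width']
          distance_from_center = params['distance_from_center']

          # Three bands around the center line, from tightest to widest.
          marker_1 = 0.1 * track_width
          marker_2 = 0.25 * track_width
          marker_3 = 0.5 * track_width

          if distance_from_center <= marker_1:
              reward = 1.0
          elif distance_from_center <= marker_2:
              reward = 0.5
          elif distance_from_center <= marker_3:
              reward = 0.1
          else:
              reward = 1e-3  # likely off track

          return float(reward)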

  6. To customize hyperparameters, expand Algorithm settings and set Hyperparameters as follows:

    1. For Batch size, choose one of the available options or leave the default choice (64) as-is.

    2. For Number of epochs, set a valid value or leave the default (10) as-is.

    3. For Learning rate, set a valid value or leave the default value (0.0003) as-is.

    4. For Exploration, choose one of the available options or leave the default value (Categorical Parameters) as-is.

    5. For Entropy, set a valid value or leave the default value (0.01) as-is.

    6. For Discount factor, set a valid value or leave the default value (0.99) as-is.

    7. For Loss type, choose one of the available options or leave the default choice (Huber) as-is.

    8. For Number of episodes between each training, set a valid value or leave the default value (20) as-is.

    For more information about hyperparameters, see Systematically Tune Hyperparameters for Optimal Training Performances.
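
    For quick reference, the default values listed above can be collected in a simple mapping. The following Python snippet is only an illustrative summary; the key names mirror the console labels and are not an AWS API schema.

      # Illustrative summary of the console's default hyperparameter values.
      default_hyperparameters = {
          "batch_size": 64,
          "num_epochs": 10,
          "learning_rate": 0.0003,
          "exploration": "Categorical Parameters",
          "entropy": 0.01,
          "discount_factor": 0.99,
          "loss_type": "Huber",
          "num_episodes_between_training": 20,
      }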

  7. Under Stop conditions, set the conditions to terminate a long-running (and possibly runaway) training session:

    1. For Max. Time, set a valid value or leave the default value (60 mins) as-is.

    For more information about stop conditions, see Train and Evaluate AWS DeepRacer Models.

  8. Choose Start training to start creating the model and to provision the training job instance. The process takes a few minutes. You can watch the status change on the Models page.

  9. After the training is initialized, the status becomes Training. You can now choose the model name to open the model details page.

    
    Image: AWS DeepRacer training status.

  10. On the model details page, watch the training progress in the simulator.

    
    Image: AWS DeepRacer inspect training.

    As seen in the simulator, the agent is repeatedly knocked off the track as it maneuvers the curves, until it masters the correct steering. The training stops when the specified stop condition is met. The training has converged when the TrainingRewardScore value plateaus with respect to the training time. If the training converges before the specified stop condition is met, you can choose Stop training to stop the training job manually; stopping could take a few minutes to complete.

    If the training has not converged by the time it stops, create a new training job, tune some hyperparameters or extend the training time, and repeat the training until the average reward converges.
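
    If you export the per-iteration average rewards from the training logs, the plateau check described above can also be approximated offline. The following heuristic is only a sketch under assumed values for the window size and tolerance; it is not a metric that the console computes for you.

      def has_converged(avg_rewards, window=10, tolerance=0.02):
          # Rough plateau heuristic over a series of per-iteration average rewards.
          # Returns True when the mean of the most recent window is within
          # `tolerance` (relative) of the previous window's mean. The window
          # size and tolerance are illustrative assumptions, not AWS DeepRacer
          # defaults.
          if len(avg_rewards) < 2 * window:
              return False  # not enough history to judge
          recent = sum(avg_rewards[-window:]) / window
          previous = sum(avg_rewards[-2 * window:-window]) / window
          if previous == 0:
              return False
          return abs(recent - previous) / abs(previous) <= tolerance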

After the training job finishes, continue to the next section to evaluate the trained model and gauge its performance.