
Evaluating a trained model with AWS DeepComposer

By examining the training results of a trained model, you can learn what makes a good model and how to assess the training performance of different models.

This section walks you through using the AWS DeepComposer console to examine a trained model's training results, including the loss graph that shows how training converges, how the structure similarity index progresses over the training epochs, and how the computed metrics of generated samples evolve over the training epochs.

To learn more about training and evaluating an AWS DeepComposer model

  1. Open the AWS DeepComposer console.

  2. In the navigation pane, choose Models.

  3. On the Models page, choose a model from the list of trained models.

  4. On the model's training results page under Loss, examine the Loss graph and the Epoch Explorer to ascertain the training performance of the selected model.

    The generator loss and discriminator loss are superimposed on the same graph, but on different scales: the generator loss scale is shown on the left and the discriminator loss scale on the right. In this particular training job, the generator loss plateaus around the 50th epoch, at which point the generator stops significantly improving its ability to produce realistic music. The discriminator loss shows similar behavior, but is less noisy after it plateaus. You can verify this by listening to the training sample output generated at fixed intervals of training epochs. The training sample outputs and the evaluation sample input are available in Epoch Explorer. For every displayed sample output from the 100th epoch on, the generated accompaniment tracks sound like real music.

  5. On the model's training results page under Epoch vs similarity distance, choose the Structure similarity index tab to examine how the similarity index progresses from epoch to epoch and how the generated samples become more similar to the input samples.

    The structure similarity index measures how similar a generated music sample is to a training music sample. If the two are identical, the index is 1; if there is no similarity, the index is 0. Successful training should have the structure similarity index converge to a positive value. In this particular training job, the index appears to converge to 0.025, and the fluctuations fall within ±0.005 after the 100th epoch.

  6. On the model's training results page under Summary of dataset, choose a metric from Compute metric.

    Supported metrics include Drum in pattern, Pitches used, Pitch classes used, Empty bar rate, Polyphonic range, and In scale ratio. The definition and a summary description of each metric are displayed in the AWS DeepComposer console after you choose the metric.
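The loss plateau described in step 4 can also be checked numerically rather than by eye. The sketch below is illustrative and not part of AWS DeepComposer; the function name, window size, and tolerance are assumptions. It finds the first epoch at which the mean loss over a sliding window stops changing by more than a small relative tolerance:

```python
def plateau_epoch(losses, window=10, tol=0.01):
    """Return the first epoch index at which the mean loss over the next
    `window` epochs differs from the mean over the previous `window`
    epochs by no more than `tol` (relative), i.e. where training has
    plateaued. Falls back to the last epoch if no plateau is found."""
    for e in range(window, len(losses) - window + 1):
        prev = sum(losses[e - window:e]) / window  # mean of trailing window
        curr = sum(losses[e:e + window]) / window  # mean of leading window
        if abs(prev - curr) <= tol * abs(prev):
            return e
    return len(losses) - 1
```

Applied to a generator-loss series like the one in step 4, this would return an epoch near the visual plateau (around the 50th epoch in that training job).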
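The structure similarity index in step 5 is a similarity score in [0, 1] between a generated sample and a training sample. AWS does not publish the exact formula, so the sketch below uses a Jaccard-style overlap of binary piano rolls as an illustrative stand-in with the same boundary behavior: identical rolls score 1, and rolls with no shared notes score 0.

```python
def structure_similarity(a, b):
    """Illustrative similarity between two binary piano rolls, given as
    equal-shaped lists of 0/1 rows: the ratio of note cells active in
    both rolls to note cells active in either roll (Jaccard index)."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            inter += x & y  # cell active in both rolls
            union += x | y  # cell active in at least one roll
    return inter / union if union else 1.0  # two empty rolls are identical
```

A training run that converges would show this score rising toward a stable positive value over the epochs, mirroring the Structure similarity index graph.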
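Several of the dataset metrics named in step 6 are simple to approximate on a binary piano roll. The definitions below are illustrative approximations of Empty bar rate, Pitches used, and Pitch classes used, not AWS DeepComposer's exact implementations; the 16-steps-per-bar resolution is an assumption.

```python
def empty_bar_rate(piano_roll, steps_per_bar=16):
    """Fraction of bars with no active notes. `piano_roll` is a list of
    time steps, each a list of 0/1 pitch activations."""
    bars = [piano_roll[i:i + steps_per_bar]
            for i in range(0, len(piano_roll), steps_per_bar)]
    empty = sum(1 for bar in bars if not any(any(step) for step in bar))
    return empty / len(bars)

def pitches_used(piano_roll):
    """Number of distinct pitches active at least once."""
    return len({p for step in piano_roll
                for p, on in enumerate(step) if on})

def pitch_classes_used(piano_roll):
    """Number of distinct pitch classes (C, C#, ..., B) active at least
    once, i.e. pitches folded modulo 12."""
    return len({p % 12 for step in piano_roll
                for p, on in enumerate(step) if on})
```

Tracking such metrics on generated samples across epochs, as the console does, shows whether the generator's output is drifting toward the statistics of the training dataset.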