Setting Up ML-Powered Anomaly Detection for Outlier Analysis - Amazon QuickSight

Use the following procedure to start detecting outliers, and the key drivers that contribute to them, by adding an insight widget that uses ML-powered anomaly detection.

To set up outlier analysis, including contributing drivers

  1. Open your analysis and, on the top menu, choose Add, then choose Add insight. From the list, choose Anomaly detection, then choose Select.

    Doing this creates a widget for the insight (also called an autonarrative).

  2. Follow the screen prompt on the new widget, which tells you to choose fields for the insight. Add at least one date, one measure, and one dimension. You can add up to five dimension (category) fields. However, you can't use calculated fields.

    In the field wells, Categories represent the dimensional values that Amazon QuickSight uses to split the metric. For example, let's say you are analyzing revenue across all product categories and product SKUs. There are 10 product categories, each with 10 product SKUs. Amazon QuickSight splits the metric by the 100 unique combinations and runs anomaly detection on each combination for the split metric.
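The split described above can be sketched in a few lines of Python. This is only an illustration of how the count of 100 arises; the category and SKU names are hypothetical placeholders, and the sketch says nothing about how QuickSight implements the split internally.

```python
# Hypothetical dimension values: 10 product categories, each with 10 SKUs.
categories = [f"category_{i}" for i in range(10)]
skus_per_category = {c: [f"{c}_sku_{j}" for j in range(10)] for c in categories}

# Each (category, SKU) pair is one split of the metric. Under the description
# above, anomaly detection runs once per pair on that pair's time series.
combinations = [(c, sku) for c in categories for sku in skus_per_category[c]]

print(len(combinations))
```

Because the count grows multiplicatively with each added category field, adding a third field with many distinct values can increase the number of series to analyze considerably.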

  3. Choose Get started on the widget.

  4. The configuration screen appears. Configure one or more of the settings described in the following steps.

    Note

    This is a scrollable screen. To see all of the configuration options, you might need to navigate using the scrollbar, the mouse wheel, or the up and down arrow keys.

    1. (Optional) In the Fields for analysis section, you can see a list of fields from the field wells, for reference purposes. If you have chosen three categories (dimensions), Amazon QuickSight analyzes the following combinations hierarchically: A, AB, ABC. If you enable the option to Analyze all combinations of these categories, Amazon QuickSight analyzes all combinations: A, AB, ABC, BC, AC.

      To enable analysis of all combinations, select the check box. Also, if your data isn't hierarchical, make sure to enable this option.
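To make the difference between the two modes concrete, here is a minimal Python sketch. It assumes that "hierarchical" means ordered prefixes of the field list and that "all combinations" means every non-empty subset of the fields. Under that assumption the second function also yields B and C on their own, which the list above does not mention; whether QuickSight analyzes those single-field splits is not stated here. The function names are hypothetical.

```python
from itertools import combinations

def hierarchical_combinations(fields):
    """Ordered prefixes of the field list: A, AB, ABC."""
    return [tuple(fields[:n]) for n in range(1, len(fields) + 1)]

def all_combinations(fields):
    """Every non-empty subset of the fields, keeping the original field order.

    Assumption: this is what "all combinations" means; it includes
    single-field subsets such as B and C.
    """
    return [
        combo
        for n in range(1, len(fields) + 1)
        for combo in combinations(fields, n)
    ]

fields = ["A", "B", "C"]
print(hierarchical_combinations(fields))
print(all_combinations(fields))
```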

    2. (Optional) In the Schedule section, you can set the schedule for automatically running the insight recalculation. The schedule runs only for published dashboards. In the analysis, you can run it manually as needed. Scheduling includes the following settings:

      • Occurrence – Set how often you want the recalculation to run: every hour, every day, every week, or every month.

      • Start schedule on – Set the date and time to start running this schedule.

      • Timezone – Set the time zone that the schedule runs in. To view a list of time zones, delete the current entry.

    3. (Optional) The Contribution analysis (optional) setting allows QuickSight to analyze the key drivers when an outlier (anomaly) is detected. For example, QuickSight can show you the top customers that contributed to a spike in sales in the US for home improvement products. You can add up to four dimensions from your dataset, including dimensions that you didn't add to the field wells of this insight widget.

      To view a list of dimensions available for contribution analysis, choose Select fields.

    4. (Optional) To open the Advanced options section, scroll to the bottom of the screen, and choose Advanced options.

    5. (Optional) For Computation Name, in the Advanced options section, enter a descriptive alphanumeric name with no spaces. This name identifies the computation itself, not the insight widget. If you plan to edit the narrative that automatically displays on the widget, you can use the name to identify this widget's calculation. It's good practice to choose a name that makes this computation easy to distinguish from others. However, you likely need a custom name only if you plan to edit the autonarrative and have other similar calculations in your analysis.

    6. (Optional) In the Default display options section, for Number of anomalies to show, set how many outliers you want to display on the narrative widget. You can still explore all of them, no matter how few you choose to show in the autonarrative.

    7. (Optional) In the Default display options section, for Sorting method, choose the method that you want to apply. Some of these methods are based on the anomaly score that QuickSight generates. QuickSight gives higher scores to data points that look anomalous. You can use any of the following options:

      • Weighted anomaly score – The anomaly score multiplied by the log of the absolute value of the difference between the actual value and the expected value. This score is always a positive number.

      • Anomaly score – The actual anomaly score assigned to this data point.

      • Weighted difference from expected value – (Default) The anomaly score multiplied by the difference between the actual value and the expected value.

      • Difference from expected value – The actual difference between the actual value and the expected value (actual−expected).

      • Actual value – The actual value with no formula applied.
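The sorting methods listed above can be written out as simple formulas. The following Python sketch follows the descriptions literally. It assumes a natural logarithm (the log base is not stated) and returns None for the weighted anomaly score when the actual and expected values are equal, because the logarithm of zero is undefined; how QuickSight handles that case is not stated here. The function name and dictionary keys are hypothetical.

```python
import math

def sort_keys(anomaly_score, actual, expected):
    """Compute each sorting value from the descriptions in the list above."""
    difference = actual - expected
    abs_difference = abs(difference)
    # Assumption: natural log; undefined when the difference is exactly zero.
    weighted_score = (
        anomaly_score * math.log(abs_difference) if abs_difference > 0 else None
    )
    return {
        "weighted_anomaly_score": weighted_score,
        "anomaly_score": anomaly_score,
        "weighted_difference_from_expected": anomaly_score * difference,
        "difference_from_expected": difference,
        "actual_value": actual,
    }

# Hypothetical data point: score 0.8, actual 150, expected 100.
keys = sort_keys(0.8, 150.0, 100.0)
for name, value in keys.items():
    print(name, value)
```

Note that the two difference-based keys keep the sign of actual minus expected, so a drop below the expected value sorts differently from a spike above it.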

    8. (Optional) In the Default display options section, for Minimum Delta, enter a custom value that you want to use to identify anomalies. Any amount higher than the threshold value counts as an anomaly. The values you enter here change how the insight works in your analysis. In this section, you can set the following:

      • Absolute value – The actual value that you want to use. For example, if you set this to 48, QuickSight identifies values as anomalous when the difference between a value and the expected value is greater than 48.

      • Percentage – The percentage threshold that you want to use. For example, if you set this to 12.5%, QuickSight identifies values as anomalous when the difference between a value and the expected value is greater than 12.5%.

      You can change the Minimum Delta in the sheet controls at the top of the Explore Anomalies screen. Use the ENTER (or RETURN) key to refresh the screen with the new values.
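As a rough illustration of the two Minimum Delta modes, the following Python sketch checks a single data point against a threshold. It rests on several assumptions that are not confirmed here: the difference is compared by magnitude, the percentage is measured relative to the expected value, and an expected value of zero is treated as exceeding any percentage threshold when the actual value differs from it. The function and parameter names are hypothetical.

```python
def exceeds_minimum_delta(actual, expected, absolute=None, percentage=None):
    """Return True if the point's difference from expected passes the threshold."""
    difference = abs(actual - expected)  # assumption: magnitude, not signed
    if absolute is not None:
        return difference > absolute
    if percentage is not None:
        if expected == 0:
            # Assumption: any nonzero difference counts when expected is zero.
            return difference > 0
        # Assumption: percentage is relative to the expected value.
        return difference / abs(expected) * 100 > percentage
    # No threshold configured: nothing is filtered out.
    return True

print(exceeds_minimum_delta(150, 100, absolute=48))
print(exceeds_minimum_delta(112, 100, percentage=12.5))
```

With the example thresholds from the list above, a value of 150 against an expected 100 exceeds an absolute delta of 48, while a value of 112 against an expected 100 does not exceed a 12.5% delta.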

  5. Choose Save to confirm your choices. Choose Cancel to exit without saving. Choose Delete to remove this insight widget from your analysis.

    To reopen the configuration screen, choose the v-shaped on-visual menu, then choose Configure anomaly.

  6. Choose Run now to run the anomaly detection and view your insight.

    The time it takes to complete depends on how many unique data points you are analyzing. It can take a few minutes for a small number of points, or many hours for a large number. While the job runs in the background, you can do other work in your analysis. However, wait for it to complete before you change the configuration, edit the narrative, or open the Explore anomalies page for this insight.

  7. (Optional) Refresh the page to see the latest information.

    The insight widget has several states, depending on when it last ran. If you think the status might be out of date, you can refresh the page. You can determine the state of the insight by using the following information.

    What appears on the page – Status

      • Run now button – The job has not yet started.

      • Message about Analyzing for anomalies – The job is currently running.

      • Narrative about the detected anomalies (outliers) – The job has run successfully. The message says when this widget's calculation was last updated.

      • Alert icon with an exclamation point ( ! ) – This icon indicates there was an error during the last run. If the narrative also displays, you can still use Explore anomalies to use data from the previous successful run.

  8. (Optional) Remove the anomaly detection by choosing the v-shaped on-visual menu, then choosing Delete.