Create Feature Attribute Baselines and Explainability Reports - Amazon SageMaker

Create Feature Attribute Baselines and Explainability Reports

For an example notebook with instructions on how to run a SageMaker Clarify processing job in Studio that creates explanations for its predictions relative to a baseline, see Explainability and bias detection with Amazon SageMaker Clarify.

If you need instructions on how to open a notebook in Amazon SageMaker Studio, see Create or Open an Amazon SageMaker Studio Notebook. The following code examples are taken from the example notebook listed previously. This section discusses the code related to the use of Shapley values to provide reports that compare the relative contributions each feature made the predictions.

Use SHAPConfig to create the baseline. In this example, the mean_abs is the mean of absolute SHAP values for all instances, specified as the baseline. You use DataConfig to configure the target variable, data input and output paths, and their formats.

shap_config = clarify.SHAPConfig(baseline=[test_features.iloc[0].values.tolist()], num_samples=15, agg_method='mean_abs') explainability_output_path = 's3://{}/{}/clarify-explainability'.format(bucket, prefix) explainability_data_config = clarify.DataConfig(s3_data_input_path=train_uri, s3_output_path=explainability_output_path, label='Target', headers=training_data.columns.to_list(), dataset_type='text/csv')

Kernel SHAP in SageMaker Clarify supports omitting the “baseline” parameter. In this case, a baseline based on clustering the input dataset is generated automatically.

Then run the explainability job.

clarify_processor.run_explainability(data_config=explainability_data_config, model_config=model_config, explainability_config=shap_config)

To view Partial Dependence plots (PDP), use explainability_config=PDP_config.

You can select both types of reports with explainability_config=[PDP_config,shap_config].

View the results in Studio or download them from the explainability_output_path S3 bucket.