Recall Difference (RD)

The recall difference (RD) metric is the difference in recall of the model between the favored facet a and disfavored facet d. Any difference in these recalls is a potential form of bias. Recall is the true positive rate (TPR), which measures how often the model correctly predicts the cases that should receive a positive outcome. Recall is perfect for a facet if all of the y=1 cases are correctly predicted as y’=1 for that facet. Recall is greater when the model minimizes false negatives known as the Type II error. For example, how many of the people in two different groups (facets a and d) that should qualify for loans are detected correctly by the model? If the recall rate is high for lending to facet a, but low for lending to facet d, the difference provides a measure of this bias against the group belonging to facet d.

The formula for difference in the recall rates for facets a and d:

RD = TP_a/(TP_a + FN_a) - TP_d/(TP_d + FN_d) = TPR_a - TPR_d

Where:

TP_a are the true positives predicted for facet a.
FN_a are the false negatives predicted for facet a.
TP_d are the true positives predicted for facet d.
FN_d are the false negatives predicted for facet d.
TPR_a = TP_a/(TP_a + FN_a) is the recall for facet a, or its true positive rate.
TPR_d TP_d/(TP_d + FN_d) is the recall for facet d, or its true positive rate.

For example, consider the following confusion matrices for facets a and d.

Confusion Matrix for the Favored Facet a

Class a predictions	Actual outcome 0	Actual outcome 1	Total
0	20	5	25
1	10	65	75
Total	30	70	100

Confusion Matrix for the Disfavored Facet d

Class d predictions	Actual outcome 0	Actual outcome 1	Total
0	18	7	25
1	5	20	25
Total	23	27	50

The value of the recall difference is RD = 65/70 - 20/27 = 0.93 - 0.74 = 0.19 which indicates a bias against facet d.

The range of values for the recall difference between facets a and d for binary and multicategory classification is [-1, +1]. This metric is not available for the case of continuous labels.

Positive values are obtained when there is higher recall for facet a than for facet d. This suggests that the model finds more of the true positives for facet a than for facet d, which is a form of bias.
Values near zero indicate that the recall for facets being compared is similar. This suggests that the model finds about the same number of true positives in both of these facets and is not biased.
Negative values are obtained when there is higher recall for facet d than for facet a. This suggests that the model finds more of the true positives for facet d than for facet a, which is a form of bias.

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Specificity difference (SD)

Difference in Acceptance Rates (DAR)