Amazon Rekognition
Developer Guide

Detecting Labels and Faces

Amazon Rekognition provides non-storage API operations for detecting labels and faces in an image. A label or a tag is an object, scene, or concept found in an image based on its contents. For example, a photo of people on a tropical beach may contain labels such as Person, Water, Sand, Palm Tree, and Swimwear (objects), Beach (scene), and Outdoors (concept).

These are referred to as the non-storage API operations because when you make the API call, Amazon Rekognition does not persist the input image or any image data. The API operations do the necessary analysis and return the results. The sections in this topic describe these operations.

Detecting Labels

You can use the DetectLabels API operation to detect labels in an image. For each label, Amazon Rekognition returns a name and a confidence value for the detection. The following is an example response of the DetectLabels API call.

{
    "Labels": [
        { "Confidence": 98.4629, "Name": "beacon" },
        { "Confidence": 98.4629, "Name": "building" },
        { "Confidence": 98.4629, "Name": "lighthouse" },
        { "Confidence": 87.7924, "Name": "rock" },
        { "Confidence": 68.1049, "Name": "sea" }
    ]
}

The response shows that the API detected five labels (that is, beacon, building, lighthouse, rock, and sea). Each label has an associated level of confidence. For example, the detection algorithm is 98.4629% confident that the image contains a building.
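Because the response is a plain list of label/confidence pairs, it is easy to post-process client-side. The following Python sketch, using the example response above as a literal dict, keeps only labels whose confidence meets a threshold (the threshold value here is an arbitrary choice for illustration):

```python
# The example DetectLabels response from above, as a Python dict.
response = {
    "Labels": [
        {"Confidence": 98.4629, "Name": "beacon"},
        {"Confidence": 98.4629, "Name": "building"},
        {"Confidence": 98.4629, "Name": "lighthouse"},
        {"Confidence": 87.7924, "Name": "rock"},
        {"Confidence": 68.1049, "Name": "sea"},
    ]
}

def labels_above(response, min_confidence):
    """Return the names of labels whose confidence meets the threshold."""
    return [label["Name"] for label in response["Labels"]
            if label["Confidence"] >= min_confidence]

high_confidence = labels_above(response, 90.0)
```

With a threshold of 90.0, only beacon, building, and lighthouse remain. Note that the DetectLabels API also accepts a MinConfidence request parameter, so the same filtering can be done server-side.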

If the input image you provide contains a person, the DetectLabels operation detects labels such as person, clothing, suit, and selfie, as shown in the following example response:

{
    "Labels": [
        { "Confidence": 99.2786, "Name": "person" },
        { "Confidence": 90.6659, "Name": "clothing" },
        { "Confidence": 90.6659, "Name": "suit" },
        { "Confidence": 70.0364, "Name": "selfie" }
    ]
}


If you want facial features describing the faces in an image, use the DetectFaces operation instead.

Detecting Faces

Amazon Rekognition provides the DetectFaces operation that looks for key facial features such as eyes, nose, and mouth to detect faces in an input image. The response returns the following information for each detected face:

  • Bounding box – Coordinates of the bounding box surrounding the face.

  • Confidence – Level of confidence that the bounding box contains a face.

  • Facial landmarks – An array of facial landmarks. For each landmark, such as the left eye, right eye, and mouth, the response provides the x, y coordinates.

  • Facial attributes – A set of facial attributes, such as gender or whether the face has a beard. For each such attribute, the response provides a value. The value can be of different types, such as a Boolean (for example, whether the person is wearing sunglasses) or a string (for example, whether the person is male or female). In addition, for most attributes the response also provides a confidence score for the detected value.

  • Quality – Describes the brightness and the sharpness of the face.

  • Pose – Describes the rotation of the face inside the image.

  • Emotions – A set of emotions with confidence in the analysis.
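Putting these fields together, the following Python sketch reads a single face record. The fragment is hypothetical sample data, not output from a real call; the per-attribute layout shown (a Value paired with a Confidence, and Emotions as a list of Type/Confidence entries) reflects the shape the API documents for these fields:

```python
# Hypothetical FaceDetail fragment illustrating the attribute layout:
# each attribute pairs a detected Value with a Confidence score.
face = {
    "Confidence": 99.9,
    "Sunglasses": {"Value": False, "Confidence": 98.1},
    "Beard": {"Value": True, "Confidence": 91.2},
    "Emotions": [
        {"Type": "HAPPY", "Confidence": 83.0},
        {"Type": "CALM", "Confidence": 12.5},
    ],
}

def top_emotion(face):
    """Return the emotion type detected with the highest confidence."""
    return max(face["Emotions"], key=lambda e: e["Confidence"])["Type"]
```

For the sample data above, top_emotion returns "HAPPY".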

The following is an example response of a DetectFaces API call.

{
    "FaceDetails": [
        {
            "BoundingBox": {
                "Height": 0.18000000715255737,
                "Left": 0.5555555820465088,
                "Top": 0.33666667342185974,
                "Width": 0.23999999463558197
            },
            "Confidence": 100.0,
            "Landmarks": [
                { "Type": "eyeLeft", "X": 0.6394737362861633, "Y": 0.40819624066352844 },
                { "Type": "eyeRight", "X": 0.7266660928726196, "Y": 0.41039225459098816 },
                { "Type": "nose", "X": 0.6912462115287781, "Y": 0.44240960478782654 },
                { "Type": "mouthLeft", "X": 0.6306198239326477, "Y": 0.46700039505958557 },
                { "Type": "mouthRight", "X": 0.7215608954429626, "Y": 0.47114261984825134 }
            ],
            "Pose": {
                "Pitch": 4.050806522369385,
                "Roll": 0.9950747489929199,
                "Yaw": 13.693790435791016
            },
            "Quality": {
                "Brightness": 37.60169982910156,
                "Sharpness": 80.0
            }
        },
        {
            "BoundingBox": {
                "Height": 0.16555555164813995,
                "Left": 0.3096296191215515,
                "Top": 0.7066666483879089,
                "Width": 0.22074073553085327
            },
            "Confidence": 99.99998474121094,
            "Landmarks": [
                { "Type": "eyeLeft", "X": 0.3767718970775604, "Y": 0.7863991856575012 },
                { "Type": "eyeRight", "X": 0.4517287313938141, "Y": 0.7715709209442139 },
                { "Type": "nose", "X": 0.42001065611839294, "Y": 0.8192070126533508 },
                { "Type": "mouthLeft", "X": 0.3915625810623169, "Y": 0.8374140858650208 },
                { "Type": "mouthRight", "X": 0.46825936436653137, "Y": 0.823401689529419 }
            ],
            "Pose": {
                "Pitch": -16.320178985595703,
                "Roll": -15.097439765930176,
                "Yaw": -5.771541118621826
            },
            "Quality": {
                "Brightness": 31.440860748291016,
                "Sharpness": 60.000003814697266
            }
        }
    ],
    "OrientationCorrection": "ROTATE_0"
}
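The BoundingBox values are ratios of the overall image dimensions, so converting them to pixel coordinates is a simple scaling step. A minimal Python sketch, using the first face from the example response (the 900x600 image size is an assumption for illustration):

```python
def to_pixels(bounding_box, image_width, image_height):
    """Convert a relative BoundingBox to (left, top, width, height) in pixels."""
    return (
        round(bounding_box["Left"] * image_width),
        round(bounding_box["Top"] * image_height),
        round(bounding_box["Width"] * image_width),
        round(bounding_box["Height"] * image_height),
    )

# BoundingBox of the first face in the example response, on an
# assumed 900x600-pixel input image.
box = {"Height": 0.18000000715255737, "Left": 0.5555555820465088,
       "Top": 0.33666667342185974, "Width": 0.23999999463558197}
pixels = to_pixels(box, 900, 600)  # (500, 202, 216, 108)
```

Your drawing code can then outline the face at those pixel coordinates.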

Note the following:

  • The Pose data describes the rotation of the face detected. You can use the combination of the BoundingBox and Pose data to draw the bounding box around faces that your application displays.


  • The Quality fields describe the brightness and the sharpness of the face. You might find these values useful for comparing faces across images to find the best face.


  • The DetectFaces operation first detects the orientation of the input image before detecting facial features. The OrientationCorrection field in the response returns the detected rotation, in degrees, measured counter-clockwise. Your application can use this value to correct the image orientation when displaying the image.

  • To get all of the facial landmarks the service can detect, plus all facial attributes and emotions, you must specify the Attributes parameter with the value ALL. By default, the DetectFaces API returns only the following five facial landmarks, along with Pose and Quality.

    ...
    "Landmarks": [
        { "Y": 0.41730427742004395, "X": 0.36835095286369324, "Type": "eyeLeft" },
        { "Y": 0.4281611740589142, "X": 0.5960656404495239, "Type": "eyeRight" },
        { "Y": 0.5349795818328857, "X": 0.47817257046699524, "Type": "nose" },
        { "Y": 0.5721957683563232, "X": 0.352621465921402, "Type": "mouthLeft" },
        { "Y": 0.5792245864868164, "X": 0.5936088562011719, "Type": "mouthRight" }
    ]
    ...

  • The following illustration shows the relative location of the facial landmarks on the face returned by the DetectFaces API operation.
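A request that asks for every facial attribute might be assembled as in the following Python sketch. The bucket and object names are placeholders, and the boto3 call itself is shown only in comments so the sketch stays self-contained:

```python
def build_detect_faces_request(bucket, key):
    """Build the DetectFaces request parameters (no network call is made)."""
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        # "ALL" requests every landmark, attribute, and emotion; omitting
        # Attributes returns the smaller default response described above.
        "Attributes": ["ALL"],
    }

# With boto3 (not invoked here; requires AWS credentials):
#   import boto3
#   client = boto3.client("rekognition")
#   response = client.detect_faces(
#       **build_detect_faces_request("my-bucket", "photo.jpg"))
#   for face in response["FaceDetails"]:
#       print(face["BoundingBox"], face["Confidence"])

params = build_detect_faces_request("my-bucket", "photo.jpg")
```

You can also pass image bytes directly in the Image parameter instead of referencing an S3 object.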