Searching faces in a collection

Amazon Rekognition lets you use an input face to search for matches in a collection of stored faces. You start by storing information about detected faces in server-side containers called "collections". Collections store both individual faces and users (several faces of the same person). Individual faces are stored as face vectors, a mathematical representation of the face (not an actual image of the face). Different images of the same person can be used to create and store multiple face vectors in the same collection. You can then aggregate multiple face vectors of the same person to create a user vector. Because a user vector combines faces captured under varying lighting, sharpness, pose, and appearance, it is a more robust depiction of the person and can offer higher face search accuracy.

Once you've created a collection, you can use an input face to search for matching user vectors or face vectors in that collection. Searching against user vectors can significantly improve accuracy compared to searching against individual face vectors. You can use faces detected in images, stored videos, and streaming videos to search against stored face vectors. You can use faces detected in images to search against stored user vectors.

To store face information, you’ll need to do the following:

  1. Create a Collection - To store facial information, you must first create a face collection with the CreateCollection operation in one of the AWS Regions in your account. You specify this face collection when you call the IndexFaces operation.

  2. Index Faces - The IndexFaces operation detects the faces in an image, extracts a face vector for each one, and stores the vectors in the collection. This is an example of a storage-based API operation because the service stores the face vector information on the server.
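The two storage steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes `client` is a boto3 Rekognition client (for example `boto3.client("rekognition")`), and the function name `store_faces` and the argument names are hypothetical.

```python
def store_faces(client, collection_id, image_bytes, external_image_id):
    """Create the collection if needed, then index the faces in one image.

    `client` is assumed to be a boto3 Rekognition client. Returns the face IDs
    that IndexFaces assigned to the stored face vectors.
    """
    try:
        client.create_collection(CollectionId=collection_id)
    except client.exceptions.ResourceAlreadyExistsException:
        pass  # The collection already exists, so reuse it.

    response = client.index_faces(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        # Your own identifier for the image; it is returned with each face.
        ExternalImageId=external_image_id,
    )
    return [record["Face"]["FaceId"] for record in response["FaceRecords"]]


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   client = boto3.client("rekognition")
#   with open("badge.jpg", "rb") as f:
#       face_ids = store_faces(client, "employees", f.read(), "badge-001")
```

Passing the client in as an argument keeps the sketch independent of how credentials and Regions are configured.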

To create a user and associate multiple face vectors with a user, you'll need to do the following:

  1. Create a User - You must first create a user with CreateUser. You can improve face matching accuracy by aggregating multiple face vectors of the same person into a user vector. You can associate up to 100 face vectors with a user vector.

  2. Associate Faces - After creating the user, you can add existing face vectors to that user with the AssociateFaces operation. Face vectors must reside in the same collection as the user vector in order to be associated with it.
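As a rough sketch of these two steps, the helper below creates a user and then associates already-indexed face IDs with it. It assumes `client` is a boto3 Rekognition client; the name `create_user_with_faces` is hypothetical, and the up-front length check simply mirrors the 100-face limit stated above.

```python
def create_user_with_faces(client, collection_id, user_id, face_ids):
    """Create a user, then associate existing face IDs from the same collection.

    `client` is assumed to be a boto3 Rekognition client. Returns the face IDs
    that the service reports as successfully associated.
    """
    face_ids = list(face_ids)
    if not 1 <= len(face_ids) <= 100:
        raise ValueError("expected between 1 and 100 face IDs")

    client.create_user(CollectionId=collection_id, UserId=user_id)
    response = client.associate_faces(
        CollectionId=collection_id,
        UserId=user_id,
        FaceIds=face_ids,
    )
    return [face["FaceId"] for face in response.get("AssociatedFaces", [])]
```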

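After creating a collection and storing face and user vectors, you can use the following operations to search for face matches:

  • SearchFacesByImage - Searches the stored faces for matches to a face from an input image.

  • SearchFaces - Searches the stored faces for matches to a supplied face ID.

  • SearchUsers - Searches the stored users for matches to a supplied face ID or user ID.

  • SearchUsersByImage - Searches the stored users for matches to a face from an input image.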

Note

Collections store face vectors, which are mathematical representations of faces. Collections do not store images of faces.

You can use collections in a variety of scenarios. For example, you might create a face collection that stores detected faces from scanned employee badge images and government-issued IDs by using the IndexFaces and AssociateFaces operations. When an employee enters the building, an image of the employee's face is captured and sent to the SearchUsersByImage operation. If the face match produces a sufficiently high similarity score (say 99%), you can authenticate the employee.
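The authentication step in this scenario might look like the sketch below. It assumes `client` is a boto3 Rekognition client and that users were already created and associated with faces; the function name `authenticate_employee` and the default of 99 are illustrative choices taken from the example above, not a recommendation for every deployment.

```python
def authenticate_employee(client, collection_id, image_bytes, threshold=99.0):
    """Return the best-matching user ID at or above `threshold`, or None.

    `client` is assumed to be a boto3 Rekognition client. The service only
    returns user matches whose similarity meets UserMatchThreshold.
    """
    response = client.search_users_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        UserMatchThreshold=threshold,
        MaxUsers=1,  # Only the top match is needed for authentication.
    )
    matches = response.get("UserMatches", [])
    if not matches:
        return None
    return matches[0]["User"]["UserId"]
```

Returning `None` rather than raising leaves it to the caller to decide what happens when no user matches, for example falling back to a manual check.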

Managing collections

The face collection is the primary Amazon Rekognition resource, and each face collection you create has a unique Amazon Resource Name (ARN). You create each face collection in a specific AWS Region in your account. When a collection is created, it's associated with the most recent version of the face detection model. For more information, see Model versioning.

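You can perform the following management operations on a collection:

  • CreateCollection - Creates a collection.

  • ListCollections - Lists the collections in your account.

  • DescribeCollection - Returns information about a collection, such as its face count and face model version.

  • DeleteCollection - Deletes a collection.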

Managing faces in a collection

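After you create a face collection, you can store faces in it. Amazon Rekognition provides the following operations for managing faces in a collection:

  • IndexFaces - Detects faces in an image and stores their face vectors in a collection.

  • ListFaces - Lists the faces stored in a collection.

  • DeleteFaces - Deletes faces from a collection.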

Managing users in a collection

After you store multiple face vectors from the same person, you can improve accuracy by associating all of those face vectors into one user vector. You can use the following operations to manage your users:

  • CreateUser - Creates a new user in a collection with a provided unique user ID.

  • AssociateFaces - Adds 1 to 100 unique face IDs to a user ID. After you associate at least one face ID with a user, you can search for matches against that user in your collection.

  • ListUsers - Lists the users in a collection.

  • DeleteUser - Deletes a user from a collection with the provided user ID.

  • DisassociateFaces - Removes one or more face IDs from a user.
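As one example of these management operations, the sketch below lists every user in a collection by following the pagination token that ListUsers returns. It assumes `client` is a boto3 Rekognition client; the function name `list_all_user_ids` is hypothetical.

```python
def list_all_user_ids(client, collection_id):
    """Collect the user IDs in a collection, following NextToken pagination.

    `client` is assumed to be a boto3 Rekognition client.
    """
    user_ids = []
    kwargs = {"CollectionId": collection_id}
    while True:
        page = client.list_users(**kwargs)
        user_ids.extend(user["UserId"] for user in page.get("Users", []))
        token = page.get("NextToken")
        if not token:
            return user_ids  # No more pages.
        kwargs["NextToken"] = token
```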

Using similarity thresholds for associating faces

It’s important to ensure that the faces being associated with a user are all from the same person. To help, the UserMatchThreshold parameter specifies the minimum user match confidence required for a new face to be associated with a UserId that already contains at least one FaceId. This helps ensure that FaceIds are associated with the right UserId. The value ranges from 0 to 100, and the default value is 75.
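The sketch below shows one way to pass UserMatchThreshold to AssociateFaces and to separate the faces that were associated from those that were not. It assumes `client` is a boto3 Rekognition client and that the response carries `AssociatedFaces` and `UnsuccessfulFaceAssociations` lists; the function name `associate_with_threshold` is hypothetical.

```python
def associate_with_threshold(client, collection_id, user_id, face_ids, threshold=75.0):
    """Associate faces with a user and report which ones were rejected.

    `client` is assumed to be a boto3 Rekognition client. Returns a tuple of
    (associated face IDs, {rejected face ID: list of reasons}).
    """
    response = client.associate_faces(
        CollectionId=collection_id,
        UserId=user_id,
        FaceIds=list(face_ids),
        UserMatchThreshold=threshold,
    )
    associated = [face["FaceId"] for face in response.get("AssociatedFaces", [])]
    rejected = {
        face["FaceId"]: face.get("Reasons", [])
        for face in response.get("UnsuccessfulFaceAssociations", [])
    }
    return associated, rejected
```

Rejected faces are good candidates for a manual review step before you retry with a different threshold.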

Guidance for using IndexFaces

The following is guidance for using IndexFaces in common scenarios.

Critical or public safety applications

  • Call IndexFaces with images that each contain only one face, and associate the returned Face ID with the identifier for the subject of the image.

  • You can use DetectFaces ahead of indexing to verify there is only one face in the image. If more than one face is detected, re-submit the image after review and with only one face present. This prevents inadvertently indexing multiple faces and associating them with the same person.
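The single-face check described above can be sketched as follows. It assumes `client` is a boto3 Rekognition client; the function name `index_single_face` and the choice to raise `ValueError` (so the image can be sent for review) are illustrative.

```python
def index_single_face(client, collection_id, image_bytes, subject_id):
    """Index an image only if DetectFaces finds exactly one face in it.

    `client` is assumed to be a boto3 Rekognition client. Returns the face ID
    to associate with `subject_id`, or raises ValueError for review.
    """
    detected = client.detect_faces(Image={"Bytes": image_bytes})
    face_count = len(detected["FaceDetails"])
    if face_count != 1:
        raise ValueError(f"expected exactly one face, found {face_count}")

    response = client.index_faces(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        ExternalImageId=subject_id,
        MaxFaces=1,
    )
    records = response["FaceRecords"]
    if not records:
        # The face was detected but not indexed, for example due to quality.
        raise ValueError("face was detected but not indexed")
    return records[0]["Face"]["FaceId"]
```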

Photo sharing and social media applications

  • You should call IndexFaces without restrictions on images that contain multiple faces in use cases such as family albums. In such cases, you need to identify each person in every photo and use that information to group photos by the people present in them.

General usage

  • Index multiple different images of the same person, particularly images with different face attributes (facial poses, facial hair, etc.). Then create a user and associate the different faces with that user to improve matching quality.

  • Include a review process so that failed matches can be indexed with the correct face identifier to improve subsequent face matching ability.

  • For information about image quality, see Recommendations for facial comparison input images.

Searching for faces and users within a collection

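After you create a face collection and store face vectors and/or user vectors, you can search the collection for face matches. With Amazon Rekognition, you can search a collection for faces that match a supplied face ID (SearchFaces) or a face in a supplied image (SearchFacesByImage), as well as faces detected in stored videos and streaming videos.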

You can use the CompareFaces operation to compare a face in a source image with faces in the target image. The scope of this comparison is limited to the faces that are detected in the target image. For more information, see Comparing faces in images.
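For contrast with the collection searches, here is a minimal CompareFaces sketch that involves no collection at all. It assumes `client` is a boto3 Rekognition client; the function name `best_similarity` and the default threshold of 80 are illustrative.

```python
def best_similarity(client, source_bytes, target_bytes, threshold=80.0):
    """Compare the source face with the faces detected in the target image.

    `client` is assumed to be a boto3 Rekognition client. Returns the highest
    similarity among the returned matches, or None if there are none.
    """
    response = client.compare_faces(
        SourceImage={"Bytes": source_bytes},
        TargetImage={"Bytes": target_bytes},
        SimilarityThreshold=threshold,
    )
    similarities = [match["Similarity"] for match in response.get("FaceMatches", [])]
    return max(similarities, default=None)
```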

The search operations (SearchFaces, SearchFacesByImage, SearchUsers, and SearchUsersByImage) compare a face, identified either by a FaceId or an input image, with all of the faces or users stored in a given face collection.

Using similarity thresholds to match faces

You can control the results of the search operations (SearchFaces, SearchFacesByImage, SearchUsers, SearchUsersByImage) and of CompareFaces by providing a similarity threshold as an input parameter.

FaceMatchThreshold is the similarity threshold input attribute for SearchFaces and SearchFacesByImage, and it controls how many results are returned based on the similarity to the face being matched. The similarity threshold attribute for SearchUsers and SearchUsersByImage is UserMatchThreshold, and it controls how many results are returned based on the similarity to the user vector being matched. For CompareFaces, the threshold attribute is SimilarityThreshold.

Matches with a Similarity value that's lower than the threshold aren't returned. It's important to calibrate this threshold for your use case, because it determines how many false positives are included in your match results. The threshold also controls the recall of your search results: the lower the threshold, the higher the recall, at the cost of more false positives.

All machine learning systems are probabilistic. You should use your judgment in setting the right similarity threshold, depending on your use case. For example, if you're looking to build a photos app to identify similar-looking family members, you might choose a lower threshold (such as 80%). On the other hand, for many law enforcement use cases, we recommend using a high threshold value of 99% or above to reduce accidental misidentification.

In addition to FaceMatchThreshold and UserMatchThreshold, you can use the Similarity response attribute as a means to reduce accidental misidentification. For instance, you can choose to use a low threshold (like 80%) to return more results. Then you can use the response attribute Similarity (percentage of similarity) to narrow the choice and filter for the right responses in your application. Again, using a higher similarity (such as 99% and above) reduces the risk of misidentification.
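The two-stage approach described above (search with a low threshold, then filter on Similarity in your application) can be sketched like this. It assumes `client` is a boto3 Rekognition client; the function name `search_then_filter` and the 80 and 99 defaults are illustrative values taken from the text.

```python
def search_then_filter(client, collection_id, image_bytes,
                       search_threshold=80.0, accept_similarity=99.0):
    """Search broadly, then keep only high-similarity matches.

    `client` is assumed to be a boto3 Rekognition client. Returns a tuple of
    (all candidates at or above `search_threshold`, the subset whose
    Similarity is at or above `accept_similarity`).
    """
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=search_threshold,
    )
    candidates = response.get("FaceMatches", [])
    accepted = [m for m in candidates if m["Similarity"] >= accept_similarity]
    return candidates, accepted
```

Keeping the full candidate list alongside the accepted subset lets an application show near-misses to a human reviewer without treating them as identifications.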