Amazon Comprehend
Developer Guide

Detect Entities Version 2

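Use the DetectEntitiesV2 or StartEntitiesDetectionV2Job operation to detect the medical entities in your text. These operations detect entities in the following categories:

  • ANATOMY

  • MEDICAL_CONDITION

  • MEDICATION

  • PROTECTED_HEALTH_INFORMATION

  • TEST_TREATMENT_PROCEDURE
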
All five categories are detected by the DetectEntitiesV2 operation. The DetectPHI and StartPHIDetectionJob operations detect entities only in the PROTECTED_HEALTH_INFORMATION category. Use them when only PHI (protected health information) is required. For information about these operations, see Detect PHI.
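
As a minimal sketch of choosing between the two synchronous operations, the helper below calls detect_phi when only PHI is needed and detect_entities_v2 otherwise. It assumes a client that behaves like the boto3 "comprehendmedical" client. The helper name detect and the StubClient class are hypothetical and exist only so that the example runs offline.

```python
def detect(client, text, phi_only=False):
    """Return the list of entities detected in `text`.

    `client` is assumed to behave like boto3.client("comprehendmedical"),
    which exposes detect_entities_v2(Text=...) and detect_phi(Text=...).
    """
    if phi_only:
        response = client.detect_phi(Text=text)
    else:
        response = client.detect_entities_v2(Text=text)
    return response["Entities"]


class StubClient:
    """Offline stand-in for the real client (hypothetical, for illustration)."""

    def __init__(self):
        self.calls = []

    def detect_entities_v2(self, Text):
        self.calls.append("detect_entities_v2")
        return {"Entities": []}

    def detect_phi(self, Text):
        self.calls.append("detect_phi")
        return {"Entities": []}


stub = StubClient()
detect(stub, "Patient takes ibuprofen 200 mg.")
detect(stub, "Patient is John Doe.", phi_only=True)
print(stub.calls)
```

With valid AWS credentials, passing boto3.client("comprehendmedical") in place of the stub should send the same requests to the service.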


Amazon Comprehend Medical provides confidence scores that indicate the level of confidence in the accuracy of detected entities. When you are identifying protected health information (PHI), evaluate these scores and identify the right confidence threshold for your use case. Use high-confidence thresholds in situations that require high accuracy. For certain use cases, results should be reviewed and verified by appropriately trained human reviewers. For example, don't use Amazon Comprehend Medical in patient care scenarios unless you also have trained medical professionals reviewing your results for accuracy and exercising medical judgment.

Amazon Comprehend Medical detects information in the following classes:

  • Entity: A text reference to the name of relevant objects, such as people, treatments, medications, and medical conditions. For example, ibuprofen.

  • Category: The generalized grouping to which an entity belongs. For example, ibuprofen is part of the MEDICATION category.

  • Type: The type of entity detected within a single category. For example, ibuprofen is in the GENERIC_NAME type in the MEDICATION category.

  • Attribute: Information related to an entity, such as the dosage of a medication. For example, 200 mg is an attribute of the ibuprofen entity.

  • Trait: Something that Amazon Comprehend Medical understands about an entity, based on context. For example, a medication has the NEGATION trait if a patient is not taking it.
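
The classes above map onto fields of each item in the response's Entities list. The sketch below uses a hand-written sample shaped like one such item. The scores are illustrative rather than real service output, and the describe helper is a hypothetical name.

```python
# Hand-written sample shaped like one item of the "Entities" list returned by
# DetectEntitiesV2. The scores are illustrative, not real service output.
sample_entity = {
    "Text": "ibuprofen",            # the entity
    "Category": "MEDICATION",       # the category it belongs to
    "Type": "GENERIC_NAME",         # the type within that category
    "Score": 0.99,
    "Traits": [{"Name": "NEGATION", "Score": 0.95}],
    "Attributes": [
        {"Text": "200 mg", "Type": "DOSAGE", "Score": 0.98,
         "RelationshipScore": 0.97},
    ],
}


def describe(entity):
    """Summarize one entity's category, type, traits, and attributes."""
    traits = ", ".join(t["Name"] for t in entity.get("Traits", [])) or "none"
    attrs = ", ".join(
        f'{a["Type"]}={a["Text"]}' for a in entity.get("Attributes", [])
    ) or "none"
    return (f'{entity["Text"]} [{entity["Category"]}/{entity["Type"]}] '
            f'traits: {traits}; attributes: {attrs}')


print(describe(sample_entity))
```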

Amazon Comprehend Medical provides the location of an entity in the input text. The Amazon Comprehend console shows the location graphically. The API returns the location as numerical character offsets (the BeginOffset and EndOffset fields).

Each entity and attribute includes a score that indicates the confidence level that Amazon Comprehend Medical has in the accuracy of the detection. Each attribute also has a relationship score, which indicates the confidence level that Amazon Comprehend Medical has in the accuracy of the relationship between the attribute and its parent entity. Identify the right confidence threshold for your use case. Use high-confidence thresholds in situations that require high accuracy, and filter out data that doesn't meet the threshold.
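
One way to apply such thresholds is sketched below. It drops entities whose Score is too low and, for the entities that remain, drops attributes whose Score or RelationshipScore is too low. The function name and the 0.9 defaults are assumptions chosen for illustration, and the sample data is hand-written.

```python
def filter_by_confidence(entities, min_score=0.9, min_relationship_score=0.9):
    """Keep entities, and their attributes, that meet the thresholds."""
    kept = []
    for entity in entities:
        if entity["Score"] < min_score:
            continue  # the detection itself is not confident enough
        attributes = [
            a for a in entity.get("Attributes", [])
            if a["Score"] >= min_score
            and a["RelationshipScore"] >= min_relationship_score
        ]
        # Copy the entity so the caller's data is left unmodified.
        kept.append({**entity, "Attributes": attributes})
    return kept


# Illustrative data only: these scores are not real service output.
entities = [
    {"Text": "ibuprofen", "Score": 0.99, "Attributes": [
        {"Text": "200 mg", "Score": 0.98, "RelationshipScore": 0.97},
        {"Text": "daily", "Score": 0.95, "RelationshipScore": 0.40},
    ]},
    {"Text": "aspirin", "Score": 0.55, "Attributes": []},
]

filtered = filter_by_confidence(entities)
print([(e["Text"], [a["Text"] for a in e["Attributes"]]) for e in filtered])
```

The appropriate threshold values depend on the use case and should be chosen by evaluating real results.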

Anatomy Category

The ANATOMY category detects references to the parts of the body or body systems and the locations of those parts or systems. The category has one entity type and one attribute.

Types

  • SYSTEM_ORGAN_SITE: Body systems, anatomic locations or regions, and body sites.

Attributes

  • DIRECTION: Directional terms. For example, left, right, medial, lateral, upper, lower, posterior, anterior, distal, proximal, contralateral, bilateral, ipsilateral, dorsal, ventral, and so on.

Medical Condition Category

The MEDICAL_CONDITION category detects the symptoms and diagnosis of medical conditions. The category has one entity type, one attribute, and four traits. One or more traits can be associated with a type.

Types

  • DX_NAME: All medical conditions listed. The DX_NAME type includes present illness, reason for visit, medical history, review of systems, family history, or patient education.

Attributes

  • ACUITY: Determination of disease instance, such as chronic, acute, sudden, persistent, or gradual.

Traits

  • DIAGNOSIS: A medical condition that is determined as the cause or result of the symptoms. Symptoms can be found through physical findings, laboratory or radiological reports, or any other means. Applies only to the DX_NAME type.

  • NEGATION: An indication that a result or action is negative or not being performed.

  • SIGN: A medical condition that the physician reported. Applies only to the DX_NAME type.

  • SYMPTOM: A medical condition reported by the patient. Applies only to the DX_NAME type.
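
Because a single condition can carry several traits, a caller typically checks for NEGATION before treating a condition as present. The sketch below separates negated conditions from the rest. The helper names, the 0.5 trait-score cutoff, and the sample entities are assumptions for illustration.

```python
def has_trait(entity, name, min_score=0.5):
    """True if the entity carries the named trait with enough confidence."""
    return any(
        t["Name"] == name and t["Score"] >= min_score
        for t in entity.get("Traits", [])
    )


def split_conditions(entities):
    """Return (present, negated) lists of MEDICAL_CONDITION entity texts."""
    present, negated = [], []
    for entity in entities:
        if entity["Category"] != "MEDICAL_CONDITION":
            continue
        target = negated if has_trait(entity, "NEGATION") else present
        target.append(entity["Text"])
    return present, negated


# Illustrative entities, not real service output.
entities = [
    {"Text": "headache", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME",
     "Traits": [{"Name": "SYMPTOM", "Score": 0.92}]},
    {"Text": "fever", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME",
     "Traits": [{"Name": "SYMPTOM", "Score": 0.90},
                {"Name": "NEGATION", "Score": 0.96}]},
    {"Text": "ibuprofen", "Category": "MEDICATION", "Type": "GENERIC_NAME",
     "Traits": []},
]

present, negated = split_conditions(entities)
print(present, negated)
```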

Medication Category

The MEDICATION category detects medication and dosage information for the patient. The category has two entity types, seven attributes, and one trait. One or more attributes can apply to a type.

Types

  • BRAND_NAME: The copyrighted brand name of the medication or therapeutic agent.

  • GENERIC_NAME: Non-brand name, ingredient name, or formula mixture of the medication or therapeutic agent.

Attributes

  • DOSAGE: The amount of medication ordered.

  • DURATION: How long the medication should be administered.

  • FORM: The form of the medication.

  • FREQUENCY: How often to administer the medication.

  • RATE: Primarily for medication infusions or IVs, the administration rate of the medication.

  • ROUTE_OR_MODE: The administration method of a medication.

  • STRENGTH: The medication strength.

Traits

  • NEGATION: Any indication that the patient is not taking a medication.
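
The medication attributes can be collected into one record per medication. The sketch below flattens a MEDICATION entity into a dictionary keyed by lowercased attribute type and records whether the NEGATION trait is present. The function name, the output layout, and the sample entity are hypothetical.

```python
def summarize_medication(entity):
    """Flatten a MEDICATION entity into a dict keyed by attribute type."""
    summary = {
        "name": entity["Text"],
        "name_type": entity["Type"],  # BRAND_NAME or GENERIC_NAME
        "negated": any(
            t["Name"] == "NEGATION" for t in entity.get("Traits", [])
        ),
    }
    for attribute in entity.get("Attributes", []):
        # An attribute type can occur more than once, so collect lists.
        key = attribute["Type"].lower()
        summary.setdefault(key, []).append(attribute["Text"])
    return summary


# Illustrative entity, not real service output.
medication = {
    "Text": "ibuprofen",
    "Category": "MEDICATION",
    "Type": "GENERIC_NAME",
    "Traits": [],
    "Attributes": [
        {"Type": "DOSAGE", "Text": "200 mg"},
        {"Type": "FREQUENCY", "Text": "twice daily"},
        {"Type": "ROUTE_OR_MODE", "Text": "orally"},
    ],
}

summary = summarize_medication(medication)
print(summary)
```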

Protected Health Information Category

The PROTECTED_HEALTH_INFORMATION category detects the patient's personal information. The category has eight entity types. For complete information about the PROTECTED_HEALTH_INFORMATION category and how it is detected, see Detect PHI.

Types

  • ADDRESS: All geographical subdivisions of an address of any facility, named medical facilities, or wards within a facility.

  • AGE: All components of age, spans of age, or any age mentioned, including those of a patient, family members, or others. The default is in years unless otherwise noted.

  • EMAIL: Any email address.

  • ID: Social security number, medical record number, facility identification number, clinical trial number, certificate or license number, vehicle or device number, or biometric number of the patient, place of care, or provider.

  • NAME: All names. Typically, names of the patient, family, or provider.

  • PHONE_OR_FAX: Any phone, fax, or pager number. Excludes named phone numbers, such as 1-800-QUIT-NOW and 911.

  • PROFESSION: Any profession or employer that pertains to the patient or the patient's family, but not to the profession of the clinician mentioned in the note.
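
A common use of PHI detection is masking the detected spans. The sketch below replaces each PHI span with its type in brackets, using the BeginOffset and EndOffset fields, which are assumed here to be character offsets into the input string. The function names, the 0.5 score cutoff, and the sample entities (whose offsets are computed locally rather than returned by the service) are illustrative.

```python
def redact_phi(text, entities, min_score=0.5):
    """Replace each confident PHI span with its type in brackets."""
    spans = sorted(
        (e for e in entities
         if e["Category"] == "PROTECTED_HEALTH_INFORMATION"
         and e["Score"] >= min_score),
        key=lambda e: e["BeginOffset"],
        reverse=True,  # edit right to left so earlier offsets stay valid
    )
    for e in spans:
        text = text[:e["BeginOffset"]] + f'[{e["Type"]}]' + text[e["EndOffset"]:]
    return text


def make_phi_entity(text, phrase, phi_type, score):
    """Build an illustrative PHI entity by locating `phrase` in `text`."""
    begin = text.index(phrase)
    return {
        "Text": phrase,
        "Category": "PROTECTED_HEALTH_INFORMATION",
        "Type": phi_type,
        "Score": score,
        "BeginOffset": begin,
        "EndOffset": begin + len(phrase),
    }


note = "John Doe, 45, can be reached at jdoe@example.com."
phi = [
    make_phi_entity(note, "John Doe", "NAME", 0.99),
    make_phi_entity(note, "45", "AGE", 0.97),
    make_phi_entity(note, "jdoe@example.com", "EMAIL", 0.99),
]

redacted = redact_phi(note, phi)
print(redacted)
```

Whether a low-confidence span should be masked anyway depends on the use case; a stricter redaction policy might lower min_score rather than raise it.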

Test Treatment Procedure Category

The TEST_TREATMENT_PROCEDURE category detects the procedures that are used to determine a medical condition. The category contains three entity types and two attributes. One or more attributes can be related to an entity of the TEST_NAME type.

Types

  • PROCEDURE_NAME: Interventions as a one-time action performed on the patient to treat a medical condition or to provide patient care.

  • TEST_NAME: Procedures performed on a patient for diagnostic, measurement, screening, or rating that might have a resulting value. This includes any procedure, process, evaluation, or rating to determine a diagnosis, to rule out or find a condition, or to scale or score a patient.

  • TREATMENT_NAME: Interventions performed over a span of time for combating a disease or disorder. This includes groupings of medications, such as antivirals and vaccinations.

Attributes

  • TEST_VALUE: The result of a test. Applies only to the TEST_NAME entity type.

  • TEST_UNIT: The unit of measure that might accompany the value of the test. Applies only to the TEST_NAME entity type.
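
Because TEST_VALUE and TEST_UNIT apply only to TEST_NAME entities, test results can be read by pairing each TEST_NAME with its attributes. The sketch below returns (name, value, unit) tuples and leaves a missing value or unit as None. The function name and the sample entities are hypothetical, and the sketch keeps only one value and one unit per test for simplicity.

```python
def collect_test_results(entities):
    """Return (test name, value, unit) tuples for TEST_NAME entities."""
    results = []
    for entity in entities:
        if (entity["Category"] != "TEST_TREATMENT_PROCEDURE"
                or entity["Type"] != "TEST_NAME"):
            continue
        # If an attribute type repeats, the last occurrence wins.
        by_type = {a["Type"]: a["Text"] for a in entity.get("Attributes", [])}
        results.append(
            (entity["Text"], by_type.get("TEST_VALUE"), by_type.get("TEST_UNIT"))
        )
    return results


# Illustrative entities, not real service output.
entities = [
    {"Text": "hemoglobin", "Category": "TEST_TREATMENT_PROCEDURE",
     "Type": "TEST_NAME",
     "Attributes": [{"Type": "TEST_VALUE", "Text": "13.5"},
                    {"Type": "TEST_UNIT", "Text": "g/dL"}]},
    {"Text": "chest x-ray", "Category": "TEST_TREATMENT_PROCEDURE",
     "Type": "TEST_NAME", "Attributes": []},
    {"Text": "appendectomy", "Category": "TEST_TREATMENT_PROCEDURE",
     "Type": "PROCEDURE_NAME", "Attributes": []},
]

results = collect_test_results(entities)
print(results)
```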