Embed the Honeypot link in your web application (optional) - Security Automations for AWS WAF

Embed the Honeypot link in your web application (optional)

If you chose yes for the Activate Bad Bot Protection parameter in Step 1. Launch the stack, the CloudFormation template creates a trap endpoint to a low-interaction production honeypot. This trap is intended to detect and divert inbound requests from content scrapers and bad bots. Valid users won’t attempt to access this endpoint.

However, content scrapers and bots, such as malware that scans for security vulnerabilities and scrapes email addresses, might attempt to access the trap endpoint. In this scenario, the Access Handler Lambda function inspects the request to extract its origin IP address, and then updates the associated AWS WAF rule to block subsequent requests from that IP address.
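The inspection step described above can be sketched as follows. This is a minimal illustration, not the solution's actual Access Handler code: the helper names `extract_source_ip` and `to_cidr` are hypothetical, and the sketch assumes an API Gateway proxy event in which the first `X-Forwarded-For` entry is the client address, with `requestContext.identity.sourceIp` as a fallback. Because that header can be supplied by the client, treat the choice of source as an assumption to verify in your own deployment.

```python
from ipaddress import ip_address


def extract_source_ip(event: dict) -> str:
    """Return the caller's IP from an API Gateway proxy event (hypothetical helper)."""
    # Header names can arrive in any case, so normalize them first.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    forwarded = headers.get("x-forwarded-for", "")
    if forwarded:
        # Assumption: the first entry is the original client.
        candidate = forwarded.split(",")[0].strip()
    else:
        candidate = event["requestContext"]["identity"]["sourceIp"]
    # ip_address() raises ValueError on malformed input, which rejects junk values.
    return str(ip_address(candidate))


def to_cidr(ip: str) -> str:
    """Format a single address in CIDR notation, the form an AWS WAF IP set stores."""
    addr = ip_address(ip)
    return f"{addr}/{32 if addr.version == 4 else 128}"
```

A handler built on this sketch would pass the result of `to_cidr` to whatever call updates the block list. That call is intentionally left out here.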

Use one of the following procedures to embed the honeypot link for requests from either a CloudFront distribution or an ALB.

Create a CloudFront Origin for the Honeypot Endpoint

Use this procedure for web applications that are deployed with a CloudFront distribution. With CloudFront, you can include a robots.txt file to help identify content scrapers and bots that ignore the robots exclusion standard. Complete the following steps to embed the hidden link and then explicitly disallow it in your robots.txt file.

  1. Sign in to the AWS CloudFormation console.

  2. Choose the stack that you built in Step 1. Launch the stack.

  3. Choose the Outputs tab.

  4. From the BadBotHoneypotEndpoint key, copy the endpoint URL. It contains two components that you need to complete this procedure:

    • The endpoint host name (for example, xxxxxxxxxx.execute-api.region.amazonaws.com)

    • The request URI (/ProdStage)

  5. Sign in to the Amazon CloudFront console.

  6. Choose the distribution that you want to use.

  7. Choose Distribution Settings.

  8. On the Origins tab, choose Create Origin.

  9. In the Origin Domain Name field, paste the host name component of the endpoint URL that you copied in step 4 of this procedure.

  10. In Origin Path, paste the request URI component of the endpoint URL that you copied in step 4 of this procedure.

  11. Accept the default values for the other fields.

  12. Choose Create.

  13. On the Behaviors tab, choose Create Behavior.

  14. Create a new cache behavior and point it to the new origin. You can use a custom path pattern, such as a fake product name that’s similar to other content in your web application.

  15. Embed a link to this behavior path, which points to the honeypot, in your content. Hide this link from your human users. As an example, review the following code sample:

    <a href="/behavior_path" rel="nofollow" style="display: none" aria-hidden="true">honeypot link</a>
    Note

    It’s your responsibility to verify what tag values work in your website environment. Don’t use rel="nofollow" if your environment doesn’t observe it. For more information about robots meta tags configuration, refer to the Google developer's guide.

  16. Modify the robots.txt file in the root of your website to explicitly disallow the honeypot link, as follows:

    User-agent: *
    Disallow: /<behavior_path>
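The URL handling in the steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the example host `abc123.execute-api.us-east-1.amazonaws.com` and site `https://www.example.com` are placeholders. It splits a copied endpoint URL into the Origin Domain Name and Origin Path values, builds the robots.txt rule, and uses the standard library's `urllib.robotparser` to confirm that a crawler that honors the robots exclusion standard would not fetch the honeypot path.

```python
from typing import Tuple
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def split_endpoint(endpoint_url: str) -> Tuple[str, str]:
    """Split the endpoint URL into (origin domain name, origin path)."""
    parts = urlsplit(endpoint_url)
    return parts.netloc, parts.path.rstrip("/")


def robots_txt(behavior_path: str) -> str:
    """Build a robots.txt body that disallows the honeypot behavior path."""
    return f"User-agent: *\nDisallow: /{behavior_path.lstrip('/')}\n"


def is_disallowed(robots: str, site: str, behavior_path: str) -> bool:
    """Return True if a compliant crawler would be told not to fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots.splitlines())
    return not parser.can_fetch("*", f"{site}/{behavior_path.lstrip('/')}")


if __name__ == "__main__":
    # Placeholder endpoint; use the value from your own stack outputs.
    host, path = split_endpoint(
        "https://abc123.execute-api.us-east-1.amazonaws.com/ProdStage"
    )
    print(host, path)
    rules = robots_txt("/behavior_path")
    print(is_disallowed(rules, "https://www.example.com", "/behavior_path"))
```

A check like `is_disallowed` only confirms what well-behaved crawlers are asked to do. Requests that reach the honeypot anyway are the ones the trap is meant to catch.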

Embed the Honeypot Endpoint as an External Link

Use this procedure for web applications that are deployed with an ALB.

  1. Sign in to the AWS CloudFormation console.

  2. Choose the stack that you built in Step 1. Launch the stack.

  3. Choose the Outputs tab.

  4. From the BadBotHoneypotEndpoint key, copy the endpoint URL.

  5. Embed this endpoint link in your web content. Use the full URL that you copied in step 4 of this procedure. Hide this link from your human users. As an example, review the following code sample:

    <a href="<BadBotHoneypotEndpoint value>" rel="nofollow" style="display: none" aria-hidden="true">honeypot link</a>
    Note

    This procedure uses rel="nofollow" to instruct robots not to access the honeypot URL. However, because the link is embedded externally, you can’t include a robots.txt file to explicitly disallow the link. It’s your responsibility to verify what tags work in your website environment. Don’t use rel="nofollow" if your environment doesn’t observe it.
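As a rough sketch of the ALB procedure, the following Python reads the BadBotHoneypotEndpoint value from a dictionary shaped like the response of boto3's CloudFormation `describe_stacks` call and renders the hidden link. The function names `honeypot_endpoint` and `hidden_link` are hypothetical, and the sketch assumes you have already fetched the response yourself. The URL is HTML-escaped so that characters such as `&` don't break the attribute.

```python
from html import escape


def honeypot_endpoint(describe_stacks_response: dict) -> str:
    """Find the BadBotHoneypotEndpoint value in a describe_stacks-style response."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == "BadBotHoneypotEndpoint":
                return output["OutputValue"]
    raise KeyError("BadBotHoneypotEndpoint output not found")


def hidden_link(href: str, text: str = "honeypot link") -> str:
    """Render the hidden anchor tag, mirroring the attributes in the sample above."""
    return (
        f'<a href="{escape(href, quote=True)}" rel="nofollow" '
        f'style="display: none" aria-hidden="true">{escape(text)}</a>'
    )
```

Whether `rel="nofollow"` and `display: none` behave as intended still depends on your website environment, as the note above says.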