Building a scalable web crawling system on AWS - AWS Prescriptive Guidance

Building a scalable web crawling system on AWS

This section explains how to build the web crawler introduced in the Architecture section. It first presents a systematic approach to creating a robust dataset of companies and their associated web properties, which serves as the foundation for your crawling activities. It then describes how to build an ethical web crawler in Python.
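As a rough illustration of what "ethical" can mean in practice, the following minimal sketch checks a site's robots.txt rules before fetching a URL, using Python's standard `urllib.robotparser` module. It is not the crawler from this guide. The user agent name `example-crawler`, the helper names, and the inlined sample robots.txt are assumptions made for illustration, and the rules are parsed from a string so that the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity (an assumption, not a name from this guide).
USER_AGENT = "example-crawler"

# Sample robots.txt content, inlined so the sketch runs offline.
# A real crawler would download this file from the target site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A path with no matching Disallow rule is permitted.
allowed_public = is_allowed(parser, "https://example.com/about")

# A path under /private/ is blocked by the sample rules.
allowed_private = is_allowed(parser, "https://example.com/private/data")

# Requested delay in seconds between requests, or None if not specified.
delay = parser.crawl_delay(USER_AGENT)
```

In a real crawler, a check like `is_allowed` would typically run before each request, and the crawl delay (when a site specifies one) would be used to pace requests to that host.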