Centralized network security for VPC-to-VPC and on-premises to VPC traffic - Building a Scalable and Secure Multi-VPC AWS Network Infrastructure

AWS provides security groups and subnet network ACLs (NACLs) to implement network security within your Landing Zone. These are layer 4 firewalls. There may be scenarios where a customer wants to implement a layer 7 firewall/IPS/IDS within their Landing Zone to inspect traffic flowing between VPCs or between an on-premises data center and a VPC. This can be achieved using Transit Gateway and third-party software appliances running on EC2 instances. Using the architecture in Figure 14, we can enable VPC-to-VPC and on-premises-to-VPC traffic to flow via the EC2 instances. The setup is similar to the one discussed in Figure 12, with two changes: we remove the blackhole route in Route Table 1 to allow inter-VPC traffic flow, and we associate the VPN attachment and/or Direct Connect gateway attachment with Route Table 1 to allow hybrid traffic flow. As a result, all traffic coming from the spokes flows to the egress VPC before being sent to its destination. You need static routes in the egress VPC subnet route table (where the firewall EC2 appliances reside) to send traffic destined for spoke VPC and on-premises CIDRs back through Transit Gateway after inspection.

Note

Route information is not dynamically propagated from Transit Gateway into the subnet route table and must be statically entered. There is a soft limit of 50 static routes on a subnet route table.
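As a rough illustration of the static routes described above, the following Python sketch builds one route entry per spoke or on-premises CIDR, merges exactly adjacent CIDRs to save entries, and checks the result against the 50-route limit. The route table ID, Transit Gateway ID, and CIDR ranges are hypothetical placeholders. The dictionary keys follow the parameters of the EC2 CreateRoute API (RouteTableId, DestinationCidrBlock, TransitGatewayId), so each entry could be passed to an SDK call, but the sketch itself makes no AWS calls.

```python
import ipaddress

# Hypothetical identifiers and limit for illustration only.
ROUTE_TABLE_ID = "rtb-EXAMPLE"       # egress VPC subnet route table (assumed name)
TRANSIT_GATEWAY_ID = "tgw-EXAMPLE"   # Transit Gateway (assumed name)
STATIC_ROUTE_LIMIT = 50              # soft limit quoted in the note above


def build_static_routes(cidrs, route_table_id, transit_gateway_id,
                        limit=STATIC_ROUTE_LIMIT):
    """Return CreateRoute-style parameter dicts pointing each CIDR at the TGW."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    # collapse_addresses merges only ranges that combine exactly,
    # so no extra address space is routed by accident.
    summarized = list(ipaddress.collapse_addresses(networks))
    if len(summarized) > limit:
        raise ValueError(
            f"{len(summarized)} static routes exceed the limit of {limit}"
        )
    return [
        {
            "RouteTableId": route_table_id,
            "DestinationCidrBlock": str(network),
            "TransitGatewayId": transit_gateway_id,
        }
        for network in summarized
    ]


# Hypothetical spoke VPC and on-premises ranges.
spoke_cidrs = ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16"]
on_prem_cidrs = ["192.168.0.0/16"]

routes = build_static_routes(
    spoke_cidrs + on_prem_cidrs, ROUTE_TABLE_ID, TRANSIT_GATEWAY_ID
)
for route in routes:
    print(route["DestinationCidrBlock"], "->", route["TransitGatewayId"])
```

In this example 10.2.0.0/16 and 10.3.0.0/16 merge into 10.2.0.0/15, so four CIDRs need only three static routes. Planning spoke CIDRs so that they summarize well is one way to stay under the route limit.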

Figure 14 – VPC-to-VPC and VPC-on-premises traffic control

Key considerations when sending traffic to EC2 instances for in-line inspection:

  • Additional Transit Gateway data processing charges

  • Traffic must go through two additional hops (EC2 instance and Transit Gateway)

  • Potential for bandwidth and performance bottlenecks

  • Additional complexity of maintaining, managing, and scaling EC2 instances:

    • Detecting failure and failing over to standby

    • Tracking usage and horizontal/vertical scaling

    • Firewall configuration, patch management

    • Source Network Address Translation (SNAT) of traffic when load balancing to guarantee symmetric flow

You should be selective about which traffic passes through these EC2 instances. One way to proceed is to define security zones and inspect only traffic to and from untrusted zones. An untrusted zone can be a remote site managed by a third party, a vendor VPC you don't control or trust, or a sandbox/dev VPC that has a more relaxed security framework than the rest of your environment. Figure 15 enables direct traffic flow between trusted networks while inspecting traffic flowing to and from untrusted networks using in-line EC2 instances. We created three zones in this example:

  • Untrusted Zone: This is for any traffic coming from the "VPN to remote untrusted site" or the third-party vendor VPC.

  • Prod Zone: This contains traffic from the production VPC and the on-premises customer DC.

  • Dev Zone: This contains traffic from the two development VPCs.

The following are sample rules we define for communication across zones:

  1. Untrusted Zone ↔ Prod Zone: Communication not allowed

  2. Prod Zone ↔ Dev Zone: Communication allowed via EC2 FW appliances in egress VPC

  3. Untrusted Zone ↔ Dev Zone: Communication allowed via EC2 FW appliances in egress VPC

  4. Prod Zone ↔ Prod Zone and Dev Zone ↔ Dev Zone: Direct communication via Transit Gateway
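The sample rules can be written down as a small lookup table, which is handy for reviewing a zone design before translating it into route tables. The Python sketch below is only a model of the four rules listed above. The zone names and the Action type are illustrative, and the default of denying any pair the rules do not mention (for example, Untrusted Zone to Untrusted Zone) is an assumption, not something the rules state.

```python
from enum import Enum


class Action(Enum):
    DENY = "communication not allowed"
    INSPECT = "allowed via EC2 FW appliances in egress VPC"
    DIRECT = "direct communication via Transit Gateway"


# Rules are symmetric, so each zone pair is stored as an unordered set.
# A single-element set represents traffic within one zone.
RULES = {
    frozenset({"untrusted", "prod"}): Action.DENY,     # rule 1
    frozenset({"prod", "dev"}): Action.INSPECT,        # rule 2
    frozenset({"untrusted", "dev"}): Action.INSPECT,   # rule 3
    frozenset({"prod"}): Action.DIRECT,                # rule 4
    frozenset({"dev"}): Action.DIRECT,                 # rule 4
}


def action_for(src_zone, dst_zone):
    """Look up the action for a zone pair; unlisted pairs are denied (assumption)."""
    return RULES.get(frozenset({src_zone, dst_zone}), Action.DENY)


for src, dst in [("untrusted", "prod"), ("dev", "prod"), ("dev", "dev")]:
    print(f"{src} -> {dst}: {action_for(src, dst).value}")
```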

This setup has three security zones, but you might have more. You can use multiple route tables and blackhole routes to achieve security isolation and optimal traffic flow. Choosing the right zones depends on your overall Landing Zone design strategy (account structure, VPC design). You can define zones to isolate business units, applications, environments, and so on.
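To see how route tables and blackhole routes combine to isolate zones, the following Python sketch models a single Transit Gateway route table and resolves a destination using longest-prefix matching, which is how Transit Gateway selects among overlapping routes. The CIDRs, attachment names, and the route table contents are hypothetical and are not taken from Figure 15. They show one possible table for attachments in the Prod Zone.

```python
import ipaddress

BLACKHOLE = "blackhole"

# Hypothetical route table associated with Prod Zone attachments.
PROD_ROUTE_TABLE = {
    "10.10.0.0/16": "prod-vpc-attachment",     # stays inside the Prod Zone
    "172.16.0.0/12": "dx-gateway-attachment",  # on-premises customer DC
    "10.99.0.0/16": BLACKHOLE,                 # untrusted vendor VPC: dropped
    "0.0.0.0/0": "egress-vpc-attachment",      # everything else is inspected
}


def lookup(route_table, destination_ip):
    """Return the target of the most specific matching route, or None."""
    address = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix wins, so a blackhole route overrides the default route.
    return max(matches, key=lambda match: match[0])[1]


for ip in ["10.10.1.5", "10.99.0.7", "10.20.3.4"]:
    print(ip, "->", lookup(PROD_ROUTE_TABLE, ip))
```

Here traffic to the hypothetical Dev range 10.20.0.0/16 has no specific route, so it falls through to the default route and reaches the firewall appliances, while the more specific blackhole route drops traffic to the untrusted range before the default route can match.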

In this example, we terminate the untrusted remote VPN on Transit Gateway and send all traffic to software FW appliances on EC2 for inspection. Alternatively, you can terminate these VPNs directly on the EC2 instances instead of on Transit Gateway. With this approach, the untrusted VPN traffic never directly interacts with Transit Gateway, the number of hops in the traffic flow is reduced by one, and you save on AWS VPN costs. To enable dynamic route exchange (so that Transit Gateway learns the CIDR of the remote VPN via BGP), the firewall instances should be connected to Transit Gateway via VPN. In the native TGW attachment model, you must add static routes in the TGW route table for the VPN CIDR, with the egress/security VPC as the next hop. In our setup (Figure 15), we have a default route to the egress VPC for all traffic, so we don't have to add any specific static routes. With this approach you move away from a fully managed Transit Gateway VPN termination endpoint to a self-managed EC2 instance, which adds VPN management overhead as well as additional compute and memory load on the EC2 instance.

Figure 15 – Traffic isolation using Transit Gateway and defining security zones