In a distributed environment, the performance you get from the network layer strongly influences the overall performance of your workload. It is thus paramount to select the right network architecture. As with everything else, it all starts from your workload requirements.
Consider your infrastructure requirements:
For network connectivity between AWS and any other central location, such as your own on-premises infrastructure or a third-party hosted infrastructure, you have the choice between using the public internet or a private connection (whether shared or dedicated). You can secure communication over the public internet by running a VPN connection on top of it, but if you need large network bandwidth and consistent throughput, AWS Direct Connect (DX) is the way to go. Whichever option you pick (VPN or DX), make sure to adequately size your network bandwidth to support data transfer in and out of AWS. In the case of VPN, remember that each VPN tunnel is limited to 1.25 Gbps, so if you need more bandwidth, consider using Equal-Cost Multi-Path (ECMP) routing on AWS Transit Gateway (TGW) to set up multiple parallel VPN connections and aggregate their bandwidth. And if you plan to interconnect multiple VPCs, use a fully managed service such as TGW to build a scalable and performant AWS network infrastructure. For more details on network connectivity, please refer to Chapter 2, Designing Networks for Complex Organizations.
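If you script this kind of setup, the following sketch shows what the ECMP approach could look like with the AWS SDK for Python (boto3). The Region, ASNs, customer gateway IP, and number of connections are illustrative assumptions, not prescriptions:

```python
import boto3

# A minimal sketch: create a Transit Gateway with ECMP enabled, then
# attach several parallel Site-to-Site VPN connections whose ~1.25 Gbps
# tunnels can be aggregated by equal-cost multi-path routing.
ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is an assumption

tgw = ec2.create_transit_gateway(
    Description="Hybrid connectivity hub",
    Options={
        "AmazonSideAsn": 64512,            # illustrative private ASN
        "VpnEcmpSupport": "enable",        # required for ECMP over VPN
        "DefaultRouteTableAssociation": "enable",
        "DefaultRouteTablePropagation": "enable",
    },
)["TransitGateway"]["TransitGatewayId"]

# One customer gateway representing the on-premises side
# (hypothetical public IP and ASN; in practice, connections may also
# terminate on several distinct customer gateway devices)
cgw = ec2.create_customer_gateway(
    BgpAsn=65000, PublicIp="203.0.113.10", Type="ipsec.1"
)["CustomerGateway"]["CustomerGatewayId"]

# Multiple VPN connections to the same TGW; with dynamic (BGP) routing,
# ECMP spreads traffic across all tunnels advertising the same prefixes.
for _ in range(4):
    ec2.create_vpn_connection(
        CustomerGatewayId=cgw,
        Type="ipsec.1",
        TransitGatewayId=tgw,
        Options={"StaticRoutesOnly": False},  # BGP is required for ECMP
    )
```

Note that ECMP balances across flows, not within one, so a single flow is still capped by the per-tunnel limit; it is the aggregate bandwidth that grows with the number of connections.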
Then, consider your end users:
If your workload delivers content through a web frontend to end users across multiple geographies, consider leveraging Amazon CloudFront. Firstly, it will lower the latency they experience when accessing the content by caching it at the edge; secondly, it will also reduce the number of requests reaching your origin servers for that content. Even if your end users are not geographically dispersed, they can still benefit from the reduced latency, unless they are located in a country where AWS already operates Regions, AZs, or Local Zones, in which case the latency gain from edge caching is marginal.
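To make this concrete, here is a minimal boto3 sketch of putting a CloudFront distribution in front of an existing origin; the origin domain name and identifiers are illustrative assumptions, and a production configuration would add certificates, cache policies, and logging:

```python
import time
import boto3

# A minimal sketch: create a CloudFront distribution that caches content
# from a hypothetical custom origin at edge locations close to end users.
cloudfront = boto3.client("cloudfront")

distribution = cloudfront.create_distribution(
    DistributionConfig={
        "CallerReference": str(time.time()),  # idempotency token
        "Comment": "Edge caching for the web frontend",
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": "web-origin",
                "DomainName": "origin.example.com",  # hypothetical origin
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    "OriginProtocolPolicy": "https-only",
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": "web-origin",
            "ViewerProtocolPolicy": "redirect-to-https",
            # Legacy-style cache settings, kept minimal for the sketch
            "ForwardedValues": {
                "QueryString": False,
                "Cookies": {"Forward": "none"},
            },
            "MinTTL": 0,
        },
    },
)["Distribution"]

# The distribution's edge-served domain name, to point your DNS at
print(distribution["DomainName"])
```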
When your users are widespread but, for some reason, you prefer not to distribute your workload across the globe and instead run it in a single Region, you have the option to leverage a service such as AWS Global Accelerator to optimize the network path. As illustrated in the following figure, Global Accelerator reduces the number of network hops taken over the public internet by entering the AWS backbone at the edge location closest to the end user and then traversing the backbone to the Region where your workload is deployed. It provides a set of static IP addresses (and yes, you can bring your own if you want to) that serve as entry points for your workload and are announced by multiple edge locations at the same time. Within your AWS environment, you can associate those IP addresses with your resources or endpoints deployed in any AWS Region, such as elastic load balancers (Classic, ALB, or NLB), EC2 instances, or Elastic IP addresses (EIPs). Then, either you let Global Accelerator determine the optimal network path from the edge to a healthy resource, or you define your own custom routes, which can be useful if you need to maintain connectivity between a set of end users and the same group of AWS resources over multiple requests, for instance, for gaming or Voice over IP (VoIP) sessions.
Figure 8.3: Side-by-side visualization: with and without AWS Global Accelerator
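A standard accelerator of this kind could be wired up with boto3 roughly as sketched below; the names, Region, and load balancer ARN are illustrative assumptions (note that the Global Accelerator API itself is served from us-west-2):

```python
import boto3

# A minimal sketch: create an accelerator, a TCP listener, and an endpoint
# group pointing at an existing Application Load Balancer.
ga = boto3.client("globalaccelerator", region_name="us-west-2")

accelerator = ga.create_accelerator(
    Name="frontend-accelerator",  # hypothetical name
    IpAddressType="IPV4",
    Enabled=True,
)["Accelerator"]

listener = ga.create_listener(
    AcceleratorArn=accelerator["AcceleratorArn"],
    Protocol="TCP",
    PortRanges=[{"FromPort": 443, "ToPort": 443}],
)["Listener"]

ga.create_endpoint_group(
    ListenerArn=listener["ListenerArn"],
    EndpointGroupRegion="eu-west-1",  # where the workload runs (assumption)
    EndpointConfigurations=[{
        "EndpointId": "arn:aws:elasticloadbalancing:...",  # your ALB ARN
        "Weight": 128,
    }],
)

# The static IP addresses announced from multiple edge locations
print(accelerator["IpSets"][0]["IpAddresses"])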
Similarly, Amazon Route 53, the AWS DNS service, can also help reduce latency with its built-in latency-based routing mechanism. When you deploy your workload in multiple Regions, latency-based routing is an efficient way to route traffic to the AWS resources presenting the shortest network round-trip time from the end user's location. Route 53 uses the latency measurements it collects for the various Regions and routes each end user's request to your AWS resources, such as elastic load balancers (Classic, ALB, or NLB), EC2 instances, or EIPs, in the Region offering the lowest latency.
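Latency-based routing boils down to creating several record sets with the same name, one per Region, each carrying a Region attribute. A minimal boto3 sketch follows; the hosted zone ID, domain name, and target IP addresses are illustrative assumptions:

```python
import boto3

# A minimal sketch: create two latency-based A records for the same name,
# one per Region. Route 53 answers each query with the record from the
# Region that has the lowest measured latency to the requester.
route53 = boto3.client("route53")

for region, ip in [("us-east-1", "198.51.100.10"),
                   ("eu-west-1", "198.51.100.20")]:
    route53.change_resource_record_sets(
        HostedZoneId="Z0123456789ABCDEFGHIJ",  # hypothetical zone ID
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": f"app-{region}",  # unique per record
                    "Region": region,  # enables latency-based routing
                    "TTL": 60,
                    "ResourceRecords": [{"Value": ip}],
                },
            }],
        },
    )
```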