Many AWS services record user access and export the data as log files that you can process with backend services. These log files allow you to monitor usage patterns, collect baselines, troubleshoot, and validate compliance with any security or regulatory mandates your organization operates under. Analytic services can search your access logs to give you valuable insights into your operations. In this section, we will discuss the access logs that are specific to networking, including load balancing, content distribution, and DNS services.
ELB access logs record and store data for every HTTP/HTTPS and TCP request processed by your elastic load balancer. The logs are in plaintext format, with one line per log entry, and the collected data is combined from all availability zones that the ELB is configured for. Access logging is disabled by default; you can enable or disable it at any time in the ELB console, with the CLI, or with an API call. When configuring access logging, you specify the S3 bucket that stores the logs and the desired prefix in the bucket. Once the data is collected, you can use analytics services, standard log search tools, or third-party vendor applications to gain insights into your ELB access data. Data analysis may include the source IP addresses, target server response metrics, and traffic flow from source to target and the return flow.
If you have a very busy site, remember that a large amount of logging data can be created and that there will be associated charges in S3 to store it. The logs are collected and delivered to the S3 bucket at specified time intervals. If a log grows too large before the time interval is complete, multiple logs are generated, so busy sites may produce several log files for each interval. It is a good idea to configure log retention time frames and policies; for example, use storage life-cycle rules to migrate older logs to the lower-cost Glacier storage service.
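The retention policy described above can be sketched as a small Python helper that builds an S3 life-cycle configuration. This is only an illustration: the prefix `elb-logs/`, the bucket name in the comment, the rule ID, and the 30-day and 365-day thresholds are assumptions, not values from this chapter. The dictionary follows the shape that the S3 life-cycle API accepts, and the sketch only builds and prints it rather than calling AWS.

```python
import json


def build_log_lifecycle(prefix, glacier_after_days=30, expire_after_days=365):
    """Build a life-cycle configuration for log objects stored under `prefix`.

    The day counts are hypothetical defaults; choose values that match
    your own retention requirements.
    """
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-elb-access-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move older logs to the lower-cost Glacier storage class.
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                # Delete logs once the retention period has passed.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = build_log_lifecycle("elb-logs/")
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the configuration could
# be applied to a (hypothetical) bucket like this:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-elb-log-bucket", LifecycleConfiguration=config)
```

Keeping the rule in a function makes it easy to review the thresholds before anything is applied to a real bucket.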
The logging is useful for setting baselines of your deployment and for identifying impairments or bottlenecks along the flow path, such as an EC2 web server that is responding slowly.
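As a rough illustration of spotting a slow backend, the following Python sketch averages the backend processing time per server from a few log lines. It assumes the space-delimited Classic Load Balancer entry layout, in which the fourth field is the backend address and the sixth is the backend processing time in seconds. The sample entries, the addresses, and the function name `average_backend_time` are made up for the example.

```python
import shlex
from collections import defaultdict

# Illustrative entries in the Classic Load Balancer layout (not real traffic).
SAMPLE_LOG = """\
2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -
2015-05-13T23:39:44.100000Z my-loadbalancer 192.168.131.40:2818 10.0.0.2:80 0.000080 1.500000 0.000060 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -
2015-05-13T23:39:45.200000Z my-loadbalancer 192.168.131.41:2819 10.0.0.2:80 0.000075 2.500000 0.000058 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -
"""


def average_backend_time(log_text):
    """Return the mean backend processing time (seconds) per backend address."""
    totals = defaultdict(lambda: [0.0, 0])  # backend -> [sum of times, count]
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # shlex keeps the quoted request and user-agent fields together.
        fields = shlex.split(line)
        backend = fields[3]
        backend_time = float(fields[5])
        if backend_time < 0:
            # A value of -1 marks a request that was never dispatched.
            continue
        totals[backend][0] += backend_time
        totals[backend][1] += 1
    return {backend: total / count for backend, (total, count) in totals.items()}


averages = average_backend_time(SAMPLE_LOG)
slowest = max(averages, key=averages.get)
print(slowest, round(averages[slowest], 3))
```

Application and network load balancers use different field layouts, so the field positions would need adjusting for those log types.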
Logs remain in S3 for as long as your data retention requirements dictate, and it is a best practice to use S3 life-cycle rules to manage and reduce storage costs. The log file name contains the AWS account ID, the load balancer’s name and region, the IP address of the load balancer node, a timestamp marking the end of the logging interval, and a random string (to distinguish multiple log files for the same time interval); the files are stored under a date-based path in the form YYYY/MM/DD. Classic load balancers publish logs at an interval of either 5 or 60 minutes (60 is the default), while application and network load balancers publish every 5 minutes.
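To make the naming convention concrete, here is a minimal Python sketch that splits a log object key into its parts. It assumes the Classic Load Balancer file-name layout (account ID, the literal `elasticloadbalancing`, region, load balancer name, end time, node IP address, random string); the bucket prefix, account ID, and other values in the sample key are placeholders, and `parse_log_name` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Assumed Classic Load Balancer file-name layout:
# account_elasticloadbalancing_region_name_end-time_ip-address_random.log
LOG_NAME = re.compile(
    r"^(?P<account>\d{12})_elasticloadbalancing_(?P<region>[a-z0-9-]+)_"
    r"(?P<name>[^_]+)_(?P<end>\d{8}T\d{4}Z)_(?P<ip>[\d.]+)_(?P<random>[^_.]+)\.log$"
)


def parse_log_name(key):
    """Split an S3 object key for an access log into its named parts."""
    filename = key.rsplit("/", 1)[-1]  # drop the prefix and date-based path
    match = LOG_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized log file name: {filename}")
    info = match.groupdict()
    # The timestamp marks the end of the logging interval, in UTC.
    info["end"] = datetime.strptime(info["end"], "%Y%m%dT%H%MZ").replace(
        tzinfo=timezone.utc
    )
    return info


key = (
    "my-app/AWSLogs/123456789012/elasticloadbalancing/us-west-2/2014/02/15/"
    "123456789012_elasticloadbalancing_us-west-2_my-loadbalancer_"
    "20140215T2340Z_172.160.001.192_20sg8hgm.log"
)
info = parse_log_name(key)
print(info["name"], info["region"], info["end"].isoformat())
```

Other load balancer types use a somewhat different file name (for example, a compressed `.log.gz` suffix), so the pattern would need to be adapted for them.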
See Figure 5.9 for the CloudWatch log displays for a network load balancer.