Why Choose an S3 Bucket over CloudWatch Logs? – Logging and Monitoring – SCS-C02 Study Guide

Why Choose an S3 Bucket over CloudWatch Logs?

When enabling your VPC Flow Logs to capture the traffic, you have an immediate choice of where to store the logs. The industry that your organization is in, along with the types of compliance regulations you need to follow, should play a part in this decision.

There are pros and cons to each storage system. This section first covers storing logs in S3 from the viewpoint of a security professional. As a security professional, you are often tasked, directly or indirectly, with gathering system logs and ensuring that they are ready for an audit. These audits may happen on a regular basis, such as an annual security audit, or they may occur randomly. Depending on your industry, data retention regulations may require you to store that log data for 1, 2, 6, or even 10 or more years. This is exactly the scenario where storing the log files in S3 comes into play. Several characteristics have made S3 a popular service for well over a decade: 11 9s (99.999999999%) of durability, low-cost storage, and lifecycle policies that move objects seamlessly from one storage class to another so that cost savings can be maximized. Because of these characteristics, S3 works well for large files and large quantities of files, such as the log files that you collect.
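As a rough illustration of the lifecycle policies mentioned above, the following sketch writes a lifecycle configuration that moves log objects to a colder storage class and later expires them. The function name write_lifecycle_config, the rule ID, the vpc-flow-logs/ prefix, the file name, and the day counts are all assumptions chosen for illustration; none of them is required by AWS or by any particular regulation.

```shell
# Sketch: write an S3 lifecycle configuration for long-term log retention.
# The rule ID, prefix, and day counts are illustrative assumptions.
write_lifecycle_config() {
  # $1 = output file, $2 = days before moving to Glacier, $3 = days before expiry
  local out="$1" glacier_days="$2" expire_days="$3"
  cat > "$out" <<EOF
{
  "Rules": [
    {
      "ID": "archive-then-expire-flow-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "vpc-flow-logs/" },
      "Transitions": [
        { "Days": ${glacier_days}, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": ${expire_days} }
    }
  ]
}
EOF
}

# Example: archive after 90 days, delete after roughly 6 years (2190 days).
# write_lifecycle_config lifecycle.json 90 2190
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket packt-security-logs \
#   --lifecycle-configuration file://lifecycle.json
```

The commented commands show how the generated file could be applied to a bucket; pick day counts that match your own retention requirements before doing so.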

The alternative is to use CloudWatch Logs. This is often a much easier option, as it can be a one-click setup with numerous AWS services, and it automatically organizes log events into log groups and log streams that you can filter by time range. However, the ease of CloudWatch Logs comes with a set of limitations, including the fact that using CloudWatch Logs for extended storage can become a cost burden on your organization.

Further, while CloudWatch Logs has native search functionality built in, it starts to reach the limits of its usefulness when you need to perform more sophisticated queries.

Figure 7.3: Exporting CloudWatch Logs to an S3 bucket via Kinesis Data Firehose

The following section teaches you how to enable VPC Flow Logs on your account so that you can start to examine them.

Enabling VPC Flow Logs

Before you can parse or read any VPC Flow Logs, you must enable them on one of your VPCs. You could use the AWS Management Console or a software development kit (SDK), but for this example, use the AWS CLI:

  1. Open your terminal window to access your CLI.
  2. Start by making a bucket for your logs to be saved into. Remember, your bucket name needs to be unique across all of AWS, and the bucket name shown in the following example (packt-security-logs) must be replaced with your bucket name. (NOTE: If you made a bucket for logging your access logs in the previous exercise, you can skip this step and use the same bucket.)

aws s3 mb s3://packt-security-logs

  3. With a place to store your logs, you now need to find the ID of your default VPC. Do this with the following command:

aws ec2 describe-vpcs --region us-east-2 --output text

This gives you the output in a compact form. Look for the VPC ID, which has the form vpc-x00x000, alongside other information such as the IsDefault value of True. With the bucket name and the VPC ID, you can turn on the VPC Flow Logs.
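If you would rather not pick the VPC ID out of the full text output by eye, the lookup can be narrowed with a filter and a query. The following is a minimal sketch, not the only way to do it: the function names is_vpc_id and get_default_vpc_id are hypothetical, and REGION is assumed to be the region used in this exercise.

```shell
# Sketch: print only the ID of the default VPC.
# REGION is an assumption; change it to the region you are working in.
REGION="us-east-2"

# Succeeds if $1 looks like a VPC ID (vpc- plus 8 or 17 hex characters).
is_vpc_id() {
  printf '%s' "$1" | grep -Eq '^vpc-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Prints the default VPC ID for REGION, or fails if none is found.
get_default_vpc_id() {
  local vpc_id
  vpc_id="$(aws ec2 describe-vpcs \
    --region "$REGION" \
    --filters Name=is-default,Values=true \
    --query 'Vpcs[0].VpcId' \
    --output text)" || return 1
  # With no default VPC the query yields "None", which fails this check.
  is_vpc_id "$vpc_id" || return 1
  printf '%s\n' "$vpc_id"
}

# Example: VPC_ID="$(get_default_vpc_id)"
```

Validating the result before using it guards against passing an empty or placeholder value to later commands.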

  4. But first, check to make sure that they are not already on with the following command:

aws ec2 describe-flow-logs --region us-east-2

  5. If the query returns no existing flow logs, you are ready to turn them on. Do that with this command:

aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-f80e0490 \
  --traffic-type ALL \
  --log-destination-type s3 \
  --log-destination arn:aws:s3:::packt-security-logs/vpc-flow-logs/ \
  --max-aggregation-interval 60 \
  --region us-east-2

And with that, your VPC Flow Logs have been turned on. The command returns a FlowLogIds list and a ClientToken value on the command line, along with an empty Unsuccessful list, signaling that the command was successful.
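The check-then-create steps above can also be combined so that the create command only runs when the VPC has no flow logs yet. The following is a sketch under stated assumptions: the function names flow_log_count and enable_flow_logs are hypothetical, and REGION and LOG_DESTINATION simply repeat the example values from this exercise and must be replaced with your own.

```shell
# Sketch: enable VPC Flow Logs to S3 only if the VPC has none yet.
# REGION and LOG_DESTINATION repeat the example values; replace with your own.
REGION="us-east-2"
LOG_DESTINATION="arn:aws:s3:::packt-security-logs/vpc-flow-logs/"

# Prints the number of flow logs already attached to the VPC given as $1.
flow_log_count() {
  aws ec2 describe-flow-logs \
    --region "$REGION" \
    --filter "Name=resource-id,Values=$1" \
    --query 'length(FlowLogs)' \
    --output text
}

# Creates a flow log for the VPC given as $1 unless one already exists.
enable_flow_logs() {
  local count
  count="$(flow_log_count "$1")" || return 1
  if [ "$count" -gt 0 ]; then
    echo "Flow logs already enabled for $1"
    return 0
  fi
  aws ec2 create-flow-logs \
    --resource-type VPC \
    --resource-ids "$1" \
    --traffic-type ALL \
    --log-destination-type s3 \
    --log-destination "$LOG_DESTINATION" \
    --max-aggregation-interval 60 \
    --region "$REGION"
}

# Example: enable_flow_logs vpc-f80e0490
```

Wrapping the commands this way makes the step safe to rerun, since a second call reports the existing flow log instead of creating another one.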

One significant detail in the command is the --traffic-type ALL flag. When you are capturing VPC Flow Logs, you have three choices for the traffic you are capturing:

  • The ALL option captures both accepted and rejected traffic in the logs.
  • The REJECT option captures only traffic that was rejected by your security groups or network ACLs (e.g., people and bots probing ports that aren’t open).
  • The ACCEPT option captures only the traffic accepted on the network and passed on from source to destination.
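
To make the difference between the three traffic types concrete, the sketch below filters flow log records by their action field. It assumes the default record format, in which the action (ACCEPT or REJECT) is the 13th space-separated field; the function name filter_by_action is hypothetical, and the file name in the usage comment is a placeholder.

```shell
# Sketch: keep only flow log records with a given action.
# Assumes the default record format, where the action (ACCEPT or REJECT)
# is the 13th space-separated field of each record.
filter_by_action() {
  # $1 = ACCEPT or REJECT; records are read from standard input
  awk -v action="$1" '$13 == action'
}

# Example with a downloaded, gzip-compressed log file (placeholder name):
# gunzip -c flow-log-file.log.gz | filter_by_action REJECT
```

A flow log created with ALL contains both kinds of records, so a filter like this recovers the REJECT-only or ACCEPT-only view after the fact, whereas a flow log created with REJECT or ACCEPT never records the other kind.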

Now that your logs are turned on, you can learn where to retrieve them and examine the values contained in the log files.