When enabling your VPC Flow Logs to capture the traffic, you have an immediate choice of where to store the logs. The industry that your organization is in, along with the types of compliance regulations you need to follow, should play a part in this decision.
There are pros and cons to each storage system. This section first covers storing logs in S3 from the viewpoint of a security professional. As a security professional, you are often tasked, directly or indirectly, with gathering system logs and ensuring that they are ready for an audit. These audits may happen on a regular basis, such as an annual security audit, or they may occur at random. Depending on the industry you work in, you may be subject to data retention regulations that require storing that log data for 1, 2, 6, or even 10 or more years. This is exactly the scenario where storing the log files on the S3 service comes into play. Several characteristics have made S3 a popular service for over a decade, such as its eleven 9s of durability, low-cost storage, and lifecycle policies that move files seamlessly from one storage class to another so that cost savings can be maximized. Because of these characteristics, S3 works well for large files and large quantities of files, such as the log files that you collect.
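As an illustration of the lifecycle point, a rule such as the following sketch transitions log objects to cheaper storage classes as they age and expires them after a ten-year retention window. The prefix and the transition day counts here are assumptions for illustration; tune them to your own retention requirements:

```json
{
  "Rules": [
    {
      "ID": "vpc-flow-log-retention",
      "Filter": { "Prefix": "vpc-flow-logs/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 3650 }
    }
  ]
}
```

Saved as a file, a configuration like this can be applied to a bucket with the aws s3api put-bucket-lifecycle-configuration command.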
The alternative option is to use CloudWatch Logs. This is often a much easier option, as it can be a one-click setup with numerous AWS services, and logs are automatically grouped by day, week, month, and year. However, the ease of CloudWatch Logs comes with a set of limitations, including the fact that using it for extended storage can become a cost burden on your organization.
Further, while CloudWatch Logs does have native search functionality built in, when you need to perform more sophisticated queries, CloudWatch starts to reach the limits of its usefulness.
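To give a sense of that native search capability, CloudWatch Logs Insights accepts queries like the following sketch, which filters a flow log group for rejected traffic (this assumes the flow logs are being delivered to CloudWatch Logs, where fields such as srcAddr, dstAddr, and action are discovered automatically):

```
fields @timestamp, srcAddr, dstAddr, action
| filter action = "REJECT"
| sort @timestamp desc
| limit 20
```

Queries of this shape cover day-to-day triage well; it is the heavier correlation and long-window analysis work where the S3-based tooling tends to take over.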
Figure 7.3: Exporting CloudWatch Logs to an S3 bucket via Kinesis Data Firehose
The following section teaches you how to enable VPC Flow Logs on your account so that you can start to examine them.
Before you can parse or read any VPC Flow Logs, you must enable them on one of your VPCs. You could use the AWS Management Console or a software development kit (SDK), but for this example, use the AWS CLI:
aws s3 mb s3://packt-security-logs
aws ec2 describe-vpcs --region us-east-2 --output text
This gives you the output you are looking for in a compact form, the VPC ID (in the form vpc-x00x000), along with some other information, such as an IsDefault field that is True for the default VPC. With the bucket name and the VPC ID, you can turn on the VPC Flow Logs.
aws ec2 describe-flow-logs --region us-east-2
aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids vpc-f80e0490 \
--traffic-type ALL \
--log-destination-type s3 \
--log-destination arn:aws:s3:::packt-security-logs/vpc-flow-logs/ \
--max-aggregation-interval 60 \
--region us-east-2
And with that, your VPC Flow Logs have been turned on. You will have received a FlowLogIds value and a ClientToken back on the command line, signaling that the command was successful.
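For reference, a successful CreateFlowLogs response looks roughly like the following (the flow log ID and token shown here are placeholders):

```json
{
    "ClientToken": "<token>",
    "FlowLogIds": [
        "fl-0123456789abcdef0"
    ],
    "Unsuccessful": []
}
```

An empty Unsuccessful list confirms that the flow log was created for every resource ID you passed.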
One thing of significance to point out in the command is the --traffic-type ALL flag. When you are capturing VPC Flow Logs, you have three choices for the traffic you are capturing:
- ACCEPT: Captures only the traffic that your security groups and network ACLs permitted
- REJECT: Captures only the traffic that was blocked
- ALL: Captures both accepted and rejected traffic
Now that your logs are turned on, you can learn where to retrieve them and examine the values contained in the log files.
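As a preview of that examination step, each record in a delivered log file is a space-separated line. The following sketch pulls out the fields a security review typically starts with; the record itself is a made-up sample, so the account ID, ENI, and addresses are placeholders:

```shell
# A sample VPC Flow Log record in the default (version 2) format:
# version account-id interface-id srcaddr dstaddr srcport dstport
# protocol packets bytes start end action log-status
record="2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 443 49152 6 10 840 1620000000 1620000060 ACCEPT OK"

# Pull out the source address (field 4), destination address (field 5),
# and the action taken (field 13)
echo "$record" | awk '{print "src="$4, "dst="$5, "action="$13}'
# → src=10.0.1.5 dst=10.0.2.9 action=ACCEPT
```

The same awk pattern scales to whole files once you have copied them down from the bucket.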