In the previous chapter, you looked at the different types of log files AWS can generate. This chapter will focus on the CloudWatch service. Amazon CloudWatch is the leading monitoring service used in AWS and collects data and metrics from all supported AWS services. It allows you to gain a better understanding of the performance of your environment. CloudWatch lets you collect valuable logging information from many different services, such as EC2 instances and Route 53, and even has the capability to collect and store CloudTrail logs.
In addition, CloudWatch has built-in metric monitoring and reporting capabilities with CloudWatch Metrics. Metrics can be gathered and used in multiple ways, such as creating alarms that notify your security team when certain thresholds are breached (such as too many log-in attempts during a specific time period) or alarms for other groups depending on their needs. Dashboards can also be created to present the metrics graphically, which makes it easier to visualize what happens with a particular service or metric over a short or extended period.
Finally, the chapter will wrap things up with a review of Amazon EventBridge, the successor to CloudWatch Events. Having an understanding of event-driven architectures (EDAs), especially in the context of security, can help you automate your responses to various events detected by CloudWatch alarms and other AWS services. This leads to faster response times and frees your team to focus manual effort on other tasks.
The following main topics will be covered in this chapter:
You will need to have access to the AWS Management Console with an active account and AWS CLI access for this chapter.
Amazon CloudWatch is the de facto AWS native service used to help you monitor your services and resources. While other services may help with monitoring specific tasks such as networking or security, CloudWatch considers the services holistically. The primary function of CloudWatch is to help you monitor and track the performance of your AWS workloads, services, and applications.
When working with your systems, especially during peak periods of traffic, you never know what to expect. You need to have visibility into your overall system along with its individual components in case response times become sluggish or a component becomes unresponsive. Applications, and correspondingly their requirements, including security requirements, are becoming more complex. The number of different platforms being used is constantly evolving, and logs are constantly being generated from different sources. Through all of this, you need a way to keep an eye on your systems. These are only a small sample of the performance and security-related issues that the CloudWatch service tackles as it helps you monitor your applications and environment.
For example, consider a company that runs an e-commerce application that also stores sensitive data. As the security engineer, your top priority is to ensure the security of the data stored on the AWS platform and prevent unauthorized access. If you notice an unusual number of failed login attempts from the main site’s login page, your company might be undergoing a brute-force attack. You can use the following features of CloudWatch to monitor for this issue and address it proactively:
Figure 8.1: The features of Amazon CloudWatch
When monitoring systems, the CloudWatch service either uses alarms to notify team members through SNS or uses the events to trigger automated responses to downstream targets, as you will see later in this chapter. The CloudWatch service consists of four main components: metrics, alarms, logs, and events.
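To make the brute-force scenario more concrete, the following is a minimal sketch, in Python, of how an alarm on failed logins that notifies an SNS topic might be defined. It assumes the application already publishes a custom metric; the alarm name, the namespace ECommerce/Auth, the metric name FailedLoginAttempts, the default threshold, and the helper function names are all hypothetical and chosen only for illustration. The dictionary keys are parameters of the CloudWatch PutMetricAlarm API, and the optional create_alarm helper assumes the boto3 SDK is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def build_failed_login_alarm(sns_topic_arn: str, threshold: int = 50) -> Dict[str, Any]:
    """Build the keyword arguments for a CloudWatch PutMetricAlarm call.

    All names and values below are illustrative assumptions, not values
    taken from the chapter.
    """
    return {
        "AlarmName": "too-many-failed-logins",   # hypothetical alarm name
        "Namespace": "ECommerce/Auth",           # hypothetical custom namespace
        "MetricName": "FailedLoginAttempts",     # hypothetical custom metric
        "Statistic": "Sum",                      # total failures per period
        "Period": 300,                           # one period = 5 minutes (seconds)
        "EvaluationPeriods": 1,                  # alarm after one breaching period
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",      # no data means no failed logins
        "AlarmActions": [sns_topic_arn],         # SNS topic that notifies the team
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch (requires boto3 and credentials)."""
    import boto3  # third-party AWS SDK, imported lazily so the sketch runs without it

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling build_failed_login_alarm with the ARN of your security team's SNS topic returns the alarm definition, which you can inspect before passing it to create_alarm. Keeping the definition separate from the API call is a design choice made here so that the thresholds can be reviewed or tested without touching an AWS account.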