CloudWatch and CloudWatch Metrics – SCS-C02 Study Guide

CloudWatch and CloudWatch Metrics

In the previous chapter, you looked at the different types of log files AWS can generate. This chapter will focus on the CloudWatch service. Amazon CloudWatch is the leading monitoring service used in AWS and collects data and metrics from all supported AWS services. It allows you to gain a better understanding of the performance of your environment. CloudWatch lets you collect valuable logging information from many different services, such as EC2 instances and Route 53, and even has the capability to collect and store CloudTrail logs.

In addition, CloudWatch has built-in metric monitoring and reporting capabilities with CloudWatch Metrics. Metrics can be gathered and used in multiple ways, such as creating alarms that notify your security team when certain thresholds are breached (such as too many log-in attempts during a specific time period) or alarms for other groups depending on their needs. Dashboards can also be created to present the metrics graphically, making it easier to visualize what happens with a particular service or metric over a short or extended period.

Finally, the chapter will wrap things up with a review of Amazon EventBridge, the successor to CloudWatch Events. Having an understanding of event-driven architectures (EDAs), especially in the context of security, can help you automate your responses to various events detected by CloudWatch alarms and other AWS services. This leads to faster response times and frees up manual effort for other tasks.

The following main topics will be covered in this chapter:

  • Using and searching CloudWatch Logs
  • The CloudWatch Logs agent
  • Basic metrics provided by the services and creating custom metrics
  • Amazon EventBridge overview
  • EventBridge rules and templates

Technical Requirements

You will need to have access to the AWS Management Console with an active account and AWS CLI access for this chapter.

CloudWatch Overview

Amazon CloudWatch is the de facto AWS native service used to help you monitor your services and resources. While other services may help with monitoring specific tasks such as networking or security, CloudWatch considers the services holistically. The primary function of CloudWatch is to help you monitor and track the performance of your AWS workloads, services, and applications.

When working with your systems, especially during peak periods of traffic, you never know what to expect. You need to have visibility into your overall system along with individual components in case response times become sluggish or components become unresponsive. Applications, and correspondingly their requirements, including security requirements, are becoming more complex. The number of different platforms being used is constantly evolving, and logs are constantly being generated from different sources. Through all of this, you need a way to keep an eye on your systems. These are only a small sample of the performance and security-related issues that the CloudWatch service tackles as it helps you monitor your applications and environment.

For example, consider a company that runs an e-commerce application that also stores sensitive data. As the security engineer, your top priority is to ensure the security of the data stored on the AWS platform and prevent unauthorized access. If you notice an unusual number of failed login attempts from the main site’s login page, your company might be undergoing a brute-force attack. You can use the following CloudWatch features to monitor and proactively address this issue:

  • Monitoring log data: CloudWatch Logs can be set up to capture and centralize log files from your application servers, including login-related events.
  • Log metrics and filters: You can use the Amazon CloudWatch service to filter the log files for failed login attempts and then create custom metrics based on the number of these events. You define a filter pattern that captures entries containing the failed-login keywords.
  • Threshold alarms: After setting up your filters and metrics, you can configure a CloudWatch alarm that triggers if the number of failed login attempts exceeds a certain threshold within a specific time period, for example, 45 failed logins in 5 minutes.
  • Notifications: Although notifications themselves are handled by Amazon Simple Notification Service (SNS), CloudWatch alarms can be coupled with SNS to send out a notification when the threshold has been breached.
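
To make the scenario above more concrete, the following Python sketch builds the request parameters for a metric filter and a matching alarm. It is a minimal illustration under stated assumptions, not a definitive implementation: the log group name, metric namespace, metric name, alarm name, SNS topic ARN, and the "Failed login" phrase are all hypothetical placeholders, and the helper function names are invented for this example. The parameters are intended for the boto3 calls put_metric_filter (CloudWatch Logs) and put_metric_alarm (CloudWatch).

```python
import json

# Hypothetical placeholders; replace with values from your own account.
LOG_GROUP = "/ecommerce/app/auth"
NAMESPACE = "Ecommerce/Security"
METRIC_NAME = "FailedLoginCount"
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:security-alerts"


def build_metric_filter(log_group, namespace, metric_name):
    """Parameters for logs.put_metric_filter.

    Assumes the application writes the phrase "Failed login" to its logs.
    Each matching log event adds 1 to the custom metric.
    """
    return {
        "logGroupName": log_group,
        "filterName": "failed-logins",
        "filterPattern": '"Failed login"',
        "metricTransformations": [
            {
                "metricName": metric_name,
                "metricNamespace": namespace,
                "metricValue": "1",
                "defaultValue": 0,
            }
        ],
    }


def build_alarm(namespace, metric_name, topic_arn, threshold=45, period_seconds=300):
    """Parameters for cloudwatch.put_metric_alarm.

    The alarm fires when the sum of failed logins in one period
    (5 minutes by default) is greater than the threshold.
    """
    return {
        "AlarmName": "failed-logins-high",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],  # SNS topic that notifies the security team
    }


def create_resources(filter_params, alarm_params):
    """Send both requests to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the builders above work without boto3

    boto3.client("logs").put_metric_filter(**filter_params)
    boto3.client("cloudwatch").put_metric_alarm(**alarm_params)


if __name__ == "__main__":
    # Print the parameters for review instead of calling AWS.
    print(json.dumps(build_metric_filter(LOG_GROUP, NAMESPACE, METRIC_NAME), indent=2))
    print(json.dumps(build_alarm(NAMESPACE, METRIC_NAME, SNS_TOPIC_ARN), indent=2))
```

Keeping the parameter builders separate from the function that calls AWS lets you review the filter and alarm definitions before anything is created in your account.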

Figure 8.1: The features of AWS CloudWatch

When monitoring systems, the CloudWatch service either uses alarms to notify team members through SNS or uses the events to trigger automated responses to downstream targets, as you will see later in this chapter. The CloudWatch service consists of four main components: metrics, alarms, logs, and events.
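
As a rough preview of how an alarm can drive an automated response, the sketch below shows an EventBridge-style event pattern that selects CloudWatch alarm state changes entering the ALARM state. The matches function is a deliberately simplified, hypothetical stand-in for EventBridge's own matching logic (it handles only exact-value matching on nested keys), and the sample event is trimmed to the fields used here; the alarm name is a placeholder.

```python
# Event pattern: CloudWatch alarm state changes whose new state is ALARM.
ALARM_PATTERN = {
    "source": ["aws.cloudwatch"],
    "detail-type": ["CloudWatch Alarm State Change"],
    "detail": {"state": {"value": ["ALARM"]}},
}


def matches(pattern, event):
    """Simplified matcher: every key in the pattern must exist in the event.

    A list in the pattern means "the event value must be one of these";
    a dict in the pattern means "match the nested fields recursively".
    Real EventBridge patterns support more operators than this sketch.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Trimmed sample event; "failed-logins-high" is a hypothetical alarm name.
sample_event = {
    "source": "aws.cloudwatch",
    "detail-type": "CloudWatch Alarm State Change",
    "detail": {
        "alarmName": "failed-logins-high",
        "state": {"value": "ALARM"},
    },
}

if __name__ == "__main__":
    print(matches(ALARM_PATTERN, sample_event))
```

A rule with a pattern like this could forward matching events to a downstream target, which is the automated path described in the paragraph above.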