Monitoring Services in AWS – SOA-C02 Study Guide

This chapter covers the following official AWS Certified SysOps Administrator – Associate (SOA-C02) exam domains:

Domain 1: Monitoring, Logging, and Remediation

Domain 4: Security and Compliance

(For more information on the official AWS Certified SysOps Administrator – Associate [SOA-C02] exam topics, see the Introduction.)

Having a good understanding of what your system is doing at any given moment is crucial to operating that system reliably. Traditionally, monitoring meant establishing a system or network operations center (usually abbreviated as SOC and NOC, respectively). That meant having a room full of screens watched by trained response technicians, who would in turn respond to any anomaly displayed on-screen.

There are two problems with the traditional SOC/NOC design:

Focus primarily on metrics: When you only observe the metrics of the system being monitored, you can usually catch issues only after they have already affected that system.

Human error: Because the monitoring is being done by humans, the platform is highly prone to human errors, such as misidentifying the issue, misinterpreting data, or simply missing the issue entirely.

Most modern monitoring platforms are highly programmable and can therefore be configured to take care of most issues through automation. This approach is sometimes referred to as self-healing.

When designed properly, these systems seldom need to alert anyone that an action must be taken. If a system does get into a condition requiring human intervention (for example, when metrics fall outside the range that self-healing can handle), the platform can send out notifications through many different channels, and the response can be very fast.
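The split between self-healing and human intervention described above can be sketched in a few lines of Python. This is only an illustration, not how any particular monitoring platform implements it: the `Reading` type, the `decide` function, and both threshold values are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """A single metric sample (hypothetical type for this sketch)."""
    name: str
    value: float


# Assumed thresholds, invented for illustration only.
REMEDIATE_AT = 80.0   # at or above this, automation attempts a fix
ESCALATE_AT = 95.0    # at or above this, the issue is outside self-healing scope


def decide(reading: Reading) -> str:
    """Return the action a self-healing monitor might take for one reading."""
    if reading.value >= ESCALATE_AT:
        return "notify"      # human intervention required
    if reading.value >= REMEDIATE_AT:
        return "remediate"   # handled automatically, nobody is alerted
    return "ok"              # nothing to do


for sample in (42.0, 85.0, 99.0):
    print(sample, decide(Reading("cpu_percent", sample)))
```

In this sketch, only the highest reading results in a notification; the middle reading is handled by automation without alerting anyone, which mirrors the behavior the text describes.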

Modern platforms also provide the capability to track and analyze logs, which enables you to maintain the compliance and security of your system and to perform remediation preemptively, before an issue has any impact on the application.
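As a rough illustration of how log analysis can support security, the following Python sketch counts failed login attempts per source address and flags sources that reach a threshold. The `key=value` log format, the field names (`event`, `result`, `src`), the function name, and the threshold are all assumptions made for this example; real log formats differ.

```python
from collections import Counter


def failed_logins_by_source(lines, threshold=3):
    """Return sources with at least `threshold` failed logins.

    Assumes a hypothetical log format of space-separated key=value pairs.
    """
    counts = Counter()
    for line in lines:
        # Parse "key=value" tokens; tokens without "=" are ignored.
        fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
        if fields.get("event") == "login" and fields.get("result") == "failure":
            counts[fields.get("src", "unknown")] += 1
    return {src: n for src, n in counts.items() if n >= threshold}


# Invented sample data for illustration.
sample_log = [
    "ts=1 event=login result=failure src=10.0.0.5",
    "ts=2 event=login result=failure src=10.0.0.5",
    "ts=3 event=login result=failure src=10.0.0.5",
    "ts=4 event=login result=success src=10.0.0.9",
    "ts=5 event=login result=failure src=10.0.0.7",
]
print(failed_logins_by_source(sample_log))
```

A result like this could then feed an alert or an automated action, such as blocking the offending source, before the application is affected.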

Metering, Monitoring, and Alerting

This section covers the following official AWS Certified SysOps Administrator – Associate (SOA-C02) exam domains:

Domain 1: Monitoring, Logging, and Remediation

Domain 4: Security and Compliance

CramSaver

If you can correctly answer these questions before going through this section, save time by skimming the Exam Alerts in this section and then completing the Cram Quiz at the end of the section.

1. What are some of the characteristics of modern monitoring platforms?

2. Which factors would you consider when assessing a monitoring system’s capability to increase security and maintain compliance of your application?

Answers

1. Answer: The capability to meter and collect metrics and logs; the capability to view, graph, and analyze the metrics and log captures; and the capability to trigger alerts, send notifications, trigger actions, and interact with other systems.

2. Answer: Any monitoring platform should have the capability to capture and analyze logs from which you can extract information on login attempts, network access sources and targets, actions being performed within the application, and so on.

We like to tell students, customers, and peers that operating an environment without monitoring is like flying an airplane with sunglasses on, at night, without instrumentation, and without autopilot. You might be able to estimate what is going on outside and keep flying; however, eventually you are going to crash.

A good understanding of the state of your application is crucial. Not only do you get information on the current performance of the application, but metering, monitoring, and alerting also should be considered essential tools in your troubleshooting, remediation, and security practices. Additionally, you learn how your cloud resources are being used, which enables cost optimization as well.