Storage Selection – Meeting Performance Objectives – SAP-C02 Study Guide

Storage Selection

Selecting the optimal storage for your solution depends on multiple factors. First, how do you plan to access the storage? Do you require block-level, file-level, or object-level access? Have you identified the storage access patterns your solution requires (random or sequential)? How much throughput do you need? What is the frequency of access (online, offline, archival)? What update pattern is required? Is it Write-Once-Read-Many (WORM), or is it more dynamic? Are there any availability and durability constraints?

There are multiple types of storage solutions available on AWS, so selecting the right storage solution for the right usage is paramount to optimizing performance. And, most likely, your solution will leverage multiple storage solutions according to its needs.

The following presents a brief overview of the various storage options on AWS.

Amazon Elastic Block Store (EBS) delivers block-level storage for EC2 instances. It offers storage volumes backed either by Solid State Drives (SSDs), for very low-latency and Input/Output Operations Per Second (IOPS)-intensive workloads, or Hard Disk Drives (HDDs), for throughput-intensive workloads. EBS volumes can be attached to an EC2 instance and then used to support a filesystem, a database, or anything else that can make use of a block device (think of it as a hard drive). EBS volumes come in different flavors, and it’s important to understand their characteristics before choosing one for your workload:

  • General Purpose SSD volumes (gp2, gp3) provide a good balance between price and performance for many workloads. They are your best bet to get started when your workload does not have a behavior that directly points to one of the other types mentioned in this list. They can be used for anything really, from filesystems to databases.
  • Provisioned IOPS SSD volumes (io1, io2, io2 Block Express) are designed for I/O-intensive workloads. They provide consistent IOPS and can predictably scale to tens of thousands of IOPS per volume.
  • Throughput Optimized HDD volumes (st1) provide low-cost magnetic storage for throughput-intensive workloads. This makes st1 volumes ideal for workloads accessing large volumes of data sequentially, such as big data, Extract, Transform, and Load (ETL), data warehouses, and log processing.
  • Cold HDD volumes (sc1) offer even lower-cost magnetic storage for throughput-intensive workloads manipulating cold data. Providing a lower throughput than st1 volumes, sc1 volumes are ideal for workloads requiring infrequent access to large volumes of data sequentially when keeping costs low is the driving factor.
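To make the General Purpose options above a little more concrete: a gp2 volume's baseline IOPS scales with its size (3 IOPS per GiB, floored at 100 and capped at 16,000 IOPS), whereas gp3 starts from a fixed 3,000 IOPS baseline that can be provisioned upward independently of volume size. A minimal sketch of the gp2 sizing rule (the function name is illustrative, not an AWS API):

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume of the given size.

    gp2 volumes earn 3 IOPS per GiB of provisioned storage,
    with a floor of 100 IOPS and a ceiling of 16,000 IOPS.
    """
    return min(max(3 * size_gib, 100), 16_000)

print(gp2_baseline_iops(100))    # 300
print(gp2_baseline_iops(20))     # 100 (floor applies)
print(gp2_baseline_iops(8_000))  # 16000 (ceiling applies)
```

This is one reason a move from gp2 to gp3 can pay off: a small gp2 volume must grow just to earn IOPS, while a gp3 volume can have IOPS provisioned directly.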

As with all things on AWS, you should monitor the performance of your workload and the components it uses. Leverage EBS metrics such as VolumeTotalReadTime, VolumeTotalWriteTime, VolumeReadOps, and VolumeWriteOps, as well as VolumeQueueLength, to understand whether your workload suffers from storage latency. Check VolumeReadOps and VolumeWriteOps to make sure you stay below the IOPS limit of your EBS volume. These metrics should give you an initial understanding of whether you are making the most of your EBS volume’s performance. If you are too close to your IOPS limit or, worse, get throttled because you reach it repeatedly, you either need to limit the IOPS of your workload or increase your volume’s IOPS limit. The latter requires you to increase the EBS volume size (unless you have already reached the maximum IOPS supported by your specific EBS volume type) or to change the EBS volume type, for instance, from gp2 to gp3, or even to io2, depending on the situation.
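Because VolumeReadOps and VolumeWriteOps report the number of operations completed in a CloudWatch period, dividing their Sum statistics by the period length gives the average IOPS over that window, which you can then compare against the volume's provisioned IOPS. A small sketch of that arithmetic (function names and sample numbers are hypothetical, not part of any AWS API):

```python
def average_iops(read_ops_sum: float, write_ops_sum: float,
                 period_seconds: int) -> float:
    # Sum of VolumeReadOps + VolumeWriteOps over a period,
    # divided by the period length, yields average IOPS.
    return (read_ops_sum + write_ops_sum) / period_seconds

def iops_headroom(avg_iops: float, provisioned_iops: int) -> float:
    # Fraction of the volume's IOPS limit still unused;
    # values near 0.0 suggest a throttling risk.
    return 1.0 - (avg_iops / provisioned_iops)

# Hypothetical 5-minute window for a gp3 volume provisioned at 3,000 IOPS:
avg = average_iops(read_ops_sum=450_000, write_ops_sum=270_000,
                   period_seconds=300)
print(avg)                         # 2400.0 average IOPS
print(iops_headroom(avg, 3_000))   # ~0.2, i.e. only 20% headroom left
```

A sustained headroom this thin is the kind of signal that should prompt the remediation discussed above: reduce the workload's IOPS or raise the volume's IOPS limit.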

Likewise, monitor VolumeReadBytes and VolumeWriteBytes to make sure that you get enough throughput from your EBS volumes, and understand whether you might benefit from either a larger volume or a different type with higher throughput.
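The same period arithmetic applies to throughput: summing VolumeReadBytes and VolumeWriteBytes over a CloudWatch period and dividing by the period length gives average throughput, which you can compare against your volume type's throughput limit. A minimal sketch, with hypothetical sample numbers:

```python
def average_throughput_mib_s(read_bytes_sum: float, write_bytes_sum: float,
                             period_seconds: int) -> float:
    # Sum of VolumeReadBytes + VolumeWriteBytes over a period,
    # divided by the period length and converted to MiB/s.
    return (read_bytes_sum + write_bytes_sum) / period_seconds / (1024 * 1024)

# Hypothetical 5-minute window on a throughput-oriented volume:
# 30 GiB read and 15 GiB written over 300 seconds.
mib_s = average_throughput_mib_s(read_bytes_sum=30 * 1024**3,
                                 write_bytes_sum=15 * 1024**3,
                                 period_seconds=300)
print(round(mib_s, 1))  # 153.6 MiB/s
```

If the observed figure sits persistently near your volume's throughput ceiling, that is the cue to consider a larger volume or a volume type with higher throughput.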