Backups and Snapshots – Backup and Restore Strategies – SOA-C02 Study Guide

Backups and Snapshots

Those services that do need to be backed up have a few options. If the service runs on an EBS volume, you can take a point-in-time snapshot of that volume. Snapshots in AWS are incremental: each snapshot captures only the blocks that have changed since the previous snapshot, while the initial snapshot records all blocks of the volume that contain data. Snapshots on some managed services, such as RDS and ElastiCache for Redis, also ensure consistency, but on others, such as EBS volumes attached to EC2 instances, you must create a consistent state on the volume yourself before you take the snapshot.
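
The incremental behavior can be illustrated with a small in-memory model. This is only a sketch of the idea, not how EBS stores snapshots internally: the volume is a hypothetical dictionary of block index to contents, and the function names are invented for the illustration. Deleted blocks are ignored to keep the model short.

```python
from typing import Dict, List, Optional

# Hypothetical model: block index -> contents, holding only blocks with data.
Volume = Dict[int, bytes]


def take_snapshot(volume: Volume, previous_state: Optional[Volume]) -> Volume:
    """Return the blocks that differ from the volume state at the last snapshot."""
    if previous_state is None:
        # Initial snapshot: every block that contains data is recorded.
        return dict(volume)
    return {
        index: data
        for index, data in volume.items()
        if previous_state.get(index) != data
    }


def restore(snapshots: List[Volume]) -> Volume:
    """Rebuild the volume by layering snapshots from oldest to newest."""
    restored: Volume = {}
    for snapshot in snapshots:
        restored.update(snapshot)
    return restored


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
first = take_snapshot(volume, None)
state_at_first = dict(volume)

volume[1] = b"data-v2"    # an existing block changes
volume[3] = b"new-file"   # a block is written for the first time
second = take_snapshot(volume, state_at_first)

print(sorted(first))    # [0, 1, 2]
print(sorted(second))   # [1, 3]
```

Layering the two snapshots with restore([first, second]) reproduces the current volume, even though the second snapshot holds only two blocks.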

The difference between EBS volume snapshots and RDS database snapshots illustrates these two modes of operation. Because the RDS service is fully managed, AWS manages the operating system of the database server instance. When a snapshot request is issued, the RDS service can ensure that all writes are momentarily paused, all data still in memory is committed to disk, and a consistent state is created. This ensures that any snapshot always has usable data on the volume. RDS also allows you to automate backups and direct the service to retain the automated snapshots for up to 35 days. Customers can initiate manual snapshots at any time, and these are retained until they are explicitly deleted.
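
As a rough sketch of how the two RDS options might be driven from code, the functions below wrap the boto3 RDS client calls modify_db_instance and create_db_snapshot. The function names, the instance identifier, and the snapshot identifier are assumptions made for the example; the client is passed in so the logic can be exercised without AWS credentials.

```python
from typing import Any


def set_backup_retention(rds_client: Any, db_instance_id: str, days: int) -> None:
    """Set how long RDS keeps automated backups (0 disables them)."""
    if not 0 <= days <= 35:
        raise ValueError("RDS retains automated backups for 0 to 35 days")
    rds_client.modify_db_instance(
        DBInstanceIdentifier=db_instance_id,
        BackupRetentionPeriod=days,
        ApplyImmediately=True,
    )


def take_manual_snapshot(rds_client: Any, db_instance_id: str, snapshot_id: str) -> None:
    """Request a manual snapshot, which is kept until it is explicitly deleted."""
    rds_client.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=db_instance_id,
    )


# Example usage (requires boto3 and AWS credentials; identifiers are hypothetical):
#   import boto3
#   rds = boto3.client("rds")
#   set_backup_retention(rds, "orders-db", 35)
#   take_manual_snapshot(rds, "orders-db", "orders-db-before-upgrade")
```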

The same is not true for the EBS service. When an EBS volume is attached to an EC2 instance, AWS has no access to the operating system of the instance. The customer is in full control of the instance and must ensure that, before a snapshot of the EBS volume is started, writes to that volume are momentarily paused and all data in memory is committed to disk. This can be done with a script that runs inside the operating system of the EC2 instance.
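
One way such a script might look on a Linux instance is sketched below. It assumes the volume is a data volume mounted at a known mount point, that the util-linux fsfreeze command is available, and that the script runs with sufficient privileges; the function name and parameters are invented for the example. The snapshot is requested through the boto3 EC2 client call create_snapshot. Because an EBS snapshot captures the volume as of the moment the request is made, the file system is thawed as soon as the call returns.

```python
import subprocess
from typing import Any, Callable, List

Runner = Callable[[List[str]], None]


def _run(command: List[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


def consistent_ebs_snapshot(
    ec2_client: Any,
    volume_id: str,
    mount_point: str,
    run: Runner = _run,
) -> str:
    """Flush and freeze the file system, start the snapshot, then thaw."""
    run(["sync"])                                # flush buffered writes to disk
    run(["fsfreeze", "--freeze", mount_point])   # pause new writes
    try:
        response = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"Consistent snapshot of {mount_point}",
        )
    finally:
        # Always thaw, even if the API call fails, so the volume is not left frozen.
        run(["fsfreeze", "--unfreeze", mount_point])
    return response["SnapshotId"]


# Example usage (requires boto3, credentials, and root; values are hypothetical):
#   import boto3
#   consistent_ebs_snapshot(boto3.client("ec2"), "vol-0123456789abcdef0", "/data")
```

Applications that keep their own write buffers, such as databases, usually need an application-level flush or lock in addition to the file system freeze.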

Although snapshots provide a way to back up a volume-based service, using them is not always an option. Some applications require multiple components to be backed up at the same time so that the distributed state of the application is captured exactly. In this case, you can deploy a traditional agent-based backup solution in the operating system or use the AWS Backup service, which we discuss later in this chapter.

Finally, there are those services that do not support volume-based snapshots. Shared file systems like EFS and FSx, DynamoDB tables, and S3 buckets are not accessible via volumes because they are distributed across a set of AWS managed instances. For these services, you need to employ either a built-in strategy or a time-based replication of the service contents to a separate location.

With DynamoDB, the built-in solution is table backups. A table backup is a point-in-time copy of the DynamoDB table. A restore always creates a new table, so recovering specific items means restoring the backup and then copying those items back. Backups can be automated, and point-in-time recovery provides continuous backups that cover up to the last 35 days.
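
A minimal sketch of the DynamoDB options follows, using the boto3 DynamoDB client calls update_continuous_backups, create_backup, and restore_table_from_backup. The wrapper function names and the table and backup names are assumptions for the example. Note that the restore targets a new table name rather than the original table.

```python
from typing import Any


def enable_point_in_time_recovery(dynamodb_client: Any, table_name: str) -> None:
    """Turn on continuous backups with point-in-time recovery for a table."""
    dynamodb_client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


def create_on_demand_backup(dynamodb_client: Any, table_name: str, backup_name: str) -> str:
    """Create an on-demand backup and return its ARN."""
    response = dynamodb_client.create_backup(
        TableName=table_name,
        BackupName=backup_name,
    )
    return response["BackupDetails"]["BackupArn"]


def restore_backup_to_new_table(dynamodb_client: Any, backup_arn: str, target_table_name: str) -> None:
    """Restore a backup into a new table; the source table is left untouched."""
    dynamodb_client.restore_table_from_backup(
        TargetTableName=target_table_name,
        BackupArn=backup_arn,
    )


# Example usage (requires boto3 and AWS credentials; names are hypothetical):
#   import boto3
#   ddb = boto3.client("dynamodb")
#   enable_point_in_time_recovery(ddb, "orders")
#   arn = create_on_demand_backup(ddb, "orders", "orders-before-migration")
#   restore_backup_to_new_table(ddb, arn, "orders-restored")
```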

In contrast, S3 does not have a built-in backup solution. Instead, you can version objects: with versioning enabled, each time an object is overwritten, S3 keeps the previous copy and stores the upload as a complete new version with its own unique version identifier. S3 is also highly durable and can life-cycle data to Glacier, which means that S3 itself can be used as a backup solution. We discuss S3 as a backup solution and versioning later in this chapter.
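
The combination of versioning and life-cycling might be configured roughly as follows, using the boto3 S3 client calls put_bucket_versioning and put_bucket_lifecycle_configuration. The function names, the rule ID, and the 30-day threshold are assumptions chosen for the example, not recommended values.

```python
from typing import Any, Dict


def noncurrent_to_glacier_rule(days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that moves old versions to Glacier."""
    return {
        "Rules": [
            {
                "ID": "archive-old-versions",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},       # empty prefix: all objects
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def enable_versioned_backup_bucket(s3_client: Any, bucket: str, days: int = 30) -> None:
    """Enable versioning, then archive versions that are no longer current."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=noncurrent_to_glacier_rule(days),
    )


# Example usage (requires boto3 and AWS credentials; bucket name is hypothetical):
#   import boto3
#   enable_versioned_backup_bucket(boto3.client("s3"), "example-backup-bucket")
```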

With EFS/FSx, things get a bit trickier. The file system is controlled by the tenant, and the service is managed by AWS. Although AWS ensures that the service is as highly available as possible, it has no access to the contents of the file system and does not guarantee the durability of the data the way it does with S3. This means that you are required to select and maintain a backup strategy for EFS/FSx. One way to back up the file system is to copy its contents to S3. For this, you can use the AWS DataSync service because it can incrementally copy any changes to S3, where you can enable versioning and life-cycling of old versions into Glacier. You can also choose the AWS Backup service to perform backups of the file system, which we discuss later in this chapter.
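
To make the idea of an incremental copy concrete, the sketch below compares two local directories and copies only files that are new or whose contents differ. It is a local stand-in for the concept, not a description of how DataSync works internally, and the function names are invented for the illustration.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    manifest: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def incremental_copy(source: Path, destination: Path) -> List[str]:
    """Copy only new or changed files and return their relative paths."""
    destination.mkdir(parents=True, exist_ok=True)
    source_manifest = build_manifest(source)
    destination_manifest = build_manifest(destination)
    copied: List[str] = []
    for relative, digest in source_manifest.items():
        if destination_manifest.get(relative) != digest:
            target = destination / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source / relative, target)
            copied.append(relative)
    return copied
```

Running incremental_copy twice in a row copies everything on the first pass and nothing on the second, which is the behavior that keeps repeated backup runs cheap.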