Important note – AWS Services for Data Storage – MLS-C01 Study Guide

Important note

As per Amazon’s docs, S3 provides strong read-after-write consistency for PUT and DELETE requests on objects. This means that if you upload a new object, overwrite an existing object, or delete an object, and then immediately read it using its key, you get the result of the latest successful write: the exact data that you just uploaded or, after a delete, an error saying that the object no longer exists. Before December 2020, only PUTs of new objects were read-after-write consistent; overwrites and deletes were eventually consistent, so a read straight after one of those operations could return an old copy or a stale version of the object. S3 Standard still replicates the content of each object across at least three Availability Zones, but this no longer affects what a subsequent read returns.

A folder structure can be maintained logically by using a prefix. Take an example where an image is uploaded into a bucket, bucket-name-example, with the prefix folder-name and the object name my-image.jpg. The entire structure looks like this: /bucket-name-example/folder-name/my-image.jpg.

The content of the object can be read by using the bucket name bucket-name-example and the key folder-name/my-image.jpg. Note that the key has no leading slash: /folder-name/my-image.jpg would be treated as a different key.
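The relationship between the full path, the bucket name, the key, and the prefix can be sketched in a few lines of Python. This is only an illustration of the naming scheme: the helper names split_s3_path and key_prefix are hypothetical and not part of any AWS SDK, and the sketch assumes that keys carry no leading slash.

```python
def split_s3_path(path: str) -> tuple[str, str]:
    """Split '/bucket/prefix/object' into (bucket, key).

    Hypothetical helper for illustration; the key is returned
    without a leading slash.
    """
    bucket, _, key = path.lstrip("/").partition("/")
    if not bucket or not key:
        raise ValueError(f"expected '/bucket/key', got {path!r}")
    return bucket, key


def key_prefix(key: str) -> str:
    """Return the logical 'folder' part of a key, including the trailing slash."""
    head, sep, _ = key.rpartition("/")
    return head + sep


bucket, key = split_s3_path("/bucket-name-example/folder-name/my-image.jpg")
print(bucket)           # bucket-name-example
print(key)              # folder-name/my-image.jpg
print(key_prefix(key))  # folder-name/
```

The folder exists only as part of the key string, which is why a prefix is enough to emulate a directory structure.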

There are several storage classes offered by Amazon for objects stored in S3:

  • Standard Storage (S3 Standard): This is the storage class for frequently accessed objects and for quick access. S3 Standard has a millisecond first-byte latency and objects can be made publicly available.
  • Standard Infrequent Access (S3 Standard-IA): This option is used when you need data to be returned quickly, but not for frequent access. Objects smaller than 128 KB can be stored, but they are billed as if they were 128 KB. The minimum storage timeframe is 30 days. If the object is deleted before 30 days, you are still charged for 30 days. Standard-IA objects are resilient to the loss of an Availability Zone.
  • One Zone Infrequent Access (S3 One Zone-IA): Objects in this storage class are stored in just one Availability Zone, which makes it cheaper than Standard-IA. The minimum billable object size and the minimum storage timeframe are the same as for Standard-IA. Objects in this storage class are less available and less resilient, so it should be used when you have another copy or when the data can be recreated. In short, One Zone-IA is suited to long-lived data that is non-critical, replaceable, and infrequently accessed.
  • Amazon S3 Glacier Flexible Retrieval (formerly S3 Glacier): This option is used for long-term archiving and backup. Retrieving objects in this storage class can take anything from minutes to hours. The minimum storage timeframe is 90 days. It is well suited to archived data that doesn’t need to be accessed right away but may need to be retrieved in large volumes without incurring additional charges, as in backup or disaster-recovery scenarios.
  • Amazon S3 Glacier Instant Retrieval: This storage class offers cost-effective, high-speed storage for seldom-accessed, long-term data. Compared to S3 Standard-Infrequent Access, it can cut storage expenses by up to 68% when data is accessed once per quarter. This storage class is perfect for swiftly retrieving archive data like medical images, news media assets, or user-generated content archives. You can upload data directly or use S3 Lifecycle policies to move it from other S3 storage classes.
  • Amazon S3 Glacier Deep Archive: The minimum storage duration of this class is 180 days. This is the least expensive storage class, and a standard retrieval completes within 12 hours.
  • S3 Intelligent-Tiering: This storage class is designed to reduce operational overhead. Users pay a monitoring fee and AWS automatically moves each object between a frequent-access tier and a lower-cost infrequent-access tier (comparable to Standard and Standard-IA, respectively) based on the access pattern of the object. This option is designed for long-lived data with unknown or unpredictable access patterns.
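The minimum storage timeframes in the list above translate into simple billing arithmetic, which the following Python sketch makes concrete. It is a simplified illustration rather than a pricing calculator: the dictionary keys follow the storage class identifiers used by the S3 API, classes for which the list states no minimum are given 0 as an assumption, the 128 KB figure is treated as a minimum billable size, and the function names are hypothetical.

```python
# Minimum storage durations in days, taken from the list above.
# Classes with no minimum stated in the list default to 0 (assumption).
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
    "GLACIER": 90,        # S3 Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,  # S3 Glacier Deep Archive
}

# Minimum billable object size in KB for the infrequent-access classes.
MIN_BILLABLE_KB = {"STANDARD_IA": 128, "ONEZONE_IA": 128}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days of storage charged, honoring the class's minimum duration."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


def billable_kb(storage_class: str, size_kb: float) -> float:
    """Object size charged, honoring the class's minimum billable size."""
    return max(size_kb, MIN_BILLABLE_KB.get(storage_class, 0))


print(billable_days("STANDARD_IA", 10))    # 30: deleted early, still charged 30 days
print(billable_days("DEEP_ARCHIVE", 200))  # 200: already past the 180-day minimum
print(billable_kb("ONEZONE_IA", 16))       # 128: small object billed as 128 KB
```

This is why the infrequent-access and archive classes are a poor fit for short-lived or very small objects, even though their per-GB price is lower.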

The transition of objects between storage classes and their deletion can be managed through sets of rules, which are referred to as S3 Lifecycle configurations. These rules consist of actions and can be applied to a whole bucket or to a group of objects in that bucket defined by prefixes or tags. Actions are either transition actions or expiration actions. Transition actions move objects to another storage class a user-defined number of days after their creation. Expiration actions configure the deletion of objects (including noncurrent versions of versioned objects), delete markers, or incomplete multipart uploads. This is very useful for managing costs.
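As a rough illustration, a lifecycle configuration with one transition-and-expiration rule might look like the Python dictionary below. The layout follows the request shape accepted by boto3's put_bucket_lifecycle_configuration call, but the rule ID, the day counts, and the storage_class_at helper are hypothetical values chosen for this sketch. The helper only simulates the rule locally, assuming objects are uploaded as STANDARD; it does not contact AWS.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-images",    # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "folder-name/"},  # apply only to this prefix
            # Transition actions: days are counted from object creation.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Expiration action: delete the object after one year.
            "Expiration": {"Days": 365},
            # Clean up multipart uploads that were never completed.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}


def storage_class_at(rule: dict, age_days: int):
    """Storage class the rule implies for an object of the given age.

    Returns None once the object would have expired. Local simulation
    only; assumes the object was uploaded as STANDARD.
    """
    expiration_days = rule.get("Expiration", {}).get("Days")
    if expiration_days is not None and age_days >= expiration_days:
        return None
    current = "STANDARD"
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 120, 400):
    print(age, storage_class_at(rule, age))
```

With boto3 installed and credentials configured, a dictionary of this shape would be passed as the LifecycleConfiguration argument of put_bucket_lifecycle_configuration, together with the Bucket name. S3 applies lifecycle rules asynchronously, so real transitions may lag slightly behind the configured day counts.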

An illustration is given in Figure 2.1. You can find more details here: https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-class-intro.html.

Figure 2.1 – A comparison table of S3 Storage classes