Distinguishing between object tags and object metadata – AWS Services for Data Storage – MLS-C01 Study Guide

Distinguishing between object tags and object metadata

Let’s compare these two terms:

  • Object tag: An object tag is a key-value pair. Amazon S3 object tags can help you filter analytics and metrics, categorize storage, secure objects based on their categories, and track costs by category. Object tags can also be used in lifecycle rules that move objects to cheaper storage tiers. You can add a maximum of 10 tags to an object and 50 tags to a bucket. A tag key can contain up to 128 Unicode characters, and a tag value can contain up to 256 Unicode characters.
  • Object metadata: Object metadata is data that describes an object. It consists of name-value pairs and is returned as HTTP headers when the object is retrieved. There are two types: system metadata and user-defined metadata. User-defined metadata is a custom name-value pair added to an object by the user; its name must begin with x-amz-meta-. Only some system metadata, such as the storage class and encryption attributes, can be changed by the user; other system metadata, such as the object size and last-modified date, is controlled by Amazon S3. Further details are available here: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html.

Important note

Metadata names are case-insensitive, whereas tag names are case-sensitive.
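
To make the limits and the case rules above concrete, here is a minimal local sketch in Python. It makes no AWS calls; the helper names (validate_object_tags, user_metadata_headers) are hypothetical, and lowercasing the metadata names is only a simple way of modeling their case-insensitivity.

```python
MAX_OBJECT_TAGS = 10     # per-object tag limit described above
MAX_KEY_CHARS = 128      # maximum tag key length in Unicode characters
MAX_VALUE_CHARS = 256    # maximum tag value length in Unicode characters
META_PREFIX = "x-amz-meta-"


def validate_object_tags(tags):
    """Raise ValueError if a tag dict breaks the limits described above."""
    if len(tags) > MAX_OBJECT_TAGS:
        raise ValueError(f"too many tags: {len(tags)} > {MAX_OBJECT_TAGS}")
    for key, value in tags.items():
        if not 1 <= len(key) <= MAX_KEY_CHARS:
            raise ValueError(f"tag key length out of range: {key!r}")
        if len(value) > MAX_VALUE_CHARS:
            raise ValueError(f"tag value too long for key {key!r}")
    return tags


def user_metadata_headers(metadata):
    """Map user-defined metadata names to lowercase x-amz-meta-* headers."""
    return {META_PREFIX + name.lower(): value for name, value in metadata.items()}


# Tag keys are case-sensitive, so these are two distinct tags.
tags = validate_object_tags({"Project": "churn", "project": "fraud"})

# Metadata names are case-insensitive, so both spellings map to one header.
headers = user_metadata_headers({"Owner": "ml-team", "OWNER": "ml-team"})
```

Here tags keeps both entries, while headers collapses to a single x-amz-meta-owner entry.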

In the next section, you are going to learn about controlling access to buckets and objects on Amazon S3 through different policies, including the resource policy and the identity policy.

Controlling access to buckets and objects on Amazon S3

Once an object is stored in a bucket, the next major step is to manage access to it. S3 is private by default, and access is granted to other users, groups, or resources through several mechanisms: Access Control Lists (ACLs), public access settings, identity policies, and bucket policies.

Let’s look at some of these in detail.

S3 bucket policy

An S3 bucket policy is a resource policy that is attached to a bucket. Resource policies decide who can access that resource. They differ from identity policies in that identity policies are attached to identities inside an account, whereas resource policies can grant access to identities from the same account or from different accounts. Resource policies can also cover anonymous principals, which means an object can be made public through a resource policy. The following example policy allows everyone in the world to read the objects in the bucket because Principal is set to *:

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"AnyoneCanRead",
      "Effect":"Allow",
      "Principal":"*",
      "Action":["s3:GetObject"],
      "Resource":["arn:aws:s3:::my-bucket/*"]
    }
  ]
}

By default, everything in S3 is private to the owner. If you want to make only a prefix public to the world, then Resource changes to arn:aws:s3:::my-bucket/some-prefix/*. Similarly, if access is intended for a specific IAM user or IAM role, then the ARN of that identity goes in the Principal element of the policy. An IAM group cannot be used as a principal.
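
As a rough illustration of how the Resource value changes between a whole bucket and a single prefix, the policy can be generated from Python. This is only a sketch: public_read_policy is a hypothetical helper, and my-bucket and some-prefix are the placeholder names used in the text.

```python
import json


def public_read_policy(bucket, prefix=""):
    """Build a bucket policy dict that allows anonymous s3:GetObject.

    With no prefix the policy covers every object in the bucket;
    with a prefix it covers only the keys under that prefix.
    """
    cleaned = prefix.strip("/")
    key_pattern = f"{cleaned}/*" if cleaned else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AnyoneCanRead",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{key_pattern}"],
            }
        ],
    }


whole_bucket = public_read_policy("my-bucket")
one_prefix = public_read_policy("my-bucket", "some-prefix")

# json.dumps emits straight-quoted, comma-separated JSON text,
# which is the form a policy document must take.
policy_json = json.dumps(one_prefix, indent=2)
```

Generating the document with json.dumps avoids hand-editing mistakes such as a missing comma or typographic quotes.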

There can be conditions added to the bucket policy too. Let’s examine a use case where the organization wants to allow anonymous reads from a bucket, but only from particular allowlisted IP addresses. The IpAddress condition operator restricts the Allow statement to requests from the listed range (NotIpAddress would do the opposite and allow every address except that range). The policy would look something like this:

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"ParticularIPRead",
      "Effect":"Allow",
      "Principal":"*",
      "Action":["s3:GetObject"],
      "Resource":["arn:aws:s3:::my-bucket/*"],
      "Condition":{
        "IpAddress":{"aws:SourceIp":"2.3.3.6/32"}
      }
    }
  ]
}
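
To see how the IpAddress and NotIpAddress condition operators differ on an aws:SourceIp check, here is a minimal local sketch using Python's standard ipaddress module. It is a simplified model of one condition, not the real S3 policy engine, and ip_condition_matches is a hypothetical helper name.

```python
import ipaddress


def ip_condition_matches(operator, cidr, source_ip):
    """Evaluate a single aws:SourceIp condition locally (simplified model)."""
    in_range = ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr)
    if operator == "IpAddress":
        return in_range          # condition holds for addresses inside the range
    if operator == "NotIpAddress":
        return not in_range      # condition holds for addresses outside the range
    raise ValueError(f"unsupported operator: {operator}")


cidr = "2.3.3.6/32"

# A request from the listed address satisfies IpAddress but not NotIpAddress.
listed_with_ip_address = ip_condition_matches("IpAddress", cidr, "2.3.3.6")
listed_with_not_ip_address = ip_condition_matches("NotIpAddress", cidr, "2.3.3.6")

# A request from any other address behaves the opposite way.
other_with_ip_address = ip_condition_matches("IpAddress", cidr, "8.8.8.8")
other_with_not_ip_address = ip_condition_matches("NotIpAddress", cidr, "8.8.8.8")
```

In an Allow statement, IpAddress therefore behaves as an allowlist, while NotIpAddress allows everyone except the listed range.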

More examples are available in the AWS S3 developer guide, which can be found here: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html.

Block Public Access is a separate setting that lets the bucket owner guard against mistakes in a bucket policy. In a real-world scenario, a bucket can be made public by mistake through its bucket policy; AWS provides this setting to prevent such mistakes and the data leaks they cause. It adds a further level of security that applies irrespective of the bucket policy. You can enable it while creating a bucket or at any time afterward.
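
As a sketch of what this setting looks like in practice, the S3 API represents it as four boolean flags in a PublicAccessBlockConfiguration structure. The flag names below come from the S3 API rather than from the text above, block_all_public_access is a hypothetical helper, and the boto3 call is left as a comment because it needs boto3 installed and AWS credentials configured.

```python
def block_all_public_access():
    """Return a PublicAccessBlockConfiguration with every flag enabled."""
    return {
        "BlockPublicAcls": True,        # reject new public ACLs
        "IgnorePublicAcls": True,       # ignore existing public ACLs
        "BlockPublicPolicy": True,      # reject bucket policies that grant public access
        "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
    }


config = block_all_public_access()

# With boto3 available, the configuration could be applied to a bucket
# (my-bucket is a placeholder name) like this:
#   import boto3
#   boto3.client("s3").put_public_access_block(
#       Bucket="my-bucket",
#       PublicAccessBlockConfiguration=config,
#   )
```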

Identity policies are meant for IAM users and IAM roles. Identity policies are evaluated when an IAM identity (user or role) requests access to a resource. All requests are denied by default. If an identity needs to access any services, resources, or actions, then access must be granted explicitly through identity policies. The example policy that follows can be attached to an IAM user; it allows that user full RDS access within a specific Region (us-east-1 in this example), plus the read-only RDS Describe actions on any resource:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "rds:*",
            "Resource": ["arn:aws:rds:us-east-1:*:*"]
        },
        {
            "Effect": "Allow",
            "Action": ["rds:Describe*"],
            "Resource": ["*"]
        }
    ]
}
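
The default-deny behavior described above can be sketched with a small local evaluator. This is a deliberately simplified model, not the real IAM evaluation logic: it handles only Allow statements with * wildcards, ignores explicit Deny statements and conditions, and matches actions case-sensitively. The is_allowed helper, the account ID, and the database names are hypothetical.

```python
from fnmatch import fnmatchcase

IDENTITY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "rds:*",
         "Resource": ["arn:aws:rds:us-east-1:*:*"]},
        {"Effect": "Allow", "Action": ["rds:Describe*"],
         "Resource": ["*"]},
    ],
}


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Return True only if some Allow statement matches; otherwise deny."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, pattern)
                        for pattern in _as_list(statement["Action"]))
        resource_ok = any(fnmatchcase(resource, pattern)
                          for pattern in _as_list(statement["Resource"]))
        if action_ok and resource_ok:
            return True
    return False  # nothing matched, so the request falls back to the default deny


east_db = "arn:aws:rds:us-east-1:123456789012:db:example-db"
west_db = "arn:aws:rds:eu-west-1:123456789012:db:example-db"

create_in_east = is_allowed(IDENTITY_POLICY, "rds:CreateDBInstance", east_db)
create_in_west = is_allowed(IDENTITY_POLICY, "rds:CreateDBInstance", west_db)
describe_in_west = is_allowed(IDENTITY_POLICY, "rds:DescribeDBInstances", west_db)
s3_read = is_allowed(IDENTITY_POLICY, "s3:GetObject", "arn:aws:s3:::my-bucket/key")
```

Under this model, creating an instance is allowed only in us-east-1, describing instances is allowed in any Region, and the S3 request is denied because no statement mentions it.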

ACLs are used to grant coarse-grained permissions, typically to other AWS accounts. An ACL is a sub-resource of a bucket or an object. A bucket or object can be made public quickly via ACLs, but AWS doesn’t recommend doing this, and you shouldn’t expect questions about it on the test. ACLs are good to know about, but they are not as flexible as an S3 bucket policy.

Now, let’s learn about the methods to protect our data in the next section.