This is one of the most commonly featured topics in AWS exams, so make sure you know it well beforehand. In this section, you will learn about Amazon Relational Database Service (RDS).
AWS provides several relational databases as a service to its users. Users can also run their desired database on EC2 instances, but this approach has drawbacks. The biggest is that the instance is only available in one Availability Zone in a Region. The EC2 instance has to be administered and monitored to avoid failures. Custom scripts are required to maintain data backups over time. Any major or minor database version update results in downtime. Finally, a database running on an EC2 instance cannot be scaled easily when the load increases, because setting up replication is not an easy task.
RDS provides managed database instances that can themselves hold one or more databases. Imagine a database server running on an EC2 instance that you do not have to manage or maintain. You need only access the server and create databases in it. AWS will manage everything else, such as the security of the instance, the operating system running on the instance, the database versions, and high availability of the database server. RDS supports multiple engines, such as MySQL, Microsoft SQL Server, MariaDB, Amazon Aurora, Oracle, and PostgreSQL. You can choose any of these based on your requirements.
The foundation of Amazon RDS is a database instance, which runs one of the supported engines and can hold multiple databases created by the user. A database instance can be accessed only through its database DNS endpoint (the CNAME, which is an alias for the canonical name in a domain name system database), which points to the primary instance. RDS uses standard database engines, so accessing a database on Amazon RDS with a client tool works the same way as accessing a database on a self-managed server.
Now that you understand what Amazon RDS provides, let’s look at the failover process in Amazon RDS. You will cover what Amazon RDS offers if something goes wrong with an RDS instance.
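As a small illustration of that last point, the sketch below builds the settings a MySQL-compatible client would need for a self-managed server and for an RDS instance. Both host names, the port, the database name, and the user are hypothetical placeholders, and `connection_settings` is a helper invented for this example, not part of any AWS SDK.

```python
def connection_settings(host, port=3306, database="appdb", user="app_user"):
    """Return generic settings a MySQL-compatible client would need.

    All defaults are hypothetical placeholders used only for illustration.
    """
    return {"host": host, "port": port, "database": database, "user": user}


# Hypothetical self-managed server running on an EC2 instance.
self_managed = connection_settings("db.internal.example.com")

# Hypothetical RDS endpoint; the real value is shown in the RDS console or API.
rds = connection_settings("mydb.abc123example.us-east-1.rds.amazonaws.com")

# Only the host differs; the client tool and the other settings stay the same.
differing_keys = [key for key in rds if rds[key] != self_managed[key]]
print(differing_keys)
```

In other words, moving an application from a self-managed server to RDS should, under these assumptions, mostly be a matter of pointing it at the new endpoint.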
RDS instances can be Single-AZ or Multi-AZ. In Multi-AZ, multiple instances work together, similar to an active-passive failover design.
A Single-AZ RDS instance has storage allocated for its own use. In a nutshell, it has one attached block store (EBS storage) in the same Availability Zone as the instance. This makes both the databases and the storage of the RDS instance vulnerable to an Availability Zone failure. The storage type can be SSD (gp2 or io1) or magnetic. To secure the RDS instance, it is advised to attach a security group and allow access only as required.
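The settings described above map onto the parameters of the `create_db_instance` call in boto3, the AWS SDK for Python. The following is a minimal sketch, not a production setup: the identifier, instance class, storage size, security group ID, user name, and password are all placeholder assumptions. The function that actually sends the request is defined but not called, because it needs boto3 and AWS credentials.

```python
def single_az_instance_params(identifier, security_group_id, password):
    """Build keyword arguments for boto3's RDS create_db_instance call."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t3.micro",   # placeholder instance class
        "AllocatedStorage": 20,             # size in GiB (placeholder)
        "StorageType": "gp2",               # SSD; io1 and standard also exist
        "MultiAZ": False,                   # Single-AZ: no standby replica
        "VpcSecurityGroupIds": [security_group_id],
        "MasterUsername": "admin",
        "MasterUserPassword": password,
    }


def create_single_az_instance(params):
    """Send the request; needs boto3 and AWS credentials, so not run here."""
    import boto3  # imported lazily so the parameter builder works without it

    rds = boto3.client("rds", region_name="us-east-1")
    return rds.create_db_instance(**params)


# Placeholder values; never hard-code a real password like this.
params = single_az_instance_params(
    "example-db", "sg-0123456789abcdef0", "change-me-please"
)
print(params["MultiAZ"], params["StorageType"])
```

Keeping the parameter builder separate from the API call makes it easy to review or test the configuration without touching an AWS account.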
Multi-AZ is the recommended way to design the architecture to prevent failures and keep applications highly available. With the Multi-AZ feature, a standby replica is kept in sync with the primary instance through synchronous replication. The standby instance has its own storage in its assigned Availability Zone. The standby cannot be accessed directly, because all RDS access goes through a single database DNS endpoint (CNAME) that points to the primary; the endpoint only points to the standby after a failover. The standby therefore provides no performance benefit, but it does improve the availability of the RDS instance. The standby can only be placed in the same Region as the primary, in a subnet in another Availability Zone inside the same VPC. When a Multi-AZ RDS instance is online, backups are taken from the standby replica, so they do not affect the performance of the primary. In a Single-AZ instance, a backup operation can cause noticeable availability and performance issues.
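An existing Single-AZ instance can be switched to Multi-AZ by modifying it. The sketch below shows how that request might look with boto3's `modify_db_instance` call. The instance identifier is a placeholder assumption, and the function that sends the request is defined but not called, because it needs boto3 and AWS credentials.

```python
def enable_multi_az_params(identifier):
    """Build keyword arguments for boto3's RDS modify_db_instance call."""
    return {
        "DBInstanceIdentifier": identifier,
        "MultiAZ": True,            # ask RDS to create a synchronous standby
        "ApplyImmediately": True,   # do not wait for the maintenance window
    }


def enable_multi_az(identifier):
    """Send the request; needs boto3 and AWS credentials, so not run here."""
    import boto3  # imported lazily so the parameter builder works without it

    rds = boto3.client("rds", region_name="us-east-1")
    return rds.modify_db_instance(**enable_multi_az_params(identifier))


params = enable_multi_az_params("example-db")  # placeholder identifier
print(params)
```

Note that the request does not name the standby or give it an address of its own to connect to, which is consistent with the standby being reachable only through the shared endpoint after a failover.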
To understand the workings of Multi-AZ, let’s take an example of a Single-AZ instance and expand it to Multi-AZ.
Imagine you have an RDS instance running in Availability Zone AZ-A of the us-east-1 Region inside a VPC named db-vpc. This is the primary instance in a Single-AZ design, and its storage is allocated in AZ-A. Once you opt for a Multi-AZ deployment with a second Availability Zone called AZ-B, AWS creates a standby instance in AZ-B of the us-east-1 Region inside the db-vpc VPC and allocates storage for the standby in AZ-B. Along with that, RDS enables synchronous replication from the primary instance to the standby replica. As you learned earlier, the only way to access the RDS instance is via the database CNAME, so every access request goes to the primary instance. When a write request arrives at the endpoint, it is routed to the primary instance. The primary writes the data to the block storage attached to it and, at the same time, replicates the same data to the standby instance. Finally, the standby instance commits the data to its own block storage.
Because the primary instance writes the data to its storage and replicates it to the standby in parallel, there is almost no time lag between the two commit operations. If an error occurs on the primary instance, RDS detects it and repoints the database endpoint to the standby instance. Clients accessing the database may experience a brief interruption while this happens; the failover typically completes within 60-120 seconds. For this reason, Multi-AZ provides high availability rather than fault tolerance: there is always some impact during the failover operation.
You should now understand failover management on Amazon RDS. In the next section, you will learn about taking automatic RDS backups, using snapshots to restore in the event of a failure, and read replicas.
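The write path described above can be sketched as a toy simulation. This is not how RDS is implemented internally; the class names are invented for illustration, and the in-memory lists merely stand in for the block storage in AZ-A and AZ-B. The point is that a write is acknowledged only after both the primary and the standby have committed it.

```python
from concurrent.futures import ThreadPoolExecutor


class BlockStorage:
    """Stand-in for the block storage attached to one instance."""

    def __init__(self, az):
        self.az = az
        self.blocks = []

    def commit(self, record):
        self.blocks.append(record)
        return True


class MultiAzDatabase:
    """Toy model: one endpoint, a primary in AZ-A, and a standby in AZ-B."""

    def __init__(self):
        self.primary = BlockStorage("AZ-A")
        self.standby = BlockStorage("AZ-B")

    def write(self, record):
        # The endpoint routes the write to the primary, which commits locally
        # and replicates to the standby in parallel.
        with ThreadPoolExecutor(max_workers=2) as pool:
            local = pool.submit(self.primary.commit, record)
            remote = pool.submit(self.standby.commit, record)
            # Synchronous replication: acknowledge only after both commits.
            return local.result() and remote.result()


db = MultiAzDatabase()
acknowledged = [db.write(r) for r in ("order-1", "order-2", "order-3")]
print(acknowledged, db.primary.blocks == db.standby.blocks)
```

Because every acknowledged write exists in both places, the standby holds the same data as the primary at the moment a failover is needed.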
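Since clients can lose their connection for a short time during a failover, applications usually benefit from retrying against the same endpoint. The following is a minimal sketch of such a retry loop; the function name, the 120-second budget, and the 5-second delay are assumptions chosen to match the failover window mentioned above, and the demo uses a fake connection function and a fake clock rather than a real database.

```python
import time


def connect_with_retry(connect, timeout_s=120.0, delay_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or timeout_s seconds have passed.

    Returns a (connection, attempts) tuple.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect(), attempts
        except ConnectionError:
            if clock() + delay_s > deadline:
                raise  # the failover took longer than the client will wait
            sleep(delay_s)


# Demo with a fake endpoint that refuses the first three connections, as if
# a failover were in progress. No real waiting: sleep advances a fake clock.
fake_now = [0.0]
failures_left = [3]


def fake_connect():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("endpoint not reachable yet")
    return "connected"


def fake_sleep(seconds):
    fake_now[0] += seconds


result, attempts = connect_with_retry(
    fake_connect, sleep=fake_sleep, clock=lambda: fake_now[0]
)
print(result, attempts, fake_now[0])
```

The client keeps using the same endpoint name throughout, because RDS repoints that name to the standby; no application-side configuration change is assumed.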