Amazon Relational Database Service, better known as RDS, is a managed service for operating and scaling relational databases in the cloud.
As we saw in the last post, an EC2 instance is a virtual machine in the cloud and can host any application the user wants. So when we think about databases in the cloud, why don’t we just run an EC2 instance with the desired database on it? The idea is valid and can be applied, but it has some caveats:
- We have to update the database manually. This applies to both minor and major version updates.
- We are responsible for the operating system running on the instance. Security updates, as well as any other configuration needed, have to be performed by us.
- Backups also have to be created manually, and automating them takes additional effort.
- To avoid a single point of failure, it’s necessary to configure an ELB and more than one instance in a target group, which means extra cost.
Since running a database instance is something most applications in the cloud have to do, Amazon launched a dedicated service to handle it: RDS. RDS provides a database instance while abstracting away the operating system that runs it, which makes it easier to administer. We can also choose to run our instance in a Multi-AZ deployment, which performs failover automatically, without having to create and manage an ELB. With RDS we can run most of the popular databases more easily than on an EC2 instance. The currently supported databases are:
- Microsoft SQL Server
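To make the Multi-AZ option more concrete, here is a minimal sketch of the request we might send when creating an instance. It only builds a dictionary of parameters; the function name `multi_az_request`, the instance class `db.t3.micro` and the 20 GiB default are illustrative assumptions, not recommendations. The keys follow the parameter names of the boto3 `create_db_instance` call.

```python
def multi_az_request(instance_id, engine, username, password,
                     security_group_ids, storage_gib=20):
    """Build the parameters for a Multi-AZ RDS instance (hypothetical helper)."""
    return {
        "DBInstanceIdentifier": instance_id,
        "Engine": engine,                   # e.g. "mysql" or "postgres"
        "DBInstanceClass": "db.t3.micro",   # placeholder size, pick your own
        "AllocatedStorage": storage_gib,    # storage in GiB
        "MasterUsername": username,
        "MasterUserPassword": password,     # read it from a secret store
        "MultiAZ": True,                    # RDS keeps a standby and fails over
        "VpcSecurityGroupIds": list(security_group_ids),
    }
```

Assuming boto3 is installed and credentials are configured, the result could be passed along as `boto3.client("rds").create_db_instance(**request)`.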
RDS instances use VPC Security Groups, which means we have to state explicitly which ports are open for users to connect to. Traffic to any port that isn’t specified is denied. The storage types are also the same as in EC2. General Purpose SSD has burstable performance: it accumulates I/O credits that let it burst up to 3,000 IOPS. If we want to guarantee the number of IOPS available, we can pick Provisioned IOPS SSD. For simple tests and development environments, we can use Magnetic storage.
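The burst behaviour of General Purpose SSD is easier to see with some arithmetic. The sketch below assumes the figures AWS documents for gp2 storage, which are not stated in this post: a baseline of 3 IOPS per GiB with a floor of 100 IOPS, a burst ceiling of 3,000 IOPS, and a full bucket of 5.4 million I/O credits. Treat them as assumptions and check the current documentation before relying on them.

```python
# Assumed gp2 figures (taken from AWS storage docs, not from this post).
BASELINE_IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
BURST_IOPS = 3000
FULL_BUCKET_CREDITS = 5_400_000


def baseline_iops(size_gib: int) -> int:
    """Sustained IOPS the volume earns from its size alone."""
    return max(MIN_BASELINE_IOPS, BASELINE_IOPS_PER_GIB * size_gib)


def burst_seconds(size_gib: int) -> float:
    """How long a full credit bucket can sustain BURST_IOPS.

    Returns infinity when the baseline already reaches the burst ceiling.
    """
    base = baseline_iops(size_gib)
    if base >= BURST_IOPS:
        return float("inf")
    # Credits drain at the burst rate and refill at the baseline rate.
    return FULL_BUCKET_CREDITS / (BURST_IOPS - base)


if __name__ == "__main__":
    for size in (20, 100, 500):
        print(size, "GiB:", baseline_iops(size), "IOPS baseline,",
              round(burst_seconds(size)), "s of burst")
```

Under these assumptions a 100 GiB volume has a 300 IOPS baseline and can burst for about 2,000 seconds, which is why small volumes feel fast in short tests and slow under sustained load.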
Instance storage can be as small as 5 GiB and as large as 3 TiB. RDS can take daily automated backups and keep them for up to 35 days; a retention of 0 days disables them. As with EC2 snapshots, backups are incremental: we only pay for the data that changed since the previous snapshot, which reduces the cost of keeping backups for several days. One important thing to notice is that backups taken against the production instance might cause I/O freezes and slow down the service. Another is that automated backups are removed together with the RDS instance. If you want to keep the data after the instance is gone, it is necessary to take a manual snapshot of the instance.
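As a rough sketch of both points, the helpers below set the retention window and take a manual snapshot before an instance is deleted. The function names and the snapshot naming scheme are hypothetical. The `rds` argument is assumed to be a boto3 RDS client (`boto3.client("rds")`), and `modify_db_instance` and `create_db_snapshot` are the boto3 calls being used.

```python
from datetime import date

MAX_RETENTION_DAYS = 35  # upper bound quoted in the text


def set_backup_retention(rds, instance_id: str, days: int) -> None:
    """Set the automated-backup retention window, in days (0 disables it)."""
    if not 0 <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be 0..{MAX_RETENTION_DAYS} days, got {days}"
        )
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        BackupRetentionPeriod=days,
        ApplyImmediately=True,
    )


def snapshot_before_delete(rds, instance_id: str, today: date) -> str:
    """Take a manual snapshot, which outlives the instance; return its name."""
    snapshot_id = f"{instance_id}-final-{today.isoformat()}"
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=instance_id,
    )
    return snapshot_id
```

Passing the client in as a parameter keeps the helpers easy to try against a stub before pointing them at a real account.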
To avoid I/O freezes we can use a read replica. A read replica is a read-only copy of the instance: it can be queried but not written to. Read replicas are supported on MySQL (InnoDB only), PostgreSQL and Aurora. On MySQL and PostgreSQL they use the database’s native replication. They help with I/O freezes because the snapshot can be taken from the replica instead of the master instance. Read requests can also be served by the replica, so the master instance handles fewer requests and any freeze has less impact on the service.
What happens if we want to change the database configuration? Since we don’t have access to the operating system, we need another way to reach the database properties. This is done through Parameter Groups. They contain the configuration values we want to apply to the instance we’re running. If we don’t specify a parameter group, RDS uses a default one, which contains the default property values for the respective database.
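A minimal sketch of that workflow: create a custom parameter group, then write our settings into it. The helper name `apply_settings` is hypothetical, and `rds` is assumed to be a boto3 RDS client. `create_db_parameter_group` and `modify_db_parameter_group` are the boto3 calls used. The limit of 20 parameters per request is what I understand the API to enforce, so treat it as an assumption.

```python
MAX_PARAMETERS_PER_CALL = 20  # assumed per-request limit of the RDS API


def apply_settings(rds, group_name: str, family: str, settings: dict) -> list:
    """Create a parameter group and store the given settings in it."""
    rds.create_db_parameter_group(
        DBParameterGroupName=group_name,
        DBParameterGroupFamily=family,  # e.g. "mysql8.0"
        Description=f"Custom settings for {group_name}",
    )
    parameters = [
        {
            "ParameterName": name,
            "ParameterValue": str(value),
            # "pending-reboot" is accepted for static and dynamic parameters.
            "ApplyMethod": "pending-reboot",
        }
        for name, value in sorted(settings.items())
    ]
    # Send the settings in chunks to stay under the per-request limit.
    for start in range(0, len(parameters), MAX_PARAMETERS_PER_CALL):
        rds.modify_db_parameter_group(
            DBParameterGroupName=group_name,
            Parameters=parameters[start:start + MAX_PARAMETERS_PER_CALL],
        )
    return parameters
```

For example, `apply_settings(boto3.client("rds"), "blog-db-params", "mysql8.0", {"max_connections": 200})` would create the group and queue the change. The group still has to be attached to an instance, and the instance rebooted, before the value takes effect.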