How to Use Autoscaling

Autoscaling should only be used if you or one of your team members is an expert sysadmin.

If used by developers who are not experienced with server administration, the complexity of autoscaling is likely to cause downtime and other breakage.

This is an advanced tutorial.

ServerPilot cannot provide support for autoscaling.

How Autoscaling Works

Autoscaling is an approach to handling changes in the volume of requests your app is receiving by dynamically adding and removing servers.

An autoscaling system requires the following components:

Autoscaler. An autoscaling service monitors system status, makes scaling decisions, and interacts with a server provider's API to provision new servers and reconfigure a load balancer to send traffic to the new servers once they are online. The autoscaler will also remove servers from the load balancer and destroy them when they are no longer needed.

Server image for app servers. The autoscaler creates all new servers from a server image you have created specifically for this purpose. This server image must be ready to serve requests once it is brought online.

Remote database. As there will be multiple app servers that are created and deleted dynamically, your app's database cannot reside on the app servers. You must use a central database that is accessible from all of your app servers.
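The order of operations described for the autoscaler can be sketched as follows. This is only an illustration: the four helper functions are hypothetical stand-ins that print what they would do, and a real autoscaler would call your server provider's API instead. The server and image names are made up.

```shell
# Hypothetical stand-ins for provider API calls (assumptions, not a real API).
provision_server()          { echo "provision $1 from image $2"; }
add_to_load_balancer()      { echo "add $1 to load balancer"; }
remove_from_load_balancer() { echo "remove $1 from load balancer"; }
destroy_server()            { echo "destroy $1"; }

# scale_up SERVER_NAME IMAGE_NAME
# Create the server first, and only send it traffic once that succeeded.
scale_up() {
  provision_server "$1" "$2" &&
    add_to_load_balancer "$1"
}

# scale_down SERVER_NAME
# Stop sending traffic to the server before destroying it.
scale_down() {
  remove_from_load_balancer "$1" &&
    destroy_server "$1"
}

scale_up app-3 app-image-v1
scale_down app-3
```

The point of the sketch is the ordering: a server joins the load balancer only after it exists, and leaves the load balancer before it is destroyed.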

Choosing an Autoscaling Service

Although it is possible to write your own autoscaling service, the simplest and most popular approach is to use your provider's autoscaling service.

Learn about autoscaling on Amazon Web Services (AWS).
Learn about autoscaling on Google Compute Engine (GCE).

These autoscaling services allow you to define the parameters of your autoscaling configuration, such as the minimum and maximum number of servers and the levels of average CPU usage that should trigger increasing or decreasing the number of servers that are running.
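To make these parameters concrete, here is a minimal sketch of a threshold-based scaling decision. It is not how AWS or GCE implement autoscaling; the variable names, the function name, and the example values (2 to 10 servers, 70% and 30% CPU) are all assumptions chosen for illustration.

```shell
# Illustrative parameters (assumed values, not provider defaults).
MIN_SERVERS=2       # never run fewer servers than this
MAX_SERVERS=10      # never run more servers than this
SCALE_UP_CPU=70     # add a server when average CPU (%) is above this
SCALE_DOWN_CPU=30   # remove a server when average CPU (%) is below this

# desired_server_count CURRENT_COUNT AVERAGE_CPU_PERCENT
# Prints how many servers should be running. The body runs in a
# subshell so its variables do not leak into the caller.
desired_server_count() (
  current=$1
  cpu=$2
  desired=$current

  if [ "$cpu" -gt "$SCALE_UP_CPU" ]; then
    desired=$((current + 1))
  elif [ "$cpu" -lt "$SCALE_DOWN_CPU" ]; then
    desired=$((current - 1))
  fi

  # Clamp the result to the configured minimum and maximum.
  if [ "$desired" -lt "$MIN_SERVERS" ]; then desired=$MIN_SERVERS; fi
  if [ "$desired" -gt "$MAX_SERVERS" ]; then desired=$MAX_SERVERS; fi

  echo "$desired"
)

desired_server_count 3 85   # prints 4
```

Keeping a gap between the scale-up and scale-down thresholds helps avoid adding and removing servers repeatedly when CPU usage hovers around a single value.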

Preparing a Server Image for Autoscaling

Autoscaling services require a server image (a snapshot) that will be used for creating additional instances of your server.

To safely use multiple instances of a server managed by ServerPilot, you must uninstall ServerPilot from the server before you create the image. If you create the image with ServerPilot still installed, the servers created from it are likely to suffer breakage and downtime.

The recommended approach to creating the image for autoscaling involves three steps:

1. Create a clone of your original server.
2. Customize this temporary server. This includes uninstalling ServerPilot from it.
3. Create an image of the temporary server. You will use this image with your autoscaling service.

To uninstall ServerPilot from the temporary server, SSH in as root and run the following command:

sudo apt-get remove 'sp-serverpilot-*'

The above command will only remove ServerPilot. It will not change your server's configuration.
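Before creating the image, you may want to confirm that no ServerPilot packages are still installed. The following is a sketch, assuming a Debian or Ubuntu server where dpkg-query is available and that the package names match the sp-serverpilot-* pattern used above. The function name is hypothetical.

```shell
# Prints the name of every sp-serverpilot-* package that is still fully
# installed (dpkg status "ii"). Packages that were removed but left
# config files behind (status "rc") are not listed.
installed_serverpilot_packages() {
  dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' 'sp-serverpilot-*' 2>/dev/null |
    awk '$1 == "ii" { print $2 }'
}

remaining="$(installed_serverpilot_packages)"
if [ -n "$remaining" ]; then
  echo "ServerPilot packages still installed:"
  echo "$remaining"
else
  echo "No installed ServerPilot packages found."
fi
```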

Now that you have removed ServerPilot from this temporary server, you can also make any other customizations you need in your autoscaling image. For example, you should deploy your app's latest codebase to this temporary server before you create the image you use with autoscaling.

Using a Central Database

For a central database that is accessible from all of your app servers, choose one of the following:

Use a server managed by ServerPilot that you have customized to enable remote MySQL access from your app servers.
Use a hosted MySQL database from your cloud provider, such as Amazon RDS or Google Cloud SQL.

Unless you have a MySQL expert on your team who is experienced with scaling MySQL, you will generally want to use your provider's cloud MySQL service.
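Because every server created from your image must reach the same central database, it can be worth checking that your app's database host setting does not point at the local machine before you create the image. The sketch below assumes your app reads its database host from an environment variable named DB_HOST; that variable name and the function name are hypothetical, so adapt them to however your app is actually configured.

```shell
# is_remote_db_host HOST
# Succeeds unless HOST is empty or refers to the local machine
# (localhost, a 127.x.x.x loopback address, or ::1).
is_remote_db_host() {
  case "$1" in
    "" | localhost | 127.* | ::1) return 1 ;;
    *) return 0 ;;
  esac
}

# DB_HOST is an assumed variable name; default to localhost if unset.
DB_HOST="${DB_HOST:-localhost}"

if is_remote_db_host "$DB_HOST"; then
  echo "Database host '$DB_HOST' looks remote."
else
  echo "Warning: database host '$DB_HOST' points at this server."
fi
```

This check only inspects the setting. It does not confirm that the database is actually reachable from the app servers.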
