How to Use the 'top' Command

If you notice an increase in CPU or memory usage on your server, the top command is one of the easiest and most reliable means of determining what is using those resources in real time.

To use top, SSH in to your server as root and enter the following:

top -c

A lot of information will appear in your terminal and will refresh every three seconds; it will have a similar layout to the one below.
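If you would rather capture a single snapshot than watch the live display, top can also run non-interactively. The following is a minimal sketch, assuming the procps version of top that ships with most Linux distributions; the helper name top_snapshot and the 15-line default are illustrative choices, not part of top itself.

```shell
# Hypothetical helper: print one non-interactive snapshot of top.
# -b = batch mode (plain text, no screen control)
# -n 1 = exit after one iteration
# -c = show the full command line, as in the interactive example
# The optional argument is how many lines to keep (default 15: the summary
# lines, the column header, and the first few processes).
top_snapshot() {
    top -b -n 1 -c | head -n "${1:-15}"
}
```

Running top_snapshot prints the summary and the first few processes once and then returns to the shell prompt, which makes the output easy to save to a file or paste into a support request.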

Understanding the System Summary

The first five lines of your top results show a summary of your server's state at that moment.

Line One: Uptime and Load Averages

The first grouping of text in the first line shows the current time on the left, followed on the right by how long the server has been running since its last reboot. In this example, the server has been running 49 days, 19 hours, and 50 minutes since reboot:

The next grouping shows how many active users are logged in to the server over SSH. In this example, only one user is logged in:

The final grouping is some of the most important data: the server's load averages over three time periods, 1 minute, 5 minutes, and 15 minutes.

A general guideline for interpreting those numbers is that the load should be less than or equal to the number of processors (CPU cores) your server has.

The three averages are given to help you know whether a load spike happened for a very short amount of time or a very long one. If only the one-minute average is high, it was a transient spike. If the five-minute average is also higher than normal, the issue persisted for at least five minutes. If the 15-minute average is high, it was an even longer event.

Short load spikes can frequently occur due to brief high-resource processes, and these are typically not a problem (unless they are high enough to cause a serious process backlog on a busy server). Longer load spikes could indicate a more pressing situation—a user could be running a website with too many plugins or someone could be performing a long-running task that is elevating the load.
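The load guideline above can be checked from the command line. This is a rough sketch, assuming a Linux server where /proc/loadavg and the nproc command are available; the helper name load_status and its "ok"/"high" labels are made up for illustration.

```shell
# Hypothetical helper: compare a load average with the number of CPU cores.
# Usage: load_status LOAD CORES   (prints "ok" or "high")
load_status() {
    # awk does the floating-point comparison that plain sh cannot.
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load + 0 <= cores + 0) print "ok"; else print "high"
    }'
}

# Live check: the first three fields of /proc/loadavg are the 1-, 5-, and
# 15-minute load averages; nproc prints the number of available processors.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    cores=$(nproc)
    echo "1 min:  $one ($(load_status "$one" "$cores"))"
    echo "5 min:  $five ($(load_status "$five" "$cores"))"
    echo "15 min: $fifteen ($(load_status "$fifteen" "$cores"))"
fi
```

If only the 1-minute line reports "high", the spike is probably transient; if all three do, the load has been elevated for a sustained period.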

Line Two: Tasks

The second line of top results shows the total number of tasks present on your server and their current states. "Tasks" are the processes running on your server.

In this example, you can see the server has 143 total tasks, 1 running task, 142 sleeping tasks, 0 stopped tasks, and 0 zombie tasks.

These tasks are defined as follows:

  • Running—This task is actively using the CPU or is ready to run and waiting for its turn on the CPU.
  • Sleeping—This task is in a waiting state; it may be waiting on disk activity, on remote content, or on some other event before it can continue.
  • Stopped—This task has been actively stopped by a STOP signal; it will remain stopped until told to continue.
  • Zombie—This task is defunct; it has completed its execution, but an entry still exists for it in the process table. A few zombie tasks are nothing to worry about; however, many zombies may need to be examined.
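To reproduce a similar tally outside of top, you can count the state codes that ps reports. This is only a sketch: the helper name count_states is hypothetical, and grouping the D (uninterruptible sleep) and I (idle kernel thread) codes under "sleeping" is an assumption about how top builds its summary.

```shell
# Hypothetical helper: read one process state code per line on stdin
# (for example from `ps -eo stat=`) and tally them by category.
count_states() {
    awk '{
        s = substr($1, 1, 1)    # the first letter is the main state code
        if (s == "R") running++
        else if (s == "S" || s == "D" || s == "I") sleeping++   # assumed grouping
        else if (s == "T" || s == "t") stopped++
        else if (s == "Z") zombie++
    }
    END {
        printf "total=%d running=%d sleeping=%d stopped=%d zombie=%d\n",
            NR, running, sleeping, stopped, zombie
    }'
}
```

On a live server, ps -eo stat= | count_states prints one line of totals. The numbers may differ slightly from top's because the two commands sample the process table at different moments.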

Line Three: CPU States

The third line comprises several abbreviations and symbols.

The overall percentage of CPU usage across all cores, listed as %Cpu(s), is divided into the following:

  • us or "user"—This indicates how much CPU time is being taken up by user-level processes that have not been "niced," that is, processes running at normal or raised priority.
  • sy or "system"—This shows how much CPU time is being spent in the kernel, for example handling system calls on behalf of processes.
  • ni or "nice"—This is the amount of CPU time spent on low priority, or "nice," tasks.
  • id or "idle"—This is the amount of CPU time spent idle.
  • wa or "wait," specifically "I/O wait"—I/O stands for "input/output," which pertains to disk read/write activity; this can be a common source of load, especially on non-SSD servers.
  • hi or "hardware interrupts"—This percentage does not have much relevance when analyzing load.
  • si or "software interrupts"—This also does not have much relevance.
  • st or "steal"—This is how much CPU time the hypervisor has taken away from your virtual machine to run other workloads on the same physical host. It is usually safe to ignore unless it is a high number; high levels of steal can indicate a "noisy neighbor" problem and should be reported to your web host.

The percentages across all eight categories will total 100 percent, allowing for rounding.
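Because the eight categories add up to 100 percent, overall CPU utilization is simply 100 minus the idle figure. The sketch below assumes the "%Cpu(s):" line format described above, with a period as the decimal separator; the helper name cpu_busy is hypothetical, and the sample percentages in the usage note are made up for illustration.

```shell
# Hypothetical helper: read a "%Cpu(s):" summary line on stdin and print
# the busy percentage, computed as 100 minus the idle ("id") value.
cpu_busy() {
    awk '{
        for (i = 2; i <= NF; i++) {
            # The number immediately before the "id" label is the idle percentage.
            if ($i == "id," || $i == "id") {
                printf "%.1f\n", 100 - $(i - 1)
                exit
            }
        }
    }'
}
```

For example, feeding it a line in which idle is 96.9 prints 3.1. On a live server you could pipe in the third line of a batch-mode snapshot, such as top -b -n 1 | sed -n '3p' | cpu_busy.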

Lines Four and Five: Memory Use

These two lines show the state of the system's memory (RAM) measured in kibibytes (KiB). Read our companion tutorial to view your server's memory usage in megabytes (MB).

The first line of memory use shows actual physical memory; this is listed as the total memory, the amount of memory being actively used, the amount of free memory, and the amount of memory being used by cached or buffered files.

Files stored in RAM can be accessed orders of magnitude more quickly than files stored on the hard drive, so the server automatically places some resources into memory to speed things up. If that memory is needed by processes, that memory is instantly freed.

The second line of memory use shows swap, also called "swap space": an area of disk the server uses as overflow when RAM fills up.

If you see a lot of swap space being used on the second line, the server is running out of RAM and is using hard drive space to temporarily store the files it would instead place in RAM. This is very inefficient and will slow your server down considerably.

You never want the server to use swap space; when swap space is being used, it's time to consider upgrading your server's RAM.
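Swap usage can also be read directly from the kernel. This is a small sketch, assuming a Linux server that exposes SwapTotal and SwapFree (in kB) in /proc/meminfo; the helper name swap_usage and its output wording are illustrative only.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and report
# how much swap is in use, in MiB and as a percentage of total swap.
swap_usage() {
    awk '
        $1 == "SwapTotal:" { total = $2 }     # values are reported in kB
        $1 == "SwapFree:"  { swfree = $2 }
        END {
            used = total - swfree
            pct = (total > 0) ? 100 * used / total : 0   # avoid dividing by zero
            printf "swap used: %.1f MiB (%.1f%%)\n", used / 1024, pct
        }'
}
```

Run it as swap_usage < /proc/meminfo. A result that stays well above zero over time is the signal described above that the server is short on RAM.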

Understanding the Process List

The process list contains detailed information about your server's resource consumption.

Each process is a task running on the server, and in this view, each row is a single process and each column contains data about that process.

In this default presentation, the columns represent the following:

  • PID—The process identifier.
  • USER—The system user running that process.
  • PR—The priority of the task.
  • NI—The niceness of the task.
  • VIRT—The total amount of virtual memory the task is using.
  • RES—The nonswapped physical memory a task is using.
  • SHR—The amount of shared memory available to a task.
  • S—The status of the process, listed as uninterruptible sleep (D), running (R), sleeping (S), traced or stopped (T), or zombie (Z).
  • %CPU—The percentage of CPU time that single process has used since the last screen refresh.
  • %MEM—The percentage of RAM that process is using.
  • TIME+—The total amount of CPU time that process has used since it started, shown to a granularity of hundredths of a second.
  • COMMAND—The command name or the command line used to start the process.

In the majority of cases, USER, %CPU, %MEM, and COMMAND will give you the most essential information needed to understand what is using the most resources on your server.
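To focus on just those four columns, you can filter a batch-mode snapshot. The sketch below assumes top's default column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND); if you have customized the fields, the column positions will differ. The helper name essential_columns is hypothetical.

```shell
# Hypothetical helper: read `top -b` output on stdin and print only
# USER, %CPU, %MEM, and COMMAND for each process row.
essential_columns() {
    awk '
        $1 == "PID" { in_list = 1; next }    # the header row starts the process list
        in_list && NF >= 12 {
            # COMMAND may contain spaces, so rejoin fields 12 onward.
            cmd = $12
            for (i = 13; i <= NF; i++) cmd = cmd " " $i
            printf "%s %s %s %s\n", $2, $9, $10, cmd
        }'
}
```

A typical use is top -b -n 1 -c | essential_columns | head, which skips the summary lines and lists the first ten processes in top's own sort order.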

Analyzing top to Determine Resource Usage

In the following example, an analysis of the process list shows MySQL appearing first; this is a common scenario and not necessarily a bad thing as MySQL can be resource heavy on a busy server.

Another user commonly seen toward the beginning of this list is www-data, which owns the Nginx processes on a ServerPilot-managed machine.

Unless you are experiencing a load spike or other degraded server performance, the MySQL and Nginx processes are typically nothing to worry about.

When viewing top, your biggest focus should be the PHP processes of your individual apps' system users.

Look at the highlighted processes in this example:

On a ServerPilot server, the USER column will display the system user that owns the app listed in the same row. If you are on the free plan, that will always be serverpilot; if you are on a paid plan, it can be other names.

The COMMAND column should display the app's name.
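When many PHP processes are running, it can help to total CPU usage per system user rather than reading row by row. The following is a sketch only: the helper name cpu_by_user is hypothetical, and it expects "USER %CPU" pairs such as those printed by ps -eo user=,pcpu=. Note that ps reports CPU usage averaged over each process's lifetime, so its figures will not match top's per-refresh percentages exactly.

```shell
# Hypothetical helper: read "USER %CPU" pairs on stdin, sum the CPU
# percentages for each user, and print the totals from highest to lowest.
cpu_by_user() {
    awk '{ cpu[$1] += $2 }
         END { for (u in cpu) printf "%s %.1f\n", u, cpu[u] }' |
        sort -k2,2nr
}
```

On a live server, ps -eo user=,pcpu= | cpu_by_user | head -n 5 shows the five system users consuming the most CPU, which points you to the app whose logs are worth checking first.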

When you have found the system user and app using the most resources, you can check your log files to determine the cause of any load issues. Two notable logs to check are the PHP slow log and the MySQL slow log.


Last updated: July 19, 2017