Understanding swap files in Linux
To appease some of my hungrier
applications and support heftier development efforts, I recently
upgraded the memory on my system from 4 GiB to 8 GiB. That gave me
occasion to tinker around with the swapping behavior of my system and
check things out.
If you run anything close to a modern operating system, you almost certainly interact with a swap file.
You might be familiar with the basics of how these work: they allow
your OS to prioritize more frequently-used pages in main memory and
move the less frequently-used ones to disk. But there's a lot going on
underneath the covers. Here's a simple guide to the theory and practice
of swap files in Linux, and how you can tweak things for your benefit.
An abstract memory model
It'll be useful to have a mental model of the way memory works in general [1], so we'll start from a simple one here.
In general, all computers have access to physical memory,
where the actual bits are manipulated and stored for use. Most modern
operating systems present physical memory to higher-level applications
as an abstraction called virtual memory. This allows
applications to see memory as if it were a contiguous block, even
though the underlying physical memory may theoretically be taken from
many arbitrary, heterogeneous sources -- multiple memory chips, flash
memory, disk drives, and so on.
The memory manager divides virtual memory into chunks of identical size called pages.
A page is the smallest amount that the OS will allocate in response to
requests from programs. It is also the smallest unit of transfer
between main memory and any other location, such as a hard disk. The
size of a page is usually fixed by the operating system's kernel [2].
As pages are allocated to applications, they are assigned pages in the physical memory space through a special mapping called address translation. Applications don't know where they are in the physical space; they see only the pages they use.
The memory manager knows how to aggregate different backing stores
to provide the abstraction of contiguous virtual memory. By updating
the address translation mechanism so that a virtual page always points
to the correct physical page, the memory manager may move pages around
in physical memory without directly impacting applications.
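To make the address translation idea concrete, here's a toy sketch in plain shell arithmetic. It is only a model: the 4096-byte page size is the typical x86 value, while the `lookup_frame` and `translate` functions and the three-entry page table are hypothetical, invented purely for illustration.

```shell
#!/bin/sh
# Toy model of address translation: split a virtual address into a page
# number and an offset, then look the page up in a made-up page table.
PAGE_SIZE=4096

# Hypothetical mapping of virtual page number -> physical page number.
lookup_frame() {
    case "$1" in
        0) echo 7 ;;
        1) echo 2 ;;
        2) echo 9 ;;
        *) return 1 ;;   # unmapped page; a real kernel would raise a page fault
    esac
}

# Print the physical address for a virtual address, or "fault" if unmapped.
translate() {
    vaddr=$1
    vpage=$(( vaddr / PAGE_SIZE ))
    offset=$(( vaddr % PAGE_SIZE ))
    frame=$(lookup_frame "$vpage") || { echo "fault"; return 1; }
    echo $(( frame * PAGE_SIZE + offset ))
}

translate 4100   # virtual page 1, offset 4 -> physical page 2, offset 4
```

Moving a page in this model means changing one entry in `lookup_frame`; the virtual address the application uses stays exactly the same.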
Wide variation in different kinds of physical memory
Not every backing store displays the same storage and speed
characteristics. If different types of physical memory display
different characteristics, the memory manager can exploit these
differences to optimize system performance. It can make intelligent
decisions about which pages should go where.
Consider the difference between main memory and a hard drive, for example.
characteristic | on disk (absolute) | on disk (relative) | in main memory (absolute) | in main memory (relative)
---|---|---|---|---
access time | 5 ms | 1× | 10 ns | 500,000× faster
peak transfer rate | 80 MB/s | 1× | 8 GB/s | 100× faster
storage space | 100 GB | 25× larger | 4 GB | 1×
price/GB storage | $0.60/GB | 20× cheaper | $12/GB | 1×
Because storage on disk-backed file systems does indeed have
different characteristics than storage in main memory, there's
significant room to optimize depending on the characteristics of how
pages are accessed. As the table shows, hard disks are several orders
of magnitude slower and less efficient at retrieving data than main
memory. Thus, the memory manager needs to carefully balance the demand
for memory against the different kinds of supply.
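If you'd like to check the relative columns of the table yourself, the ratios fall out of simple integer arithmetic. The sketch below uses only the illustrative figures from the table (they are not measurements), and the `ratio` helper is a name made up for this example.

```shell
#!/bin/sh
# Reproduce the "relative" columns of the table with integer arithmetic.
# All inputs are the illustrative figures from the table, converted to
# common units so that integer division works.
disk_access_ns=5000000      # 5 ms
ram_access_ns=10            # 10 ns
disk_rate_mb=80             # 80 MB/s
ram_rate_mb=8000            # 8 GB/s
disk_size_gb=100
ram_size_gb=4
disk_cents_per_gb=60        # $0.60
ram_cents_per_gb=1200       # $12

# Hypothetical helper: integer quotient of two numbers.
ratio() { echo $(( $1 / $2 )); }

echo "access time:   RAM is $(ratio "$disk_access_ns" "$ram_access_ns")x faster"
echo "transfer rate: RAM is $(ratio "$ram_rate_mb" "$disk_rate_mb")x faster"
echo "storage space: disk is $(ratio "$disk_size_gb" "$ram_size_gb")x larger"
echo "price per GB:  disk is $(ratio "$ram_cents_per_gb" "$disk_cents_per_gb")x cheaper"
```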
Flavors of pages
If all pages were of the same kind, deciding which pages should go
where would be a simpler decision. For example, one strategy is to make
the access time quickest for the pages that are accessed the most
frequently. Unfortunately, it's not just the types of memory that are
heterogeneous, but also the contents of the pages that are stored
there. On Linux, there are four different kinds of pages.
- Kernel pages. Pages holding the program contents of the
kernel itself. Unlike the other flavors, these are fixed in memory once
the operating system has loaded and are never moved.
- Program pages. Pages storing the contents of programs and libraries. These are read-only, so no updates to disk are needed.
- File-backed pages. Pages storing the contents of files on
disk. If this page has been changed in memory (for example, if it's a
document you're working on), it will eventually need to be written out
to disk to synchronize the changes.
- Anonymous pages. Pages not backed by anything on disk.
When a program requests memory be allocated to perform computations or
record information, the information resides in anonymous pages.
When the in-memory version of a page is the same as the one on disk, we say that the page is clean;
the contents are the same. But sometimes the contents of a page have
been updated since the last time they were read. When this happens, the
page becomes dirty.
A clean page can be repurposed for something else easily; no updates
need to be made, and the page can simply be recycled. But a dirty page
has to be written back to disk before it can be used again. For file
pages, this is an expensive operation, so the kernel tries to avoid the
overhead of flushing back to disk when it can.
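You can get a rough feel for how much dirty data your system is carrying from the `Dirty` and `Writeback` fields of /proc/meminfo. Here's a minimal sketch; the `meminfo_field` helper is a name invented for this example, and the sample numbers are made up so that the snippet runs anywhere.

```shell
#!/bin/sh
# Print the value (in kB) of one field from /proc/meminfo-style input
# read from standard input.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Sample input so the sketch runs on any machine; the numbers are invented.
sample='MemTotal:        8000000 kB
Dirty:              1234 kB
Writeback:             0 kB'

printf '%s\n' "$sample" | meminfo_field Dirty

# On a live Linux system you would read the real file instead:
#   meminfo_field Dirty < /proc/meminfo
```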
For anonymous pages, there's a different problem. Effectively,
they're always dirty: the very act of creating the anonymous page means
that there is now data in memory that isn't on disk. If the
kernel wants to use anonymous pages for something else, it must first
reclaim them. But anonymous pages have no files to back them. How can
you flush something back to disk when there's nowhere to flush it to?
Swap files
The use of swap can resolve many of these issues. Swap is a disk-backed area that's treated as an extension of main memory. It serves as a holding area for pages that have been evicted by the kernel. Let's use an illustrative example to show how swap files help make memory work better.
When a moderately loaded system gets additional requests for memory,
the kernel generally draws from the pool of free pages first to fulfill
these requests. If there are few free pages remaining, the kernel tries
to reclaim clean pages to make room for the new requests.
If the clean pages also become depleted, the kernel is forced to
clean a dirty page by writing it back to disk, and only then reclaim it. This is an expensive operation.
For this reason, the kernel tries to maintain at least some clean pages
all the time.
When most in-use pages are anonymous or dirty and the
number of clean pages is low, the kernel is running out of memory.
Without swap, this situation will require a number of costly disk
writes. Consider a request for allocation when a number of dirty pages
are already present.
In the figure above, a request for two anonymous pages has come in.
There are no more unused pages, so the kernel must drop one of the
existing pages to satisfy the request. The kernel can use one page
freely: the single clean file page in slot 6.
But to allocate the second page, the kernel now has to flush one of
the dirty file pages (in slots 1, 3, or 4) back to disk to make room.
It cannot move the page in slot 5 anywhere, because it is anonymous and
has no backing store; there's nowhere else to put it. Even if this page
has not been used in a very long time, it must still occupy space in
memory until the process using it has released the page.
Without swap, the kernel gets boxed into this unfortunate corner more easily.
With swap, however, the kernel gets an additional tool to use in its
arsenal. Instead of being forced to clean one of the dirty pages, it
can instead evict one of the anonymous pages to the swap region.
As in the earlier non-swap scenario, the kernel can use the
clean page in slot 6 for the first requested page. It is allocated and
the clean page is dropped.
For the second requested page, the kernel no longer has to clean a
dirty page to make room. Instead, it can simply flush one of the
anonymous pages to the swap region. The code required to do this is
generally very simple and significantly less complex than cleaning a
dirty page, and the kernel prefers swapping to cleaning dirty
file-backed pages.
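The order of preference described in this section can be sketched as a toy shell function. This is a deliberately simplified model of the article's explanation, not the kernel's actual algorithm (which also weighs how recently pages were used). The `pick_victim` function and the slot layouts passed to it are hypothetical.

```shell
#!/bin/sh
# Toy model of the reclaim preference described above. The first argument
# is "yes" or "no" (is swap available?). Every following argument is the
# state of one page slot: clean, dirty, or anon.
# Prints "<slot> <action>" for the page this model would reclaim next.
pick_victim() {
    swap=$1; shift
    for want in clean anon dirty; do
        # Without swap, anonymous pages cannot be evicted at all.
        [ "$want" = anon ] && [ "$swap" != yes ] && continue
        slot=0
        for state in "$@"; do
            slot=$(( slot + 1 ))
            if [ "$state" = "$want" ]; then
                case "$want" in
                    clean) echo "$slot drop" ;;
                    anon)  echo "$slot swap-out" ;;
                    dirty) echo "$slot write-back" ;;
                esac
                return 0
            fi
        done
    done
    echo "none out-of-memory"
    return 1
}

# A clean page is always the cheapest choice.
pick_victim no  dirty anon dirty dirty anon clean   # prints "6 drop"
# No clean pages and no swap: a dirty page must be written back.
pick_victim no  dirty anon dirty dirty anon anon    # prints "1 write-back"
# No clean pages, but swap exists: an anonymous page is swapped out.
pick_victim yes dirty anon dirty dirty anon anon    # prints "2 swap-out"
```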
Optimizing your swap settings
Linux provides a number of ways to interact with your swap. Two are detailed here:
- Aggressiveness of swapping
- Adding and removing additional swap containers
Controlling aggressiveness of swapping
The more aggressively the kernel swaps, the more efficiently
existing memory can be put to use. Pages that look like they're not
being used will be swapped out rapidly. If the kernel swaps too often,
though, applications that were using those pages will take longer to
become responsive again as the kernel swaps their memory back into main
memory.
For a desktop user, responsiveness of applications can be important,
so an aggressive swap may not be desirable, even if it results in less
efficient use of memory. For servers and other non-interactive systems,
more aggressive swapping may be appropriate and acceptable.
On Linux, this careful balancing act can be configured to meet your
personal preferences. The kernel swaps out pages with a zealousness
controlled by a swappiness setting.
Swappiness is an integer that ranges from 0 to 100 (Linux 5.8 and later also accept values up to 200), and indicates
the degree to which the kernel favors swap space over main memory.
Higher swappiness means that the kernel will move things to swap more
frequently. Lower swappiness means that the kernel tries to avoid using
swap. A swappiness of zero causes the kernel to avoid swap for as long
as possible.
Ubuntu and several other Linux distributions have a default swappiness of 60. You can check your swap setting by reading a /proc/sys value:
$ cat /proc/sys/vm/swappiness
60
To temporarily modify your swappiness, set a new value with sysctl:
$ sudo sysctl vm.swappiness=40
vm.swappiness = 40
This setting lasts until you reboot or change it again with another sysctl vm.swappiness invocation. To make the setting persist across reboots, edit your /etc/sysctl.conf configuration file.
$ sudoedit /etc/sysctl.conf
Find the vm.swappiness line; if none exists, add it.
vm.swappiness = 40
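If you script this, it's worth checking the value before handing it to sysctl. The sketch below validates against the traditional 0 to 100 range; both function names are hypothetical, and `swappiness_command` only prints the command rather than running it, so the snippet is safe to try without root.

```shell
#!/bin/sh
# Succeed if the argument is a non-negative integer no greater than 100.
valid_swappiness() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -le 100 ]
}

# Hypothetical helper: print (but do not run) the sysctl command for a
# validated swappiness value.
swappiness_command() {
    if valid_swappiness "$1"; then
        echo "sudo sysctl vm.swappiness=$1"
    else
        echo "invalid swappiness: $1" >&2
        return 1
    fi
}

swappiness_command 40
```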
Adding swap containers
Modern operating systems generally have either a swap partition or a swap file.
In a swap partition, part of the hard drive is sliced off and becomes
dedicated to swap. A swap file is just an ordinary file that holds up
to its file size in swapped pages.
A swap file is considerably simpler to set up than a swap partition. There is no speed difference between the two [3],
so swap files are favorable in this respect. However, if you want to be
able to hibernate or suspend your computer, using a swap partition is
required in some cases. (These suspend/hibernate managers usually
cannot handle writing to an active file system.)
Making a new swap file is a simple process. In this example, we'll
make a 2 GiB swap file and make it available to the system as
additional swap space. We'll use primary.swap as the name
of the example swap file, but there is nothing special about the name
of the file or its extension. You may use anything you wish.
First, we need to create the swap file itself. We'll use a stream of zeroes as the input source (if=/dev/zero), and write it out to a file named primary.swap in the /mnt directory (of=/mnt/primary.swap). We will write 2048 (count=2048) blocks each 1 MiB in size (bs=1M). Depending on the speed of your hard disks, this may take a little while.
$ sudo dd if=/dev/zero of=/mnt/primary.swap bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 30.1085 s, 71.3 MB/s
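The byte count that dd reports is simply the block size multiplied by the block count, which is easy to confirm with shell arithmetic. The commented-out chmod at the end is not part of the walkthrough above; it's a commonly recommended extra step, since a swap file readable by other users could expose swapped-out memory.

```shell
#!/bin/sh
# Size of the file dd writes: block size times block count.
bs_bytes=$(( 1024 * 1024 ))   # bs=1M
count=2048                    # count=2048
total_bytes=$(( bs_bytes * count ))
echo "$total_bytes bytes = $(( total_bytes / 1024 / 1024 / 1024 )) GiB"

# Optional hardening step (an addition, not from the original walkthrough):
# restrict the swap file to root before formatting it.
#   sudo chmod 600 /mnt/primary.swap
```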
Next, we need to format this file and prepare it for use as a swapping space. The mkswap utility sets up a swap area on a device or file.
$ sudo mkswap /mnt/primary.swap
Setting up swapspace version 1, size = 2097148 KiB
no label, UUID=7be2b3b6-83b0-4afd-8537-197cf12f8c59
After formatting it, the swap can now be added to our system. Use the swapon utility to activate the swap region.
$ sudo swapon /mnt/primary.swap
You can verify that your swap space is now 2 GiB larger.
$ grep SwapTotal /proc/meminfo
SwapTotal: 2097144 kB
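If you have more than one swap area, /proc/swaps lists each of them with its size in KiB. Here's a small sketch that totals the Size column; the `total_swap_kib` function is a name made up for this example, and the sample listing is invented so that the snippet runs on any machine.

```shell
#!/bin/sh
# Sum the Size column (third field, in KiB) of /proc/swaps-style input,
# skipping the header line.
total_swap_kib() {
    awk 'NR > 1 { total += $3 } END { print total + 0 }'
}

# Invented sample in the layout of /proc/swaps.
sample='Filename            Type       Size     Used  Priority
/dev/sda5           partition  2097148  0     -2
/mnt/primary.swap   file       2097148  0     -3'

printf '%s\n' "$sample" | total_swap_kib

# On a live Linux system:
#   total_swap_kib < /proc/swaps
```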
Your changes will be lost at reboot, so if you want to make them permanent you'll need to edit your filesystem table in /etc/fstab.
$ sudoedit /etc/fstab
Now add your swap file to the list of filesystems to mount at boot by appending a line to the file.
/mnt/primary.swap none swap sw 0 0
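Before appending, you may want to make sure the swap file isn't already listed, since a duplicate entry is easy to add by accident. A minimal sketch, assuming the usual whitespace-separated fstab layout; the `in_fstab` function is hypothetical and reads its input from standard input so you can try it on sample text first.

```shell
#!/bin/sh
# Succeed if the given path appears as the first field of a line in
# fstab-style input. Commented-out lines never match, because their
# first field starts with '#'.
in_fstab() {
    awk -v path="$1" '$1 == path { found = 1 } END { exit !found }'
}

# Invented sample in the layout of /etc/fstab.
sample='# <file system> <mount point> <type> <options> <dump> <pass>
/mnt/primary.swap none swap sw 0 0'

if printf '%s\n' "$sample" | in_fstab /mnt/primary.swap; then
    echo "already listed"
else
    echo "not listed yet"
fi

# Against the real file:
#   in_fstab /mnt/primary.swap < /etc/fstab
```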
Removing a swap file
Removal works much the same way, but in reverse. If you've added your swap to the /etc/fstab list, you need to remove it from there first.
To disable your running swaps, run the swapoff utility. You can either specify the swap you'd like to disable, or use the -a parameter to disable all of them.
$ sudo swapoff /mnt/primary.swap
When you disable swap, you force the kernel to clean every page on
the swap and/or push it back to main memory. If there is not enough
space to squeeze everything in, you may receive out of memory errors
from the kernel, so use this judiciously.
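One rough way to judge whether swapoff is safe is to compare the amount currently held in swap (SwapTotal minus SwapFree) against the kernel's MemAvailable estimate in /proc/meminfo. The sketch below is only a heuristic, since MemAvailable is itself an estimate; the `swapoff_looks_safe` function is a hypothetical name, and the sample numbers are invented.

```shell
#!/bin/sh
# Read /proc/meminfo-style input and succeed only if the memory reported
# as available exceeds the amount currently held in swap.
swapoff_looks_safe() {
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { free  = $2 }
        END { exit !(avail > total - free) }
    '
}

# Invented sample: 500000 kB in swap, 3000000 kB available.
sample='MemAvailable:    3000000 kB
SwapTotal:       2097144 kB
SwapFree:        1597144 kB'

if printf '%s\n' "$sample" | swapoff_looks_safe; then
    echo "enough headroom to run swapoff"
else
    echo "swapoff may trigger out-of-memory errors"
fi

# On a live Linux system:
#   swapoff_looks_safe < /proc/meminfo && sudo swapoff /mnt/primary.swap
```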
Conclusion
Swap files are an essential part of the memory-management modules of
operating systems. In Linux, adding and removing swap partitions and
files is simple, and you can control how the kernel interacts with swap
through configurable parameters. Through the use of these and other
techniques, and with an understanding of the basics of swap, you can
tweak your system's use of memory to your heart's content.
[1] As this is only a model, we will naturally be leaving out some of the important but messier details.
[2] On Linux, you can check your page size with the getconf command, specifying the PAGE_SIZE parameter. This returns the size of a page in bytes. 4 KiB (4,096 bytes) is a typical result on the x86 and x86_64 architectures, so there are 256 pages per MiB.
$ getconf PAGE_SIZE
4096