Sometimes, swap space still matters

15 07 2011
Most modern, general-purpose operating systems (OS) come with a full-fledged virtual memory (VM) system that creates the illusion of having more memory than is physically installed in the machine. Whether a given piece of virtual memory is backed by real RAM, by disk swap, or by no physical device at all is up to the OS.

This way, you can reserve (the malloc() man page says allocate, but once you know how it works behind the scenes, reserve sounds better) 2 GB of memory on a machine with only 1 GB of RAM and no swap. Of course, if your application actually tries to use that much memory, the OS will have a hard time making room for it (probably reclaiming file system caches and buffers first), and at some point it will either kill the process (as the Linux OOM killer does) or refuse to provide more pages to that process's address space, causing it to die with a segmentation fault.

This ability to let processes reserve more memory than is physically available is possible thanks to on-demand paging. This means that physical resources (in this case, memory pages) are not consumed by a process until it actually accesses them:

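#include <stdlib.h>

#define SIZE ((size_t)100 * 1024 * 1024)

int main(void)
{
    size_t i;
    char *buf;

    /* Reserve 100 MB of virtual memory */
    buf = (char *) malloc(SIZE);
    if (buf == NULL)
        return 1;

    /* At this point, memory usage on the OS shouldn't have changed considerably */

    /* Touch every byte, so the OS has to back each page with real memory */
    for (i = 0; i < SIZE; ++i) {
        buf[i] = 'A';
    }

    /* Now, free memory in the OS should have decreased by 100 MB */

    free(buf);
    return 0;
}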

Many applications take advantage of this feature and reserve more memory than they ever use during their lifetime (whether this is proper behavior is beyond the scope of this article). You can check this on your own UNIX system by comparing the VIRT and RSS (or RES) columns of the top utility.


So, how much memory can be reserved?

This depends entirely on the OS you're running, but most of them calculate a "safe" value from the amount of RAM and the size of the disk swap. This is the case for Solaris, which goes even further and strictly limits the amount of memory that can be reserved to the sum of RAM and swap minus 1/8 of the RAM.


What could happen if you configure a low amount of swap space? (A real world example and the motivation of this article)

This morning, on one of our OpenSolaris servers, we started receiving messages like this one: "WARNING: Sorry, no swap space to grow stack for pid ...". In vmstat, the swap column (a misleading name, since it shows the amount of virtual memory still available to be reserved, not something strictly related to physical swap space) showed pretty low numbers (under 50 MB), while the free column was telling us that there were over 14 GB of physical RAM available. How can this be possible?

This machine has 32 GB of RAM and only 512 MB of swap (the default size in OpenSolaris, my mistake). This means that the total amount of virtual memory available to be reserved is somewhere around 30 GB. The machine provides CIFS service to the network with Samba, so there are lots of "smbd" processes running on it, and each process usually has a VIRT size of 40 MB and an RSS of 20 MB. With 1000 processes, the required amount of physical memory would be a little under 20 GB (it fits in RAM), but the amount of virtual memory is close to 40 GB.

This way, when our system reached the limit of virtual memory (over 30 GB), it still had more than 10 GB of real RAM available, but the OS refused to allow more reservations. And since there are still plenty of free pages, the kernel doesn't try to rebalance the size of its buffers, leaving you with an exhausted system that has lots of free memory.

So, keep in mind: Even with lots of RAM installed in your server, sometimes, swap space still matters.





Comments

20 07 2011
#1 Weng Fu
Could you please provide some Visual Basic examples for this concept? I really like the concept but I think it would appeal to more people if you used a programming language that many people are familiar with. The Solaris language is not so popular.
20 07 2011
#1.1 Robert Sheets
The example is written in C, one of the most popular programming languages of all time. Solaris is an operating system, not a computer language.
20 07 2011
#2 bp
Informative article, thank you.
20 07 2011
#3 anonymous
your code is full of errors :-)
first you didn't include stdlib.h for malloc
second (char *) at malloc is obsolete and shall not be used
third you cannot do free, since you've modified buf pointer
have you ever tried to run it ???

FTFY:

#include <stdlib.h>

int
main()
{
    size_t sz = 100 * 1024 * 1024;
    char *p, *s;

    s = p = malloc(sz);

    if (!p)
        return -1;

    while (sz-- > 0)
        *s++ = 'A';

    free(p);

    return 0;
}
20 07 2011
#3.1 Sergio Lopez
1.- I've omitted all the headers. Providing a full ready_to_compile example wasn't my intent.

2.- Type casting the pointer returned by a malloc() can be considered obsolete, but it's still perfectly valid.

3.- You're right, thanks for pointing it out.

On the other side, the while loop in your example, while valid, hurts readability, which is precisely the most important thing in a code example.
20 07 2011
#3.1.1 Steinar H. Gunderson
If you want readability, stop mucking around with pointer arithmetic. The compiler is perfectly able to do it itself if there is a win (but most likely, it will just convert your example back to indexed arithmetic). Ie.:

for (i=0; i
20 07 2011
#3.1.2 Steinar H. Gunderson
(Gah, comment system messing up my text; hoping it's better this time)

If you want readability, stop mucking around with pointer arithmetic. The compiler is perfectly able to do it itself if there is a win (but most likely, it will just convert your example back to indexed arithmetic). Ie.:

for (i=0; i < SIZE; ++i) {
    buf[i] = 'A';
}

or, of course:

memset(buf, 'A', SIZE);

/* Steinar */
20 07 2011
#3.1.2.1 Sergio Lopez
You're right. Thanks for your suggestion.
20 07 2011
#3.1.2.1.1 Steinar H. Gunderson
Note that now you don't need orig_buf anymore :-)

/* Steinar */
20 07 2011
#3.1.2.1.1.1 Sergio Lopez
Ouch! This is what happens when you try to pay attention to 10 things at a time. Thanks again!
