Lessons I should have learned, Episode 3: Hot swapping binaries
About a year ago I was having a discussion with my friend Crutcher when he suggested that one could hot-swap versions of a running program. This post describes my implementation of just such a thing.
Why would you hot-swap? One of the major benefits of hot-swapping is that the new version of the program has access to all of the old version's file descriptors. This means that any files, sockets, or pipes the previous version had open can stay open. For example, if you were careful, you could hot-swap an application that was in the middle of serving a very large file to a user without the user noticing that anything had happened.
Before we begin discussing how this should work, let's look at some of the problems. Sure we get file descriptors, but what about all of our state? Well, this is a problem. We cannot easily take our state with us. Since we're updating versions here, you want to be particularly sure you only take state that you need. To do this, I would recommend using a serialization library for C. A bit of Googling showed me this one, TPL, though I haven't tried it yet. For our example here, we'll just manually move the only pieces of state we care about: a counter and a file descriptor to a file we're currently writing to.
The basic idea is that we create a pipe (a pair of file descriptors, one for reading and one for writing) and then fork(). The child process keeps the write end of the pipe and the parent keeps the read end. Now, the parent will exec. This is a bit odd. Normally when you fork and then exec, it's the child process that does the exec. Here, however, we really want the new version of the program to have access to all of the old file descriptors. Luckily, execl() preserves open descriptors (unless they are marked close-on-exec). As an added benefit, the program keeps the exact same process ID.
So, let's look at the important bits of the hot-swap (reader and writer are the file descriptors for the pipe):
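    unsigned int outputFD = fileno(outputFile);

    if (fork()) {
        /* I am the parent: I become the new binary. */
        char readBuf[20] = {0};
        close(writer);                   /* the parent never writes to the pipe */
        sprintf(readBuf, "%d", reader);  /* pass the read end's number as text */
        /* By convention the first argument after the path is argv[0]. */
        execl("./newbinary", "./newbinary", "--hotswapping", readBuf, (char*)0);
        perror("execl");                 /* only reached if the exec failed */
        exit(1);
    } else {
        /* I am the child: I serialize the state into the pipe. */
        FILE *outputStream = fdopen(writer, "w");
        close(reader);
        fprintf(outputStream, "%d\n", i);
        fprintf(outputStream, "%u\n", (unsigned int) outputFD);
        fclose(outputStream);
        exit(0);
    }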
First, let's look at what the parent process does. It closes its "writer", since it will never need to write to the pipe, and then it execs "newbinary", which is the new version of the program. It does so with a flag, "--hotswapping". This flag indicates that another parameter follows: the file descriptor for the "read" end of the pipe we created. We do this so that the new binary can read the state that the old binary serializes across the pipe.
Now, onto the child process. Its fdopen() call creates a FILE stream from the file descriptor for the "write" end of the pipe. Why? Because I'm lazy and I prefer working with fprintf() over write(). Now that we have this stream, we can fprintf() directly to it and serialize the state we want. In this contrived case the only state I care about is my counter variable and the file descriptor of my output file. The first line of the snippet gets that descriptor from the output stream using int fileno(FILE *).
So, to recap: we fork(), the parent execs the new version of the binary, and the child writes any relevant state to a pipe that the new binary reads from.
Now, let's look at what has to exist in the new binary. The new binary must recognize the argument "--hotswapping", which is followed by the file descriptor of the "read" end of the pipe. The following, in newversion.c, does just this:
    for (i = 0; i < argc; i++) {
        /* The flag must be followed by the pipe's read descriptor. */
        if (!strcmp(argv[i], "--hotswapping") && i + 1 < argc) {
            int reader = atoi(argv[++i]);
            inputStream = fdopen(reader, "r");
        }
    }
Notice that the fdopen() call does something interesting. It converts the file descriptor back into a FILE stream using fdopen(int, const char *). Thus we can use this pipe just as if we were reading from a file, and we can use fscanf() to read from it instead of having to worry about read() and buffers. This is done as follows:
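    /* Read the state back in the same order the old binary wrote it. */
    fscanf(inputStream, "%d", &i);
    fscanf(inputStream, "%u", &outputFD);
    fclose(inputStream);
    /* Re-wrap the inherited descriptor so we can keep writing to the file. */
    outputFile = fdopen(outputFD, "w");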
Once again, with the final fdopen() call, we turn the file descriptor we read from the pipe back into a FILE stream. Now we can resume writing to it, just as we did before. Because the descriptor keeps its file offset across the exec, output continues from where the old binary left off.
The files which implement this are available as gists here:
The output when run for 11 seconds is:
    gcc -Wall -pedantic -o newbinary newversion.c
    gcc -Wall -pedantic -o example original.c
    ./example
    Original: 1 My PID=27272
    Original: 2 My PID=27272
    Original: 3 My PID=27272
    Original: 4 My PID=27272
    Original: 5 My PID=27272
    New Binary: 6 My PID=27272
    New Binary: 7 My PID=27272
    New Binary: 8 My PID=27272
    New Binary: 9 My PID=27272
    New Binary: 10 My PID=27272
    New Binary: 11 My PID=27272
February 22nd, 2010 - 15:39
LOL man, You blog faster than I can read.
Finish your dissertation first! I finished mine and look how well-off I am now!
February 22nd, 2010 - 17:08
Very cool idea.
One idea is that if you don’t have a lot of state to transfer, you can pass it all on the command line and skip the fork.
February 22nd, 2010 - 18:09
You forgot to make the parent waitpid() on the child’s PID returned by fork. Failure to do so makes the child a zombie until the parent exits, I believe.
February 22nd, 2010 - 19:12
I’ve been doing this for my IRC bot. Good to see it get some attention; it’s a sweet technique.
February 22nd, 2010 - 20:23
Got a link to the source? I’d like to take a look at a practical implementation.
February 26th, 2010 - 15:24
I know that WeeChat ( http://www.weechat.org/ ) has hot-swapping, which I got to use the other day when 0.3.1 was released. No idea where in the source that’s located, though.
March 9th, 2010 - 13:31
Belated reply!
http://dsource.org/projects/scrapple/browser/trunk/idc/idc.d#L76
February 24th, 2010 - 03:10
This is actually a very old trick. It is used by ‘init’ which by convention is always PID 1. If you run “telinit u” as root then init will re-exec itself and use the pipe trick to maintain its state.
This is useful for example when you need to update a library like libc. Until all the open file handles to the old libc are closed, the old version still stays around in memory and on disk, so if init weren’t able to do this then you’d have to reboot to upgrade libc because init never terminates and thus it would forever keep the old version of libc around on disk (even though it is unlinked from its directory.)
March 11th, 2010 - 23:04
You should be able to combine this with an mmap’ed block of state to share gobs of state without it having to pass through the “eye of the needle” of a file handle. If you can map it/them at a fixed address, they can even hold pointers to other data within the blocks.