April 21, 2010

Lessons I should have learned, Episode 4: Coroutines in C

Filed under: Lessons I should have learned,Programming — Tags: , , — Nathan @ 2:19 pm

A while ago, I read this article on coroutines in C. One of the main downfalls of this technique is that since it uses the static keyword, it makes it much harder to write recursive coroutines, and nigh-impossible to have multiple coroutines using the same function.

Examples

Let’s first look at an example of how it can work. One way of efficiently computing Fibonacci numbers in a language like Haskell is to construct an infinite list of the Fibonacci values.

fibs a b = a : fibs b (a+b)

Since Haskell is lazily evaluated, such an infinite list is fine until you attempt to compute the whole thing. So, as long as you just say something like take 10 $ fibs 1 1 it will only compute how many you need.

One important use of Coroutines is as iterators. Consider the Haskell example from above. If we assume that we have such an infinite list of values, we may want to iterate over the list, inspecting the values as we go. The following C code achieves this using some helper macros we’ll describe below.

int fibs(int a, int b, CoroutineState *s) {
  initializeCoroutineState(s, FibState, mystate);

  mystate->a = a;
  mystate->b = b;
  yield(s,mystate->a);
  recur (s,fibs(mystate->b, mystate->a + mystate->b, s->next));

  finalizeCoroutine(s);
  return 0;
}

Now, if we wanted to print out the first 10 of these values:

s = createCoroutineState();
for(i = 0; i < 10; i++) {
  printf("%d ", fibs(1,1,s));
}

The yield method is similar to Python's yield key word. It returns a value, but also records where the yield was called so that the next time the function is executed it can begin again from that point.

Another example of how coroutines can be used as iterators is as iterators over a binary tree. Consider the following tree:

       4
  2         6
1   3     5    7 

A standard way to traverse this tree is a post-order traversal. Consider the following code which traverses this tree and prints it out:

void print(Tree *t) {
  if(t->left)
    print(t->left);
  if(t->right)
    print(t->right);
  printf("%d ", t->value);
}

Building a post-order iterator is slightly more complex. It involves creating a stack and pushing wrapped tree-nodes onto the stack. When one is popped, it is marked as 'visited' and then pushed onto the stack, followed by its right child then left child. However, we can use coroutines to achieve the same thing while staying true to the original traversal code:

int treePostOrderIterator(Tree *t, CoroutineState *s) {
  initializeCoroutineState(s,TreeIteratorState, mystate);
  mystate->tree = t;

  if(mystate->tree->left)
    recur (s, treePostOrderIterator(mystate->tree->left, s->next));

  if(mystate->tree->right)
    recur (s, treePostOrderIterator(mystate->tree->right, s->next));

  yield (s,mystate->tree->value);

  finalizeCoroutine(s);
  return 0;
}

How it works

When I was learning to program, I learned that switch statements were just conditional structures that were equivalent to a series of if statements. While you can almost always get away with thinking about them this way, they aren't actually. Instead, think of switch statements as parameterized gotos. One of the less objectionable uses of goto is as a means of breaking out of nested loops. So, obviously goto can jump out of scopes. In fact, it can be used to jump into and out of any scope inside of a single function. The same is true of the cases of a switch statement. This is what makes Duff's device possible.

In order to facilitate recursion and having multiple coroutines running at once, we need to pass a mutable state parameter. This is an object of type struct CoroutineState.

typedef struct CoroutineState CoroutineState;
struct CoroutineState {
  void *state;            /* Pointer to a struct representing the internal state
                             of the coroutine.  We need this since we have to
                             reinstate the scope whenever the function is
                             re-intered.                                      */
  long current;           /* The line number of where the coroutine should
                             return.  Used in 'yield' and 'recur'             */
  unsigned char done;     /* A marker to show that a recurrence is completed  */
  CoroutineState *next;   /* The next in the stack of states                  */
  CoroutineState *parent; /* NULL if actually the parent, pointer otherwise   */
  CoroutineState *tail;   /* Only the parent is guaranteed to know the tail   */
};

Basically this struct stores everything needed to reinstate the coroutine when it is next called. This includes the line number it should begin executing on, a pointer to the user's saved values, and a stack containing the recursive calls.

Every coroutine begins with an invocation of the initializeCoroutineState macro. This is a macro that takes three parameters. The first is a CoroutineState object, the second is the type defining the user's saved values, and the last is the variable name to use to refer to this state.

#define initializeCoroutineState(coroutineState, UserStateType, userStateVar) \
  UserStateType *userStateVar = (UserStateType*) coroutineState->state;       \
  switch(coroutineState->current) {                                           \
  case 0:                                                                     \
  if(coroutineState->state == 0) {                                            \
    coroutineState->state = (void*) malloc(sizeof (UserStateType));           \
    bzero(coroutineState->state, sizeof(UserStateType));                      \
  }                                                                           \
  userStateVar = (UserStateType*) coroutineState->state;                      \

Note that there is a switch statement embedded here. This switch statement jumps to the appropriate line in the execution.

Every coroutine ends with finalizeCoroutine(s) which indicates that after this point the execution of the function should proceed normally. It also says that when this function returns that the coroutine is complete and should be removed from the stack. This is indicated internally with s->done =1;.

#define finalizeCoroutine(s)                                                  \
  }                                                                           \
  s->done = 1;                                                                \
  free(s->next);                                                              \
  s->next = 0;

Now, to the meat of the matter. yield needs to return the value the user wants to yield, but also record where this occurred so that the switch statement from above will jump to here. It also needs to reinstate any variables that the user had in scope. Unfortunately there's no good way of doing that in C, so we have to do it manually. Thus the need for the helper struct. Before the return in the following macro, we're just manipulating the stack so that we can record where we are. Notice coroutineState->current = __LINE__;. __LINE__ is a macro that evaluates to the current line number in the program. So, here we're recording the line number and then after the return, we have case __LINE__:. This will cause the coroutine to begin executing after the return the next time it's called.

#define yield(coroutineState, value)                                          \
  do{                                                                         \
    CoroutineState *parent = coroutineState ? coroutineState                  \
        : coroutineState->parent;                                             \
    coroutineState->current = __LINE__;                                       \
    parent->tail = coroutineState;                                            \
    free(coroutineState->next);                                               \
    coroutineState->next = 0;                                                 \
    return value;                                                             \
    case __LINE__: 1;                                                         \
  } while(0)                                                             

The last macro is perhaps the most interesting. It's the one that allows things like the infinite list iterator and the binary tree iterator. Basically, there are times when you want to have a recursive call and have the coroutine state stored along with having a return value bubble up.

The while loop is responsible for executing until the coroutine called below it is completed.

#define recur(coroutineState,func)                                            \
  do{                                                                         \
    CoroutineState *parent = coroutineState ? coroutineState                  \
        : coroutineState->parent;                                             \
    coroutineState->next = createCoroutineState();                            \
    coroutineState->next->parent = parent;                                    \
    parent->tail = coroutineState->next;                                      \
    while(!coroutineState->next->done) {                                      \
      CoroutineState *parent = coroutineState ? coroutineState                \
          : coroutineState->parent;                                           \
      typeof(func) tmp =  func;                                               \
      coroutineState->current = __LINE__;                                     \
      if(!parent->tail->done)                                                 \
        return tmp;                                                           \
      case __LINE__: 1;                                                       \
    }                                                                         \
  } while(0)                                                                  \

Using it

In order to get this to work, you have to do one extra thing. You need to create a struct which you use to store your state rather than use local-variables in your functions. This is necessary since we'll be re-entering the functions later and that state would be lost. So, just put all the variables you would want to use into the struct. Then use this struct in the initializeCoroutineState and then whenever you use a local variable.

If you would normally return a computed value (not the result of a recursive call) then just say yield (value); If you are doing a recursive call and intend for the coroutine to proceed down the recurrence, just wrap it in a recur(s, yourRecursiveCall()).

Get it

I've posted the implementation as a library on GitHub: http://github.com/nathanwiegand/coroutines.

February 22, 2010

Lessons I should have learned, Episode 3: Hot swapping binaries

About a year ago I was having a discussion with my friend Crutcher when he suggested that one could hot-swap versions of a running program. This post describes my implementation of just such a thing.

Why would you hot-swap? One of the major benefits of hotswapping is that the new version of the program will have access to all of the old version’s file-descriptors. This means that any files, sockets, or pipes that the previous version currently had open can still be open. For example, if one was careful, you could hotswap an application that was in the middle of serving a very large file to a user without him being aware that anything happened.

Before we begin discussing how this should work, let’s look at some of the problems. Sure we get file descriptors, but what about all of our state? Well, this is a problem. We cannot easily take our state with us. Since we’re updating versions here, you want to be particularly sure you only take state that you need. To do this, I would recommend using a serialization library for C. A bit of Googling showed me this one, TPL, though I haven’t tried it yet. For our example here, we’ll just manually move the only pieces of state we care about: a counter and a file descriptor to a file we’re currently writing to.

The basic idea of what’s going to happen is that we will create a pair of pipes and then fork(). The child process will hold the pipe that does the writing and the parent the one that does the reading. Now, the parent will exec. This is a bit odd. Normally when you fork, then exec, it’s the child process which does the exec. However, here we really want the new version of the program to have access to all of the old file descriptors. Luckily, execl preserves these. As an added benefit, the program gets the exact same process ID.

So, let’s look at the important bits of the hot-swap (reader and writer are the file descriptors for the pipe):

unsigned int outputFD = fileno(outputFile);

if(fork()) {
  /* I am the parent. */
  char readBuf[20] = {0};

  close(writer);
  sprintf(readBuf, "%d", reader);

  execl("./newbinary", "--hotswapping", readBuf, (char*)0);
  exit(0);
} else {
  /* I am the child.*/
  FILE *outputStream = fdopen(writer, "w");
  close(reader);

  fprintf(outputStream, "%d\n", i);
  fprintf(outputStream, "%u\n", (unsigned int) outputFD);
  fclose(outputStream);
  exit(0);
}

First, let’s look at what the parent process does. It simply closes its “writer” since it will never need to write to the pipe then it execs “newbinary” which is the new version of the program. It does so with a flag “–hotswapping”. This flag indicates another parameter will follow which is the file descriptor for the “read” end of the pipe we created. We do this so that the new binary can then get the state serialized across the pipe from the old binary.

Now, onto the child process. Line 32 creates a file handler from the file descriptor which is the “write” end of the pipe. Why? Because I’m lazy and I would prefer to work with fprintf() to write(). Now that we have this file handler, we can fprintf() directly to it and serialize the state we want. In this contrived case the only state I care about is my counter variable and the file descriptor of my output file. Line 19 gives me the descriptor from the handler using int fileno(FILE *).

So, to recap, we fork() then the parent exec’s to the new version of the binary and the child writes any relevant state to a pipe which the new binary is listening to.

Now, let’s look at what has to exist in the new binary. The new binary must recognize the argument “–hotswapping” which passes along the file descriptor of the “read” pipe. The following, in newversion.c does just this:

for(i = 0; i < argc; i++) {
  if(!strcmp(argv[i], "--hotswapping")) {
    int reader = atoi(argv[++i]);
    inputStream = fdopen(reader, "r");
  }
}

Notice that Line 17 does something interesting. It converts the file descriptor back to a file handler using fdopen(int, char*). Thus we can use this pipe just like we were reading from a file. So, now we can use fscanf to read from the pipe instead of having to worry about read() and buffers. This is done starting at line 21:

fscanf(inputStream, "%d", &i);
fscanf(inputStream, "%u", &outputFD);
fclose(inputStream);
outputFile = fdopen(outputFD, "w");

Once again, at line 24, we turn the file descriptor we read from the pipe back into a file handler. Now, we can resume writing to it, just as we did before. It will continue to append to the end of the file.

The files which implement this are available as gists here:

The output when run for 11 seconds is:

gcc -Wall -pedantic -o newbinary newversion.c
gcc -Wall -pedantic -o example original.c
./example
Original:       1 My PID=27272
Original:       2 My PID=27272
Original:       3 My PID=27272
Original:       4 My PID=27272
Original:       5 My PID=27272
New Binary:     6 My PID=27272
New Binary:     7 My PID=27272
New Binary:     8 My PID=27272
New Binary:     9 My PID=27272
New Binary:     10 My PID=27272
New Binary:     11 My PID=27272

February 17, 2010

Lessons I should have learned, Episode 2: hiding your data

Filed under: Lessons I should have learned — Tags: , , — Nathan @ 12:49 am

Episode 2: hiding your data

Let’s consider a contrived example where you are representing people and their relationships. We want to represent a person’s name, social security number, address, and have pointers to mother and father.

typedef struct Person {
  char *name;
  char *address;
  int ssn;
  Person *mother;
  Person *father;
} Person;

Now, let’s say that we have some library which will create a queriable tree for us.

Person *mandy = insertPerson("Mandy", \
                  123331234, "Hut #13" 0, 0);
Person *naughtius = insertPerson("Naughtius", \
                  349830123, "Centurianville #2", 0, 0);
Person *brian = insertPerson("Brian", \
                  593013297, "Hut #13", mandy, naughtius);

This is all well and good. But there’s a problem. The user of this library can do something like this:

Person *mary = insertPerson("Mary", \
                  012309821, "Bethleham", 0, 0);
brian->mother = mary;

There might possibly be a use-case where we want to allow this to happen, but in general this isn’t desirable whatsoever. We really want to keep the fields mother, father, and ssn immutable. We can do this in the following way:

/* FILE: library.h */
typedef struct Person {
  char *name;
  char *address;
} Person;

int getSSN(Person *);
Person *getMother(Person *);
Person *getFather(Person *);
Person *insertPerson(char *, int, char *, Person*, Person*);

Now we have a struct Person which exposes only information that we will allow to be mutable. How can we use this cleanly and effectively? We define the following:

/* FILE: library.c */
#include "library.h"
typedef struct privatePerson {
  Person p;
  int ssn;
  privatePerson *mother;
  privatePerson *father;
} privatePerson;
...

How does this help us? First, consider the following:

privatePerson *mary = ...;
Person *mom = (Person *) mary;

What will happen here? First, recall how struct layout works. Since Person p is the first entry in struct privatePerson then the offset in the privatePerson struct is 0. This means that the memory location of Person p is the same as the struct privatePerson which contains it. So, when we cast to the Person pointer we are, in fact pointing at a Person object. So, insertPerson(...) will use a struct privatePerson internally but will return a struct Person. As will getMother(...).

/* FILE: library.c */
...
Person *getMother(Person *p) {
  privatePerson *who = (privatePerson *) p;
  return (Person *) who->mother;
}

This example illustrates how we can also cast back to a privatePerson from a Person. Note that this REQUIRES that the pointer to the struct Person also be a pointer to a struct privatePerson, i.e. you’ve already cast it once. We can’t ensure that our user will never malloc his own struct Person. The best we can do is provide factories for construction (insertPerson above) and strongly document that doing so is unsupported and will lead to undefined behavior.

Because struct privatePerson is defined within library.c, it won’t be visible to anyone who includes library.h. Just like with the previous post on opaque pointers, this isn’t magic. The user can still opt-in to violate our model by including his own definition of a privatePerson struct and casting back to it. Though hopefully by doing so the user will be perfectly aware that he’s breaking our API and thus is writing code that could potentially break in the next minor point release of the library.

February 11, 2010

Lessons I should have learned, Episode 1: opaque pointers

Filed under: Lessons I should have learned — Tags: , , — Nathan @ 11:08 pm

I’ve spent my entire adult life in academia. For some reason academia, even in software engineering courses, doesn’t really harp on best practices or good patterns to know. It is possible that some do and that I’ve just missed it since I specialized in theory. So, this article will be the first in a larger category of articles which will describe patterns and techniques that I should have known but didn’t.

Episode 1: opaque pointers

I don’t care what anyone says. C is a great language. About a year ago I undertook the task of implementing a Fibonacci heap for use in an empirical analysis of two labeled graph algorithms. I put the code away months and months ago, but luckily I just found out that I got some CPU time on a very prestigious cluster so I’ve been reworking the implementations to collect better statistics. Also, since I’m finally going to do this analysis and it’s going to be published (at least in my dissertation) then I will most definitely release all of the code so that the experiments can be replayed and verified. So, since other people will look at this code I’ve spent several hours over the last few days cleaning it up and making the interfaces more reasonable. This lead me to actually pulling my fibheap out into its own stand alone project.

The fibheap isn’t ready for GitHub just yet (though look for it soon), but I’ve been working on the API and learning a lot about how this kind of thing should work. The user should be completely unaware of the internal structure of the data structure. He shouldn’t be able to mutate it in any way except through the accessors that the API provides. Though, sometimes he will need to hold pointers to particular elements in the heap (for DecreaseKey calls). This means that we need some way of providing access to elements without giving away any of our top-secret structure. So, the very simple concept that I should have known about: opaque pointers.

The crux of the matter is to separate declaration from definition just like we were always told (except in the case of templated classes… sigh). In the example of the fibheap, we want the user to have a handle to the fibheap but to otherwise not be able to affect the heap except through the API functions. So, we do this:

/* FILE: fibheap.h */
typedef struct FibHeap FibHeap;

FibHeapElement *fibHeapGetRoot(FibHeap*);
unsigned int fibHeapGetSize(FibHeap*);

And that’s it. There is no definition of the structure here, only a declaration of it. Then:

/* FILE: fibheap.c */
struct FibHeap {
  FibHeapElement *root;
  unsigned int size;
};

Because only the declaration is in the header, whenever the user does #include "fibheap.h" he only gets the name of the struct, not the layout. Thus the user cannot access either of the fields without the accessor functions fibHeapGetRoot(FibHeap*) and fibHeapGetSize(FibHeap*). No instantiation for you! (caveat: the user could easily define the struct himself and have access to the internals of the structure. This is the kind of hard opt-in that hopefully makes it clear to the user that he’s violating the spirit of the API and that this will not necessarily be stable with future versions.)

So, to summarize:

Opaque pointers allow you to provide handlers to structs to which your users have no idea the form or actual structure of.

The next installment will be of a couple of ways that you can provide partial information to the user instead of none. Stay tuned!


  • I'm a software engineer at Google
  • I'm from Alabama
  • I live in San Francisco
  • I like to work on ridiculous things
  • I'm currently learning German, Scala, and Computer Vision
  • This book referred to JavaScript I wrote when I was 15.