February 11, 2010

Lessons I should have learned, Episode 1: opaque pointers

Filed under: Lessons I should have learned — Tags: , , — Nathan @ 11:08 pm

I’ve spent my entire adult life in academia. For some reason academia, even in software engineering courses, doesn’t really harp on best practices or good patterns to know. It is possible that some do and that I’ve just missed it since I specialized in theory. So, this article will be the first in a larger category of articles which will describe patterns and techniques that I should have known but didn’t.

Episode 1: opaque pointers

I don’t care what anyone says. C is a great language. About a year ago I undertook the task of implementing a Fibonacci heap for use in an empirical analysis of two labeled graph algorithms. I put the code away months and months ago, but luckily I just found out that I got some CPU time on a very prestigious cluster so I’ve been reworking the implementations to collect better statistics. Also, since I’m finally going to do this analysis and it’s going to be published (at least in my dissertation) then I will most definitely release all of the code so that the experiments can be replayed and verified. So, since other people will look at this code I’ve spent several hours over the last few days cleaning it up and making the interfaces more reasonable. This lead me to actually pulling my fibheap out into its own stand alone project.

The fibheap isn’t ready for GitHub just yet (though look for it soon), but I’ve been working on the API and learning a lot about how this kind of thing should work. The user should be completely unaware of the internal structure of the data structure. He shouldn’t be able to mutate it in any way except through the accessors that the API provides. Though, sometimes he will need to hold pointers to particular elements in the heap (for DecreaseKey calls). This means that we need some way of providing access to elements without giving away any of our top-secret structure. So, the very simple concept that I should have known about: opaque pointers.

The crux of the matter is to separate declaration from definition just like we were always told (except in the case of templated classes… sigh). In the example of the fibheap, we want the user to have a handle to the fibheap but to otherwise not be able to affect the heap except through the API functions. So, we do this:

/* FILE: fibheap.h */
typedef struct FibHeap FibHeap;

FibHeapElement *fibHeapGetRoot(FibHeap*);
unsigned int fibHeapGetSize(FibHeap*);

And that’s it. There is no definition of the structure here, only a declaration of it. Then:

/* FILE: fibheap.c */
struct FibHeap {
  FibHeapElement *root;
  unsigned int size;
};

Because only the declaration is in the header, whenever the user does #include "fibheap.h" he only gets the name of the struct, not the layout. Thus the user cannot access either of the fields without the accessor functions fibHeapGetRoot(FibHeap*) and fibHeapGetSize(FibHeap*). No instantiation for you! (caveat: the user could easily define the struct himself and have access to the internals of the structure. This is the kind of hard opt-in that hopefully makes it clear to the user that he’s violating the spirit of the API and that this will not necessarily be stable with future versions.)

So, to summarize:

Opaque pointers allow you to provide handlers to structs to which your users have no idea the form or actual structure of.

The next installment will be of a couple of ways that you can provide partial information to the user instead of none. Stay tuned!

10,441 Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.


  • I'm a software engineer at Google
  • I'm from Alabama
  • I live in San Francisco
  • I like to work on ridiculous things
  • I'm currently learning German, Scala, and Computer Vision
  • This book referred to JavaScript I wrote when I was 15.