2017-09-20 16:39:52 (Originally published at: 2016-08-05)

Resource management, exceptions and templates in C

In this post I will share some patterns and practices on how to manage resources safely, how to mimic exceptions and how to make templates in C. In C++ these things are part of the language, but in C you must make something up.

Resource management

In this section we will discuss some resource management patterns.

Basic resource management pattern in functions

Let's start with one of the most important patterns: the basic resource management pattern. Consider the abstract sample below:


int func(void)
{
    Handle handle = INVALID_HANDLE;
    int returnValue = 0;
    
    //...
    
    allocateResource(&handle);
    
    //...
    
    if (somethingWentWrong())
    {
        goto cleanup;
    }

    //...
cleanup:
    releaseResource(&handle);
    return returnValue;
}

Resources are referenced through handles. These handles are then passed to the functions that work on the given resource. A handle can be as simple as a pointer, but in practice it is usually a struct holding a single pointer, which gives some type safety when working with it. This data type is referred to as Handle in the code above. Every handle type should have a designated invalid handle (or null handle) associated with it. It can be as simple as the NULL pointer.

Every handle you plan to use in a function must be declared at the beginning of the function (so not in a block inside it) and must be initialized to the null handle. This way the cleanup section can release every handle, no matter where the jump came from. It's good practice in general to initialize every variable as soon as possible. We must also initialize returnValue and all other outgoing parameters, so that we return something meaningful when we have to jump out.

To allocate the resource, we use the allocateResource function, which takes a pointer to the handle variable. The reason for taking the address can be seen below:


void allocateResource(Handle *handle)
{
    releaseResource(handle);
    //... Resource creation code which assigns the new value to *handle.
}

That is, the function must make sure the resource referred to by the handle is released before assigning a new value. This way calling allocateResource multiple times on the same handle won't leak.

releaseResource must accept INVALID_HANDLE too and treat it as a no-op, so we don't need to guard releaseResource calls with an if. It also takes a pointer to the handle, because it must set the handle to INVALID_HANDLE once the handle has been invalidated.

The next important part is the goto cleanup. It doesn't matter where the code screws up, you must jump to the cleanup section to properly clean up everything before returning from the function.

This should be the basic structure of all functions that manage any sort of resource (most often memory buffers).
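
To make the pattern concrete, here is a minimal sketch where the resource is a heap buffer. The names Buffer, INVALID_BUFFER, allocateBuffer, releaseBuffer and joinedLength are made up for this illustration, and the handle is assumed to be a plain pointer with NULL as the invalid handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle type: a plain pointer, NULL is the invalid handle. */
typedef char *Buffer;
#define INVALID_BUFFER NULL

/* Releases the buffer (no-op on the invalid handle) and invalidates the handle. */
static void releaseBuffer(Buffer *buf)
{
    if (*buf == INVALID_BUFFER) return;
    free(*buf);
    *buf = INVALID_BUFFER;
}

/* Releases whatever the handle held before, then allocates size bytes.
   On failure the handle stays invalid. */
static void allocateBuffer(Buffer *buf, size_t size)
{
    releaseBuffer(buf);
    *buf = malloc(size);
}

/* Returns the length of the two strings joined together, or -1 on failure. */
int joinedLength(const char *a, const char *b)
{
    Buffer joined = INVALID_BUFFER; /* Declared first, initialized to invalid. */
    int returnValue = -1;           /* Meaningful value if we have to jump out. */

    assert(a != NULL && b != NULL);

    allocateBuffer(&joined, strlen(a) + strlen(b) + 1);
    if (joined == INVALID_BUFFER)
    {
        goto cleanup;
    }

    strcpy(joined, a);
    strcat(joined, b);
    returnValue = (int)strlen(joined);

cleanup:
    releaseBuffer(&joined); /* Safe even if the allocation failed. */
    return returnValue;
}
```

Every exit path goes through the cleanup label, so the buffer is released exactly once, whether the function succeeds or fails.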

Strong and weak references

In the example above, the variable handle was a so-called strong reference. Strong references must be handled with care: to avoid dangling handles and leaks, don't copy them around with the = operator, only use the allocate and release functions that manage them. As long as an object has strong references to it, it must be kept alive. When the last handle to the object is released, the object itself must be released too. Let's improve on this.

Reference counting

You may have noticed that the ban on the = operator means only one variable can hold the handle to a resource. Of course in practice you want to hold more than one handle to a single resource, so we need a reference counting scheme. This requires one more resource management function:


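void cloneHandle(Handle *target, Handle source)
{
    /* Add the new reference first, so cloning a handle onto itself
       cannot drop the count to zero in between. */
    _addRefResource(source);
    releaseResource(target);
    *target = source;
}

First we increase the reference count on the source (which is as simple as ++ on an unsigned int field somewhere), then release whatever the target held before, and finally copy the handle (or create a new one if you want). The order matters: if the target already refers to the same resource and holds its last reference, releasing first would destroy the resource before it gets its new reference.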

And the releaseResource must have the following semantics:


void releaseResource(Handle *handle)
{
    if (*handle == INVALID_HANDLE) return;
    
    _removeRefResource(*handle);
    if (_getRefCountResource(*handle) == 0)
    {
        /* No more references, do the cleanup here. */
    }
    *handle = INVALID_HANDLE; /* This handle is invalid from now on. */
}

If multiple handles exist to a single resource, only the reference count is decreased. When it hits zero, the resource is ready to be cleaned up.

If you use only these functions to manage handles, you won't have leaks, and your program cannot crash because of dangling handles, because there won't be any.
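
The three functions can be tried out together in a small sketch. It assumes the handle is a plain pointer to a struct that carries its own count. The Counted struct, the liveResources counter and refCountDemo are hypothetical names added only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource: a reference count plus some payload. */
typedef struct
{
    unsigned refCount;
    int value;
} Counted;

typedef Counted *Handle;
#define INVALID_HANDLE NULL

static int liveResources = 0; /* Only here so the demo can check for leaks. */

void releaseResource(Handle *handle)
{
    if (*handle == INVALID_HANDLE) return;
    assert((*handle)->refCount > 0);
    (*handle)->refCount--;
    if ((*handle)->refCount == 0)
    {
        /* No more references, clean up. */
        free(*handle);
        liveResources--;
    }
    *handle = INVALID_HANDLE;
}

void allocateResource(Handle *handle, int value)
{
    releaseResource(handle);
    *handle = malloc(sizeof(**handle));
    if (*handle == INVALID_HANDLE) return;
    (*handle)->refCount = 1;
    (*handle)->value = value;
    liveResources++;
}

void cloneHandle(Handle *target, Handle source)
{
    /* Add the reference before releasing, so self-cloning is safe. */
    if (source != INVALID_HANDLE) source->refCount++;
    releaseResource(target);
    *target = source;
}

/* Returns the number of resources still alive at the end; 0 means no leak. */
int refCountDemo(void)
{
    Handle a = INVALID_HANDLE;
    Handle b = INVALID_HANDLE;

    allocateResource(&a, 42);
    cloneHandle(&b, a);   /* Two strong references now. */
    cloneHandle(&b, b);   /* Cloning onto itself changes nothing. */
    releaseResource(&a);  /* One reference left, the resource stays alive. */
    if (b != INVALID_HANDLE) assert(b->value == 42);
    releaseResource(&b);  /* Last reference gone, the resource is freed. */
    releaseResource(&b);  /* Releasing an invalid handle is a no-op. */

    return liveResources;
}
```

The resource outlives handle a because b still holds a reference, and it is freed exactly once when b is released.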

Weak references

Strong references must be managed by the resource allocation, cloning and releasing functions. If you use the = operator, you create weak references. Weak references shouldn't be managed by those resource management functions. This means weak and strong references don't mix.

If you want to make sure a reference always points to a live object, you must use strong references. If you don't care if the referenced object is killed while you are pointing to it, you can use weak references.

But we have a little problem: weak references introduce the dangling handle problem. In order to use weak references safely there must be a way to detect if the underlying object is deleted to avoid the dangling problem. There is a way to do that. Let's see some part of the internal implementation of a library:


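#include <stdlib.h> /* malloc, free */

/* Forward declaration, so the struct can point to its own type. */
typedef struct Resource Resource;

struct Resource
{
    /*... other fields */
    unsigned uid; /* 0 means the resource is on the free list. */
    Resource *nextFreeRes;
};

unsigned nextId = 1; /* Next id to allocate */
Resource *nextFreeRes = NULL; /* Next free resource */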

/* Requests memory for the resource, before initialization. */
Resource *allocateResource(void)
{
    Resource *newResource;

    if (nextFreeRes)
    {
        /* There is a reusable free resource. Use it. */
        newResource = nextFreeRes;
        nextFreeRes = newResource->nextFreeRes;
    }
    else
    {
        /* No more free resources, allocate one. */
        newResource = malloc(sizeof(*newResource));
        if (!newResource) return NULL;
        newResource->nextFreeRes = NULL;
    }
    /* Assign its identifier. */
    newResource->uid = nextId;
    nextId++;
    
    return newResource;
}

/* Marks the object as free after it's cleaned up. */
void deallocateResource(Resource *res)
{
    /* Link it to the free list. */
    res->nextFreeRes = nextFreeRes;
    nextFreeRes = res;
    /* Set its uid to 0, so no handle can match it. */
    res->uid = 0;
}

/* When done with the library, walk the free list and deallocate everything. */
void finalCleanup(void)
{
    while (nextFreeRes)
    {
        Resource *res = nextFreeRes;
        
        nextFreeRes = res->nextFreeRes;
        free(res);
    }
}

/* This structure goes to the header. */
typedef struct 
{
    void *res;
    unsigned cookie;
} Handle;

/* Set a handle for a resource. */
void makeHandle(Handle *handle, Resource *res)
{
    handle->res = res;
    handle->cookie = res->uid;
}

/* Gets resource from a handle. */
Resource *getResourceFromHandle(Handle *handle)
{
    Resource *res = handle->res;
    /* A null handle never refers to a live resource. */
    if (!res) return NULL;
    if (handle->cookie == res->uid) return res;
    return NULL;
}

There are two tricks in the code above. The first is the free list. When a resource is released, its memory isn't thrown away; instead it is linked into a free list. When a resource is allocated, we first try to reuse one from the free list, and allocate only when there is nothing left to reuse. Because the memory is never returned to the system while the library is in use, a stale handle always points to a valid Resource struct, so reading its uid is safe.

The second trick is the cookie in the handle. Every object gets a unique id when it is allocated, and this id is copied into the handle as well. The library uses the id in the handle internally to determine whether the handle is still valid. When the resource is released, its uid is set to 0, so any handles that point to it are no longer valid. When the resource is reused, it gets a new uid, so there is no chance of identifier reuse (except when the counter overflows, but if there is a danger of that, use 64-bit counters).

Using these patterns the problem of dangling handles can be eliminated.
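
The following compact sketch shows a weak handle going stale. It reimplements the free list and the cookie check in a shortened form so that it compiles on its own. The names allocateSlot, deallocateSlot, resolve and weakHandleDemo are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Resource Resource;
struct Resource
{
    unsigned uid;       /* 0 while the slot sits on the free list. */
    Resource *nextFree;
};

typedef struct
{
    Resource *res;
    unsigned cookie;
} Handle;

static unsigned nextId = 1;
static Resource *freeList = NULL;

/* Reuses a slot from the free list, or allocates a new one. */
static Resource *allocateSlot(void)
{
    Resource *res = freeList;
    if (res) freeList = res->nextFree;
    else res = malloc(sizeof(*res));
    if (!res) return NULL;
    res->nextFree = NULL;
    res->uid = nextId++;
    return res;
}

/* Puts the slot on the free list and invalidates every handle to it. */
static void deallocateSlot(Resource *res)
{
    assert(res != NULL);
    res->uid = 0;
    res->nextFree = freeList;
    freeList = res;
}

/* Returns the resource, or NULL when the handle is null or stale. */
static Resource *resolve(const Handle *handle)
{
    if (!handle->res || handle->cookie == 0) return NULL;
    return handle->cookie == handle->res->uid ? handle->res : NULL;
}

/* Returns 1 when the stale handle is detected both right after the release
   and after the same memory has been reused for a new resource. */
int weakHandleDemo(void)
{
    Handle weak = { NULL, 0 };
    Resource *first = allocateSlot();
    Resource *second;
    int ok;

    if (!first) return 0;
    weak.res = first;
    weak.cookie = first->uid;
    ok = (resolve(&weak) == first);       /* Live: the cookie matches. */

    deallocateSlot(first);
    ok = ok && (resolve(&weak) == NULL);  /* Released: the uid is 0. */

    second = allocateSlot();              /* Reuses the same memory... */
    ok = ok && (second == first);
    ok = ok && (resolve(&weak) == NULL);  /* ...but under a new uid. */

    /* Final cleanup: release the slot, then free the whole free list. */
    deallocateSlot(second);
    while (freeList)
    {
        Resource *res = freeList;
        freeList = res->nextFree;
        free(res);
    }
    return ok;
}
```

The weak handle never touches freed memory, because the slot stays allocated on the free list until the final cleanup.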

There is still room for improvement.

Exceptions

Exceptions are used to indicate a failure that a function can't recover from, so it must return. We also assume that several layers of calling functions won't be able to handle it either and must return too. Such failures include, but are not limited to, out-of-memory errors. The following patterns and macros help you handle errors in an organized way:


typedef enum
{
    ERR_OK, /* The okay value must be zero valued. */
    /* ... other error values ... */
} ErrorType;

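/* The do { } while (0) wrapper makes each macro behave as a single statement. */
#define ON_ERROR_LEAVE do { if (*error) goto cleanup; } while (0)
#define THROW(x) do { *error = (x); goto cleanup; } while (0)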

void aFunction(/* ... parameters ... */, ErrorType *error)
{
    /* ... declarations ... */
    *error = ERR_OK; /* Start the function assuming everything is ok.*/
    
    /* ... code ... */
    functionThatCanRaiseError(/* args*/, error); ON_ERROR_LEAVE; /* Propagate error. */
    /* ... code ... */
    if (some_thing_went_wrong) THROW(ERR_SOME_ERROR_CODE); /* Example of throwing error. */
    
    /*... more code ... */
cleanup:
    /* Cleanup block */    
}

The unusual thing here is using an outgoing parameter to hold the error. Usually the return value is used for this, but I think an outgoing parameter is better, for three reasons. First, the return value stays free for something meaningful. Second, you don't need to propagate the error value by hand: the callee writes it through the same pointer, so all you need to do is return or jump to cleanup. With a return value you need a local variable to hold the error while the cleanup code runs, then return it, and you must do this on every level. Third, you can set a conditional data breakpoint on the error variable to catch the moment the error arises. Or you can simply add an extra debug printf to the THROW macro if you wish.

This pattern is a good supplement to the basic resource management pattern discussed in the first section. You can handle errors without too much visual clutter.
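
Here is a small sketch of the error pattern across two call levels, with the optional debug print added to THROW. The functions divide, percentage and exceptionDemo, and the error codes other than ERR_OK, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum
{
    ERR_OK, /* The okay value must be zero valued. */
    ERR_OUT_OF_MEMORY,
    ERR_DIVISION_BY_ZERO
} ErrorType;

#define ON_ERROR_LEAVE do { if (*error) goto cleanup; } while (0)
/* THROW with a debug print, so every raised error shows up on stderr. */
#define THROW(x) do { \
        *error = (x); \
        fprintf(stderr, "%s:%d: error %d\n", __FILE__, __LINE__, (int)*error); \
        goto cleanup; \
    } while (0)

/* Inner level: raises an error instead of dividing by zero. */
static int divide(int a, int b, ErrorType *error)
{
    int result = 0;

    assert(error != NULL);
    *error = ERR_OK;
    if (b == 0) THROW(ERR_DIVISION_BY_ZERO);
    result = a / b;
cleanup:
    return result;
}

/* Outer level: owns a scratch buffer and propagates the inner error.
   The return value carries the result, the error travels separately. */
int percentage(int part, int whole, ErrorType *error)
{
    char *label = NULL;
    int result = 0;

    assert(error != NULL);
    *error = ERR_OK;
    label = malloc(32);
    if (!label) THROW(ERR_OUT_OF_MEMORY);
    result = divide(part * 100, whole, error); ON_ERROR_LEAVE;
    snprintf(label, 32, "%d%%", result); /* The buffer is only for show. */
cleanup:
    free(label); /* Runs on success and on every error path. */
    return result;
}

/* Returns 1 when both the success path and the error path behave as expected. */
int exceptionDemo(void)
{
    ErrorType error;
    int ok;

    ok = (percentage(1, 4, &error) == 25) && (error == ERR_OK);
    ok = ok && (percentage(1, 0, &error) == 0) && (error == ERR_DIVISION_BY_ZERO);
    return ok;
}
```

Note that percentage never copies the error code: divide writes it through the same pointer, and ON_ERROR_LEAVE only has to jump to the cleanup block.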

Templates

Templates are yet another powerful feature of C++. Unfortunately they are not available in C, but there is a way to mimic them. Consider the following include file.


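#ifndef T 
    #error T (data type) must be defined for this template!
#endif

/* Two levels are needed: ## does not expand its operands, so the
   arguments must be expanded by an outer macro first. */
#define VECTOR_CONCAT_(a, b) a ## _ ## b
#define VECTOR_CONCAT(a, b) VECTOR_CONCAT_(a, b)

#ifndef VECTOR_TYPE /* If we didn't provide a better name, make one up. */
    /* Works only when T is a single identifier (so not "unsigned int"). */
    #define VECTOR_TYPE VECTOR_CONCAT(Vector, T)
#endif

typedef struct
{
    T *buf; /* Buffer that holds the elements */
    unsigned n; /* Current size */
    unsigned allocated; /* Allocated size */
} VECTOR_TYPE;

#define VECTOR_FUNC_NAME(fn) VECTOR_CONCAT(VECTOR_TYPE, fn)

/* static inline, so the template can be included from several source files. */
static inline void VECTOR_FUNC_NAME(add)(VECTOR_TYPE *vec, const T *element)
{
    /* Implement addition here. */
}

/* Undef everything here to avoid pollution. */
#undef T
#undef VECTOR_TYPE
#undef VECTOR_FUNC_NAME
#undef VECTOR_CONCAT
#undef VECTOR_CONCAT_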

To use it, you need to include it like this:


#define T int
#define VECTOR_TYPE IntVector
#include "vector_template.c"

Now you have a datatype IntVector, and the element addition function IntVector_add. Similarly all sorts of templates can be emulated this way.
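
To see the generated names in action, here is a single-file sketch. The marked middle part stands in for the contents of the template file, so that the example compiles on its own. In real use that part would live in the include file. The body of add, its int return value, the two-level VECTOR_CONCAT helper and templateDemo are my assumptions for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What the including file defines before the #include --- */
#define T int
#define VECTOR_TYPE IntVector

/* --- Stand-in for the body of the template file --- */
/* Two levels, so that VECTOR_TYPE is expanded before the tokens are pasted. */
#define VECTOR_CONCAT_(a, b) a ## _ ## b
#define VECTOR_CONCAT(a, b) VECTOR_CONCAT_(a, b)
#define VECTOR_FUNC_NAME(fn) VECTOR_CONCAT(VECTOR_TYPE, fn)

typedef struct
{
    T *buf;             /* Buffer that holds the elements */
    unsigned n;         /* Current size */
    unsigned allocated; /* Allocated size */
} VECTOR_TYPE;

/* Appends a copy of *element. Returns 1 on success, 0 when out of memory. */
static int VECTOR_FUNC_NAME(add)(VECTOR_TYPE *vec, const T *element)
{
    assert(vec != NULL && element != NULL);
    if (vec->n == vec->allocated)
    {
        /* Grow geometrically, starting from 4 elements. */
        unsigned newSize = vec->allocated ? vec->allocated * 2 : 4;
        T *newBuf = realloc(vec->buf, newSize * sizeof(T));
        if (!newBuf) return 0;
        vec->buf = newBuf;
        vec->allocated = newSize;
    }
    vec->buf[vec->n++] = *element;
    return 1;
}

#undef T
#undef VECTOR_TYPE
#undef VECTOR_FUNC_NAME
#undef VECTOR_CONCAT
#undef VECTOR_CONCAT_
/* --- End of the stand-in --- */

/* Adds 1..10 to an IntVector and returns their sum, or -1 when out of memory. */
int templateDemo(void)
{
    IntVector vec = { NULL, 0, 0 };
    int i;
    int sum = 0;

    for (i = 1; i <= 10; i++)
    {
        if (!IntVector_add(&vec, &i))
        {
            free(vec.buf);
            return -1;
        }
    }
    for (i = 0; i < (int)vec.n; i++) sum += vec.buf[i];
    free(vec.buf);
    return sum;
}
```

The names IntVector and IntVector_add stay usable after the #undef lines, because the macros were already expanded at the point of the definitions.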
