6

I just wrote a nice library that handles nicely a dynamic array allocated on heap in C. It supports many operations, is relatively simple to use "feeling" almost like a regular good old array. It is also easy to simulate many data structures based on it (stacks, queues, heaps, etc...)

It can handle arrays of any type, but the problem is that there's only a single type per compilation. C doesn't have templates, so it's impossible to have for example dynamic arrays of ints and dynamic arrays of chars in the same program, which is a problem.

I haven't found any real solution, everything that I found involved void*, and NO, I do NOT want an array of void pointers. It's nice to be able to put pointers, but I also want to be able to have an array of a raw data type. (such that you can add for example, 3, and access it like : array.data[i]

Should I :

  1. Copy/paste the library once for every type I want to use (horrible, but it'll work and be efficient)

  2. Make it a giant macro, that I can expand with the type I want (so it'll have the same effect as 1, but be somewhat elegant and usable)

  3. Have the size of pointed elements be a variable that is part of the dynamic array structure. Will work mostly but there will be a problem with functions taking and returning the dynamic array type directly. void* will not always be a viable option

  4. Abandon the idea and use C++ instead whenever I need such advanced features

The library works like this : Usage

/* Variable length array library for C language
 * Usage :
 * Declare a variable length array like this :
 *
 * da my_array;
 *
 * Always initialize like this :
 * 
 * da_init(&da);             // Creates a clean empty array
 *
 * Set a length to an array :
 *
 * da_setlength(&da, n);     // Note : if elements are added they'll be uninitialized
 *                             // If elements are removed, they're permanently lost
 *
 * Always free memory before it goes out of scope (avoid mem leaks !)
 *
 * da_destroy(&da);
 *
 * Access elements much like a normal array :
 *   - No boundary checks :           da.data[i]
 *   - With boundary checks (debug) : da_get(data, i)
 *
 * da.length;    // Return the current length of the variable length array (do NOT set the length by affecting this !! Use da_setlength instead.)
 *
 * You can add single elements at the end and beginning of array with
 *
 * da_add(&da, value);       // Add at the end
 * da_push(&da, value);      // Add at the front
 *
 * Retrieve values at the end and front of array (while removing them) with
 *
 * da_remove(&da);          // From the end
 * da_pop(&da);             // From the front
 *
 * Concatenate it with a standard array or another variable length array of same type with
 *
 * da_append(&da, array, array_length);  // Standard array
 * da_append(&da, &db);                 // Another variable length array
 */

Implementation (I'm sorry it's huge, but I have to give it for completeness of the question)

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

// Increment by which blocks are reserved on the heap
#define ALLOC_BLOCK_SIZE 16

// The type that the variable length array will contain. In this case it's "int", but it can be anything really (including pointers, arrays, structs, etc...)
typedef int da_type;

// Commend this to disable all kinds of bounds and security checks (once you're sure your program is fully tested, gains efficiency)
#define DEBUG_RUNTIME_CHECK_BOUNDS

// Data structure for variable length array variables
typedef struct
{
    da_type *start; // Points to start of memory allocated region
    da_type *data;      // Points to logical start of array
    da_type *end;       // Points to end of memory allocated region
    size_t length;      // Length of the array
}
da;

// Initialize variable length array, allocate 2 blocks and put start pointer at the beginning
void da_init(da *da)
{
    da_type *ptr = malloc(ALLOC_BLOCK_SIZE * sizeof(da_type));
    if(ptr == 0) exit(1);
    da->start = ptr;
    da->data = ptr;
    da->end = da->start + ALLOC_BLOCK_SIZE;
    da->length = 0;
}

// Set the da size directly
void da_setlength(da *da, size_t newsize)
{
    if(newsize % ALLOC_BLOCK_SIZE != 0)
        newsize += ALLOC_BLOCK_SIZE - newsize % ALLOC_BLOCK_SIZE;

    ptrdiff_t offs = da->data - da->start;
    da_type *ptr = realloc(da->start, newsize * sizeof(da_type));
    if(!ptr) exit(1);

    da->start = ptr;
    da->data = ptr + offs;
    da->end = ptr + newsize;
    da->length = newsize;
}

// Destroy the variable length array (basically just frees memory)
void da_destroy(da* da)
{
    free(da->start);
#ifdef DEBUG_RUNTIME_CHECK_BOUNDS
    da->start = NULL;
    da->data = NULL;
    da->end = NULL;
    da->length = 0;
#endif
}

// Get an element of the array with it's index
#ifdef DEBUG_RUNTIME_CHECK_BOUNDS
    //Get an element of the array with bounds checking
    da_type da_get(da *da, unsigned int index)
    {
        if(index >= da->length)
        {
            printf("da error : index %u is out of bounds\n", index);
            exit(1);
        }
        return da->data[index];
    }

    //Set an element of the array with bounds checking
    void da_set(da *da, unsigned int index, da_type data)
    {
        if(index >= da->length)
        {
            printf("da error : index %u is out of bounds\n", index);
            exit(1);
        }
        da->data[index] = data;
    }
#else
    //Get an element of the array without bounds checking
    #define da_get(da, index) ((da)->data[(index)])

    //Set an element of the array without bounds checking
    #define da_set(da, index, v) (da_get(da, index) = v)
#endif


// Add an element at the end of the array
void da_add(da *da, da_type i)
{   // If no more memory
    if(da->data + da->length >= da->end)
    {   // Increase size of allocated memory block
        ptrdiff_t offset = da->data - da->start;
        ptrdiff_t newsize = da->end - da->start + ALLOC_BLOCK_SIZE;
        da_type *ptr = realloc(da->start, newsize * sizeof(da_type));
        if(!ptr) exit(1);

        da->data = ptr + offset;
        da->end = ptr + newsize;
        da->start = ptr;
    }
    da->data[da->length] = i;
    da->length += 1;
}

// Remove element at the end of the array (and returns it)
da_type da_remove(da *da)
{
#ifdef DEBUG_RUNTIME_CHECK_BOUNDS
    if(da->length == 0)
    {
        printf("Error - try to remove item from empty array");
        exit(1);
    }
#endif
    //Read last element of the array
    da->length -= 1;
    da_type ret_value = da->data[da->length];
    //Remove redundant memory if there is too much of it
    if(da->end - (da->data + da->length) > ALLOC_BLOCK_SIZE)
    {
        ptrdiff_t offset = da->data - da->start;
        ptrdiff_t newsize = da->end - da->start - ALLOC_BLOCK_SIZE;
        da_type *ptr = realloc(da->start, newsize * sizeof(da_type));
        if(!ptr) exit(1);

        da->data = ptr + offset;
        da->end = ptr + newsize;
        da->start = ptr;
    }
    return ret_value;
}

// Add element at the start of array
void da_push(da *da, da_type i)
{   //If array reaches bottom of the allocated space, we need to allocate more
    if(da->data == da->start)
    {
        ptrdiff_t newsize = da->end - da->start + ALLOC_BLOCK_SIZE;
        da_type *ptr = realloc(da->start, newsize * sizeof(da_type));
        if(!ptr) exit(1);
        memmove(ptr + ALLOC_BLOCK_SIZE, ptr, da->length * sizeof(da_type));

        da->data = ptr + ALLOC_BLOCK_SIZE;
        da->start = ptr;
        da->end = ptr + newsize;
    }
    // Store element at start of array
    da->length += 1;
    da->data -= 1;
    da->data[0] = i;
}

//Remove 1st element of array (and return it)
da_type da_pop(da *da)
{
#ifdef DEBUG_RUNTIME_CHECK_BOUNDS
    if(da->length == 0)
    {
        printf("Error - try to remove item from empty array");
        exit(1);
    }
#endif
    da_type ret_value = da->data[0];
    da->length -= 1;
    da->data += 1;
    ptrdiff_t offset = da->data - da->start;
    if(offset > ALLOC_BLOCK_SIZE)
    {
        ptrdiff_t newsize = da->end - da->start - ALLOC_BLOCK_SIZE;
        da_type *ptr = realloc(da->start, newsize * sizeof(da_type));
        if(!ptr) exit(1);
        memmove(ptr + offset - ALLOC_BLOCK_SIZE, ptr + offset, da->length * sizeof(da_type));

        da->data = ptr + offset - ALLOC_BLOCK_SIZE;
        da->start = ptr;
        da->end = ptr + newsize;
    }
    return ret_value;
}

// Append array t to s
void da_array_append(da *s, const da_type *t, size_t t_len)
{
    if((s->length + t_len) > (s->end - s->start))
    {   // Should reserve more space in the heap
        ptrdiff_t offset = s->data - s->start;
        ptrdiff_t newsize = s->length + t_len;
        // Guarantees that new size is multiple of alloc block size
        if(t_len % ALLOC_BLOCK_SIZE != 0)
            newsize += ALLOC_BLOCK_SIZE - (t_len % ALLOC_BLOCK_SIZE);

        da_type *ptr = malloc(newsize * sizeof(da_type));
        if(!ptr) exit(1);

        memcpy(ptr, s->data, s->length * sizeof(da_type));
        memcpy(ptr + s->length, t, t_len * sizeof(da_type));
        free(s->start);
        s->data = ptr;
        s->start = ptr;
        s->end = ptr + newsize;
    }
    else
        // Enough space in heap buffer -> do it the simple way
        memmove(s->data + s->length, t, t_len * sizeof(da_type));

    s->length += t_len;
}

// Append a da is a particular case of appending an array
#define da_append(s, t) da_array_append(s, (t)->data, (t)->length)
Bregalad
  • 634
  • 7
  • 15

2 Answers2

10

What I would do is fall back to preprocessor hackage. You can definitely achieve something like templates in C++ by adding type information when necessary.

struct da_impl {
    size_t len;
    size_t elem_size;
    size_t allocsize; // or whatever
};

void da_init_impl(struct da_impl *impl, size_t elem_size)
{
    impl->len = 0;
    impl->elem_size = elem_size;
    impl->allocsize = 0;
}

#define DA_TEMPLATE(t) struct { da_impl impl; t *data; }
#define da_init(a) da_init_impl(&a.impl, sizeof(*a.data))

// etc.

Then you can use it like this:

DA_TEMPLATE(int) intArray;
da_init(intArray);

da_append(intArray, 42);

int foo = intArray.data[0];

One drawback is that this creates an anonymous structure which you can't really use outside of its scope, but maybe you can live with that...

  • @FiddlingBits :) Unfortunately, spite downvotes by "others" (khm...) are incoming... :/ –  Dec 22 '13 at 23:58
  • 1
    While you're laughing your way to the bank... a few reputation points won't matter. :-D – Fiddling Bits Dec 23 '13 at 00:25
  • Ok, it's almost what I'd like. It's nice that the DA_struct has the real type you want in the main program (so that dereferencing works), and it's fine to call functions with the size as an extra argument that is hidden by the macro. This is a genius idea. HOWEVER I still have a problem with functions taking and returning values of the da type directly (namely, the add, remove, push and pop functions). I could pass pointers to void (and copy the correct # of bytes in the da), and get rid of the return value (it's not a necessity), but still it's not very elegant and not really what I'd like. – Bregalad Dec 24 '13 at 12:53
  • @user2537102 The facilities of C don't provide much more than this. If you are dissatisfied with the level of abstraction C can handle, then switch to C++. There are features (such as templates) which simply cannot be done in C. Just no. Period. –  Dec 24 '13 at 13:21
  • OK it's not a problem, you just told me the best that can be done in plain C so I'm instructed. – Bregalad Dec 24 '13 at 14:10
1

Use a union for your generic data...

typedef union
{
    int *pIntArr;
    double *pDblArr;
    struct *pStructArr;
    ... // Etc.
} genericData;

...and use a struct to hold this so you can also include what data your generic data union is holding and its length.

typedef struct
{
    genericData data;
    int dataType;   // 0 == int, 1 == double, 2 == etc.
    int dataLength; // Number of elements in array
} genericDataType;
Fiddling Bits
  • 8,712
  • 3
  • 28
  • 46
  • 2
    What if you want to store a really large struct?, you will waste memory if storing smaller objects. Also your solution requires a switch later in the code. – this Dec 22 '13 at 23:18
  • 1
    @self. Not sure what you mean about wasting memory. – Fiddling Bits Dec 22 '13 at 23:48
  • How large is an union with a char pointer and a char array of size 1000? – this Dec 22 '13 at 23:50
  • 2
    @self. There's no "waste of memory" -- the union contains pointers. It's not that there's an array of unions. There's a union of pointers to (the first elements of) arrays of different types. And that makes a huge difference. (But even if it was the case that you claimed -- why bother? Memory is cheap nowadays. Unless this apparently caused notable amounts of memory to be reserved superfluously, it's not worth worrying about the size of the union.) –  Dec 22 '13 at 23:52
  • `sizeof(genericData);` will always be the same. Of course, if you create large arrays of large data types, that'll consume more memory, but there's no way around that if you want to store it. – Fiddling Bits Dec 22 '13 at 23:52
  • @H2CO3 Elegantly said. – Fiddling Bits Dec 22 '13 at 23:53
  • I'm wondering though, why my answer has been downvoted, and why this got more upvotes, whereas this doesn't really answer OP's question... Mass psychology, I guess. –  Dec 23 '13 at 09:30
  • @H2CO3 I think mine is just easier to understand after a single glance. Yours take a little study... and a PhD. ;-) – Fiddling Bits Dec 23 '13 at 10:42
  • Sorry, but that's not what I want. Unions can't be dereferenced, and there is no such things as "union arithmetics". Usage would be a bit awkard but still okay by having : array.data.p_int[i], but the worst is for the library itself where there is now way to write stuff elegantly for pointer arithmetics. What about functions that return an array element ? – Bregalad Dec 23 '13 at 11:51