- Learn C The Hard Way
- Preface
- Introduction: The Cartesian Dream Of C
- Exercise 0: The Setup
- Exercise 1: Dust Off That Compiler
- Exercise 2: Make Is Your Python Now
- Exercise 3: Formatted Printing
- Exercise 4: Introducing Valgrind
- Exercise 5: The Structure Of A C Program
- Exercise 6: Types Of Variables
- Exercise 7: More Variables, Some Math
- Exercise 8: Sizes And Arrays
- Exercise 9: Arrays And Strings
- Exercise 10: Arrays Of Strings, Looping
- Exercise 11: While-Loop And Boolean Expressions
- Exercise 12: If, Else-If, Else
- Exercise 13: Switch Statement
- Exercise 14: Writing And Using Functions
- Exercise 15: Pointers Dreaded Pointers
- Exercise 16: Structs And Pointers To Them
- Exercise 17: Heap And Stack Memory Allocation
- Exercise 18: Pointers To Functions
- Exercise 19: A Simple Object System
- Exercise 20: Zed's Awesome Debug Macros
- Exercise 21: Advanced Data Types And Flow Control
- Exercise 22: The Stack, Scope, And Globals
- Exercise 23: Meet Duff's Device
- Exercise 24: Input, Output, Files
- Exercise 25: Variable Argument Functions
- Exercise 26: Write A First Real Program
- Exercise 27: Creative And Defensive Programming
- Exercise 28: Intermediate Makefiles
- Exercise 29: Libraries And Linking
- Exercise 30: Automated Testing
- Exercise 31: Debugging Code
- Exercise 32: Double Linked Lists
- Exercise 33: Linked List Algorithms
- Exercise 34: Dynamic Array
- Exercise 35: Sorting And Searching
- Exercise 36: Safer Strings
- Exercise 37: Hashmaps
- Exercise 38: Hashmap Algorithms
- Exercise 39: String Algorithms
- Exercise 40: Binary Search Trees
- Exercise 41: Using Cachegrind And Callgrind For Performance Tuning
- Exercise 42: Stacks and Queues
- Exercise 43: A Simple Statistics Engine
- Exercise 44: Ring Buffer
- Exercise 45: A Simple TCP/IP Client
- Exercise 46: Ternary Search Tree
- Exercise 47: A Fast URL Router
- Exercise 48: A Tiny Virtual Machine Part 1
- Exercise 48: A Tiny Virtual Machine Part 2
- Exercise 50: A Tiny Virtual Machine Part 3
- Exercise 51: A Tiny Virtual Machine Part 4
- Exercise 52: A Tiny Virtual Machine Part 5
- Next Steps
- Deconstructing K & RC Is Dead
Exercise 34: Dynamic Array
This is an array that grows on its own and has most of the same features as a linked list. It will usually take up less space, run faster, and has other beneficial properties. This exercise will cover a few of the disadvantages like very slow removal from the front, with a solution (just do it at the end).
A dynamic array is simply an array of void **
pointers that is pre-allocated in one shot and that point at the data. In the linked list you had a full struct that stored the void *value
pointer, but in a dynamic array there's just a single array with all of them. This means you don't need any other pointers for next and previous records since you can just index into it directly.
To start, I'll give you the header file you should type up for the implementation:
#ifndef _DArray_h
#define _DArray_h
#include <stdlib.h>
#include <assert.h>
#include <lcthw/dbg.h>
typedef struct DArray {
int end;
int max;
size_t element_size;
size_t expand_rate;
void **contents;
} DArray;
DArray *DArray_create(size_t element_size, size_t initial_max);
void DArray_destroy(DArray *array);
void DArray_clear(DArray *array);
int DArray_expand(DArray *array);
int DArray_contract(DArray *array);
int DArray_push(DArray *array, void *el);
void *DArray_pop(DArray *array);
void DArray_clear_destroy(DArray *array);
#define DArray_last(A) ((A)->contents[(A)->end - 1])
#define DArray_first(A) ((A)->contents[0])
#define DArray_end(A) ((A)->end)
#define DArray_count(A) DArray_end(A)
#define DArray_max(A) ((A)->max)
#define DEFAULT_EXPAND_RATE 300
static inline void DArray_set(DArray *array, int i, void *el)
{
check(i < array->max, "darray attempt to set past max");
if(i > array->end) array->end = i;
array->contents[i] = el;
error:
return;
}
static inline void *DArray_get(DArray *array, int i)
{
check(i < array->max, "darray attempt to get past max");
return array->contents[i];
error:
return NULL;
}
static inline void *DArray_remove(DArray *array, int i)
{
void *el = array->contents[i];
array->contents[i] = NULL;
return el;
}
static inline void *DArray_new(DArray *array)
{
check(array->element_size > 0, "Can't use DArray_new on 0 size darrays.");
return calloc(1, array->element_size);
error:
return NULL;
}
#define DArray_free(E) free((E))
#endif
This header file is showing you a new technique where I put static inline
functions right in the header. These function definitions will work similar to the #define
macros you've been making, but they're cleaner and easier to write. If you need to create a block of code for a macro and you don't need code generation, then use a static inline
function.
Compare this technique to the LIST_FOREACH
that generates a proper for-loop for a list. This would be impossible to do with a static inline
function because it actually has to generate the inner block of code for the loop. The only way to do that is with a callback function, but that's not as fast and is harder to use.
I'll then change things up and have you create the unit test for DArray
:
#include "minunit.h"
#include <lcthw/darray.h>
static DArray *array = NULL;
static int *val1 = NULL;
static int *val2 = NULL;
char *test_create()
{
array = DArray_create(sizeof(int), 100);
mu_assert(array != NULL, "DArray_create failed.");
mu_assert(array->contents != NULL, "contents are wrong in darray");
mu_assert(array->end == 0, "end isn't at the right spot");
mu_assert(array->element_size == sizeof(int), "element size is wrong.");
mu_assert(array->max == 100, "wrong max length on initial size");
return NULL;
}
char *test_destroy()
{
DArray_destroy(array);
return NULL;
}
char *test_new()
{
val1 = DArray_new(array);
mu_assert(val1 != NULL, "failed to make a new element");
val2 = DArray_new(array);
mu_assert(val2 != NULL, "failed to make a new element");
return NULL;
}
char *test_set()
{
DArray_set(array, 0, val1);
DArray_set(array, 1, val2);
return NULL;
}
char *test_get()
{
mu_assert(DArray_get(array, 0) == val1, "Wrong first value.");
mu_assert(DArray_get(array, 1) == val2, "Wrong second value.");
return NULL;
}
char *test_remove()
{
int *val_check = DArray_remove(array, 0);
mu_assert(val_check != NULL, "Should not get NULL.");
mu_assert(*val_check == *val1, "Should get the first value.");
mu_assert(DArray_get(array, 0) == NULL, "Should be gone.");
DArray_free(val_check);
val_check = DArray_remove(array, 1);
mu_assert(val_check != NULL, "Should not get NULL.");
mu_assert(*val_check == *val2, "Should get the first value.");
mu_assert(DArray_get(array, 1) == NULL, "Should be gone.");
DArray_free(val_check);
return NULL;
}
char *test_expand_contract()
{
int old_max = array->max;
DArray_expand(array);
mu_assert((unsigned int)array->max == old_max + array->expand_rate, "Wrong size after expand.");
DArray_contract(array);
mu_assert((unsigned int)array->max == array->expand_rate + 1, "Should stay at the expand_rate at least.");
DArray_contract(array);
mu_assert((unsigned int)array->max == array->expand_rate + 1, "Should stay at the expand_rate at least.");
return NULL;
}
char *test_push_pop()
{
int i = 0;
for(i = 0; i < 1000; i++) {
int *val = DArray_new(array);
*val = i * 333;
DArray_push(array, val);
}
mu_assert(array->max == 1201, "Wrong max size.");
for(i = 999; i >= 0; i--) {
int *val = DArray_pop(array);
mu_assert(val != NULL, "Shouldn't get a NULL.");
mu_assert(*val == i * 333, "Wrong value.");
DArray_free(val);
}
return NULL;
}
char * all_tests() {
mu_suite_start();
mu_run_test(test_create);
mu_run_test(test_new);
mu_run_test(test_set);
mu_run_test(test_get);
mu_run_test(test_remove);
mu_run_test(test_expand_contract);
mu_run_test(test_push_pop);
mu_run_test(test_destroy);
return NULL;
}
RUN_TESTS(all_tests);
This shows you how all of the operations are used, which then makes implementing the DArray
much easier:
#include <lcthw/darray.h>
#include <assert.h>
DArray *DArray_create(size_t element_size, size_t initial_max)
{
DArray *array = malloc(sizeof(DArray));
check_mem(array);
array->max = initial_max;
check(array->max > 0, "You must set an initial_max > 0.");
array->contents = calloc(initial_max, sizeof(void *));
check_mem(array->contents);
array->end = 0;
array->element_size = element_size;
array->expand_rate = DEFAULT_EXPAND_RATE;
return array;
error:
if(array) free(array);
return NULL;
}
void DArray_clear(DArray *array)
{
int i = 0;
if(array->element_size > 0) {
for(i = 0; i < array->max; i++) {
if(array->contents[i] != NULL) {
free(array->contents[i]);
}
}
}
}
static inline int DArray_resize(DArray *array, size_t newsize)
{
array->max = newsize;
check(array->max > 0, "The newsize must be > 0.");
void *contents = realloc(array->contents, array->max * sizeof(void *));
// check contents and assume realloc doesn't harm the original on error
check_mem(contents);
array->contents = contents;
return 0;
error:
return -1;
}
int DArray_expand(DArray *array)
{
size_t old_max = array->max;
check(DArray_resize(array, array->max + array->expand_rate) == 0,
"Failed to expand array to new size: %d",
array->max + (int)array->expand_rate);
memset(array->contents + old_max, 0, array->expand_rate + 1);
return 0;
error:
return -1;
}
int DArray_contract(DArray *array)
{
int new_size = array->end < (int)array->expand_rate ? (int)array->expand_rate : array->end;
return DArray_resize(array, new_size + 1);
}
void DArray_destroy(DArray *array)
{
if(array) {
if(array->contents) free(array->contents);
free(array);
}
}
void DArray_clear_destroy(DArray *array)
{
DArray_clear(array);
DArray_destroy(array);
}
int DArray_push(DArray *array, void *el)
{
array->contents[array->end] = el;
array->end++;
if(DArray_end(array) >= DArray_max(array)) {
return DArray_expand(array);
} else {
return 0;
}
}
void *DArray_pop(DArray *array)
{
check(array->end - 1 >= 0, "Attempt to pop from empty array.");
void *el = DArray_remove(array, array->end - 1);
array->end--;
if(DArray_end(array) > (int)array->expand_rate && DArray_end(array) % array->expand_rate) {
DArray_contract(array);
}
return el;
error:
return NULL;
}
This shows you another way to tackle complex code. Instead of diving right into the .c
implementation, look at the header file, then read the unit test. This gives you an "abstract to concrete" understanding how the pieces work together and making it easier to remember.
Advantages And Disadvantages
A DArray
is better when you need to optimize these operations:
- Iteration. You can just use a basic for-loop and
DArray_count
withDArray_get
and you're done. No special macros needed, and it's faster because you aren't walking pointers. - Indexing. You can use
DArray_get
andDArray_set
to access any element at random, but with aList
you have to go through N elements to get to N+1. - Destroying. You just free the struct and the
contents
in two operations. AList
requires a series offree
calls and also walking every element. - Cloning. You can also clone it in just two operations (plus whatever it's storing) by copying the struct and
contents
. A list again requires walking the whole thing and copying everyListNode
plus its value. - Sorting. As you saw,
List
is horrible if you need to keep the data sorted. ADArray
opens up a whole class of great sorting algorithms because now you can access elements randomly. - Large Data. If you need to keep around a lot of data, then a
DArray
wins since it's basecontents
takes up less memory than the same number ofListNode
structs.
The List
however wins on these operations:
- Insert and remove on the front (what I called shift). A
DArray
needs special treatment to be able to do this efficiently, and usually has to do some copying. - Splitting or joining. A
List
can just copy some pointers and it's done, but with aDArray
you have to do copying of the arrays involved. - Small Data. If you only need to store a few elements, then typically the storage will be less in a
List
than a genericDArray
because theDArray
needs to expand the backing store to accommodate future inserts, but aList
only makes what it needs.
With this, I prefer to use a DArray
for most of the things you see other people use a List
. I reserve using List
for any data structure that requires small number of nodes that are inserted and removed from either end. I'll show you two similar data structures called a Stack
and Queue
where this is important.
How To Improve It
As usual, go through each function and operation and add the defensive programming checks, pre-conditions, invariants, and anything else you can find to make the implementation more bulletproof.
Extra Credit
- Improve the unit tests to cover more of the operations and test that using a for-loop to iterate works.
- Research what it would take to implement bubble sort and merge sort for DArray, but don't do it yet. I'll be implementing DArray algorithms next and you'll do this then.
- Write some performance tests for common operations and compare them to the same operations in
List
. You did some of this, but this time, write a unit test that repeatedly does the operation in question, then in the main runner do the timing. - Look at how the
DArray_expand
is implemented using a constant increase (size + 300). Typically dynamic arrays are implemented with a multiplicative increase (size * 2), but I've found this to cost needless memory for no real performance gain. Test my assertion and see when you'd want a multiplied increase instead of a constant increase.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论