linked-list generic-programming c containers

“多用途” 纯C 中的链表实现

发布于 2024-07-17 06:57:16 字数 931 浏览 9 评论 0原文

这不完全是一个技术问题，因为我对 C 的了解足以完成我需要做的事情（我的意思是，不“让语言妨碍你”），所以这个问题基本上是“什么方向”接受'问题。

情况是：我目前正在学习高级算法课程，为了“成长为程序员”，我被要求使用纯C来实现实际作业（效果很好：几乎任何你犯的小错误实际上都会迫使我使用纯C语言来实现实际作业）您完全了解自己正在做什么以解决问题）。在实现过程中，我显然遇到了必须从头开始实现“基本”数据结构的问题：实际上不仅是链表，还包括堆栈、树等。

我重点关注本主题中的列表，因为它通常是我最终在程序中大量使用的结构，无论是作为“主”结构还是作为其他较大结构的“辅助”结构（例如，解析的哈希树）使用链表会发生冲突）。

这要求列表存储许多不同类型的元素。我在这里假设我不想为每种类型重新编码列表。所以，我可以想出这些替代方案：

制作一个 void 指针列表（有点不雅；更难调试）
只制作一个列表，但有一个 union 作为“元素类型”，包含所有元素类型我将在程序中使用（更容易调试；如果元素大小不一样则浪费空间）
使用预处理器宏为每种类型重新生成代码，风格为 SGLIB，“模仿”C++ 的 STL（创造性的解决方案；不浪费空间；元素在返回时具有它们实际的显式类型；列表中的任何更改代码可以非常戏剧化）
你的想法/解决方案

为了明确问题：上面哪一个最好？

PS：由于我基本上处于学术背景，所以我也很对业内使用纯 C 的人的观点感兴趣。据我了解，大多数纯 C 程序员都在嵌入式设备领域，我认为我面临的这种问题并不常见。然而，如果有人知道“在现实世界中”是如何做到的，我会对你的意见非常感兴趣。

原文

This is not exactly a technical question, since I know C kind of enough to do the things I need to (I mean, in terms of not 'letting the language get in your way'), so this question is basically a 'what direction to take' question.

Situation is: I am currently taking an advanced algorithms course, and for the sake of 'growing up as programmers', I am required to use pure C to implement the practical assignments (it works well: pretty much any small mistake you make actually forces you to understand completely what you're doing in order to fix it). In the course of implementing, I obviously run into the problem of having to implement the 'basic' data structures from the ground up: actually not only linked lists, but also stacks, trees, et cetera.

I am focusing on lists in this topic because it's typically a structure I end up using a lot in the program, either as a 'main' structure or as a 'helper' structure for other bigger ones (for example, a hash tree that resolves conflicts by using a linked list).

This requires that the list stores elements of lots of different types. I am assuming here as a premise that I don't want to re-code the list for every type. So, I can come up with these alternatives:

Making a list of void pointers (kinda inelegant; harder to debug)
Making only one list, but having a union as 'element type', containing all element types I will use in the program (easier to debug; wastes space if elements are not all the same size)
Using a preprocessor macro to regenerate the code for every type, in the style of SGLIB, 'imitating' C++'s STL (creative solution; doesn't waste space; elements have the explicit type they actually are when they are returned; any change in list code can be really dramatic)
Your idea/solution

To make the question clear: which one of the above is best?

PS: Since I am basically in an academic context, I am also very interested in the view of people working with pure C out there in the industry. I understand that most pure C programmers are in the embedded devices area, where I don't think this kind of problem I am facing is common. However, if anyone out there knows how it's done 'in the real world', I would be very interested in your opinion.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

北斗星光 2024-07-24 06:57:16

void * 在链表中有点麻烦，因为您必须单独管理它对列表本身的分配。我过去使用的一种方法是拥有一个“可变大小”结构，例如：

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char payload[1];  // or use different type for alignment.
} tNode;

现在我意识到这看起来不像可变大小，但让我们这样分配一个结构：

typedef struct {
    char Name[30];
    char Addr[50];
} tPerson;
tNode *node = malloc (sizeof (tNode) - 1 + sizeof (tPerson));

现在你有一个节点，无论出于何种意图和目的，看起来像这样：

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char Name[30];
    char Addr[50];
} tNode;

或者，以图形形式（其中 [n] 表示 n 字节）：

+----------------+
|    prev[4]     |
+----------------+
|    next[4]     |
+----------------+
| payloadType[4] |                
+----------------+                +----------+
|   payload[1]   | <- overlap ->  | Name[30] |
+----------------+                +----------+
                                  | Addr[50] |
                                  +----------+

也就是说，假设您知道如何解决有效负载正确。这可以按如下方式完成：

node->prev = NULL;
node->next = NULL;
node->payloadType = PLTYP_PERSON;
tPerson *person = &(node->payload); // cast for easy changes to payload.
strcpy (person->Name, "Bob Smith");
strcpy (person->Addr, "7 Station St");

该转换行只是将 payload 字符（在 tNode 类型中）的地址转换为实际 tPerson< 的地址。 /code> 有效负载类型。

使用此方法，您可以在节点中携带所需的任何有效负载类型，甚至每个节点中可以携带不同的有效负载类型，而不会浪费联合空间。这种浪费可以通过以下方式看出：

union {
    int x;
    char y[100];
} u;

每次在列表中存储整数类型（对于 4 字节整数）时，都会浪费 96 个字节。

tNode 中的有效负载类型使您可以轻松检测该节点携带的有效负载类型，以便您的代码可以决定如何处理它。您可以使用以下内容：

#define PAYLOAD_UNKNOWN     0
#define PAYLOAD_MANAGER     1
#define PAYLOAD_EMPLOYEE    2
#define PAYLOAD_CONTRACTOR  3

或（可能更好）：

typedef enum {
    PAYLOAD_UNKNOWN,
    PAYLOAD_MANAGER,
    PAYLOAD_EMPLOYEE,
    PAYLOAD_CONTRACTOR
} tPayLoad;

A void * is a bit of a pain in a linked list since you have to manage it's allocation separately to the list itself. One approach I've used in the past is to have a 'variable sized' structure like:

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char payload[1];  // or use different type for alignment.
} tNode;

Now I realize that doesn't look variable sized but let's allocate a structure thus:

typedef struct {
    char Name[30];
    char Addr[50];
} tPerson;
tNode *node = malloc (sizeof (tNode) - 1 + sizeof (tPerson));

Now you have a node that, for all intents and purposes, looks like this:

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char Name[30];
    char Addr[50];
} tNode;

or, in graphical form (where [n] means n bytes):

+----------------+
|    prev[4]     |
+----------------+
|    next[4]     |
+----------------+
| payloadType[4] |                
+----------------+                +----------+
|   payload[1]   | <- overlap ->  | Name[30] |
+----------------+                +----------+
                                  | Addr[50] |
                                  +----------+

That is, assuming you know how to address the payload correctly. This can be done as follows:

node->prev = NULL;
node->next = NULL;
node->payloadType = PLTYP_PERSON;
tPerson *person = &(node->payload); // cast for easy changes to payload.
strcpy (person->Name, "Bob Smith");
strcpy (person->Addr, "7 Station St");

That cast line simply casts the address of the payload character (in the tNode type) to be an address of the actual tPerson payload type.

Using this method, you can carry any payload type you want in a node, even different payload types in each node, without the wasted space of a union. This wastage can be seen with the following:

union {
    int x;
    char y[100];
} u;

where 96 bytes are wasted every time you store an integer type in the list (for a 4-byte integer).

The payload type in the tNode allows you to easily detect what type of payload this node is carrying, so your code can decide how to process it. You can use something along the lines of:

#define PAYLOAD_UNKNOWN     0
#define PAYLOAD_MANAGER     1
#define PAYLOAD_EMPLOYEE    2
#define PAYLOAD_CONTRACTOR  3

or (probably better):

typedef enum {
    PAYLOAD_UNKNOWN,
    PAYLOAD_MANAGER,
    PAYLOAD_EMPLOYEE,
    PAYLOAD_CONTRACTOR
} tPayLoad;

回复收藏 0 原文

携余温的黄昏 2024-07-24 06:57:16

我的 $.002：

制作一个 void 指针列表（有点不雅；更难调试）

如果您必须用 C 语言编写，恕我直言，这不是一个糟糕的选择。您可以添加 API 方法以允许应用程序提供打印() 方法以便于调试。当（例如）项目被添加到列表或从列表中删除时，可以调用类似的方法。（对于链表，这通常是不必要的，但对于更复杂的数据结构——例如哈希表）——它有时可以成为救星。）

只创建一个列表，但有一个联合作为“元素类型”，包含我将在程序中使用的所有元素类型（更容易调试；如果元素大小不全则浪费空间）

我会像瘟疫一样避免这种情况。（嗯，你确实问过。）从数据结构到其包含的类型的手动配置的编译时依赖是最糟糕的。再说一遍，恕我直言。

使用预处理器宏为每种类型重新生成代码，采用 SGLIB (sglib.sourceforge.net) 的风格，“模仿”C++ 的 STL（创造性解决方案；不浪费空间；元素具有它们实际存在的显式类型）返回；列表代码中的任何更改都可能非常引人注目）

有趣的想法，但由于我不了解 SGLIB，因此我不能说更多。

你的想法/解决方案

我会选择第一个选择。

回复收藏 0 原文

萌逼全场 2024-07-24 06:57:16

我过去在我们的代码（现已转换为 C++）中这样做过，当时决定采用 void* 方法。我这样做只是为了灵活性 - 无论如何，我们几乎总是在列表中存储指针，并且解决方案的简单性和它的可用性（对我来说）超过了其他方法的缺点。

话虽这么说，有一次它导致了一些难以调试的令人讨厌的错误，所以它绝对不是一个完美的解决方案。不过，如果我现在再做一次，我想我仍然会选择它。

回复收藏 0 原文

风为裳 2024-07-24 06:57:16

使用预处理器宏是最好的选择。 Linux 内核链表是循环链接的一个优秀且高效的实现C 中的列表。非常便携且易于使用。这里 Linux 内核 2.6.29 的独立版本 list.h 标头。

FreeBSD/OpenBSD sys/queue 是另一个不错的选择基于通用宏的链表

回复收藏 0 原文

梦屿孤独相伴 2024-07-24 06:57:16

我已经很多年没有编写 C 代码了，但是 GLib 声称提供“一大套实用程序”字符串和常见数据结构的函数”，其中包括链表。

回复收藏 0 原文

〆一缕阳光ご 2024-07-24 06:57:16

尽管人们很容易想到使用另一种语言的技术（例如泛型）来解决此类问题，但实际上这很少是成功的。可能有一些固定的解决方案在大多数情况下都是正确的（并在他们出错时在他们的文档中告诉你），使用它可能会错过作业的重点，所以我会三思而后行。对于极少数情况，自行推出可能是可行的，但对于任何合理规模的项目，它可能不值得进行调试工作。

相反，当使用 x 语言编程时，您应该使用 x 语言的习惯用法。使用Python时不要编写java。使用scheme时不要写C。使用 C99 时不要编写 C++。

我自己，我可能最终会使用类似 Pax 的建议，但实际上使用 char[1] 和 void* 和 int 的联合，以使常见情况变得方便（以及枚举类型标志）

（我也可能会结束实现斐波那契树，只是因为这听起来很简洁，而且你只能实现 RB 树很多次，然后它就会失去它的味道，即使这对于它的常见情况更好。）

编辑： 根据您的评论，看起来您有一个使用罐装解决方案的很好的案例。如果您的老师允许，并且它提供的语法感觉很舒服，请尝试一下。

回复收藏 0 原文

舂唻埖巳落 2024-07-24 06:57:16

这是一个好问题。我喜欢两种解决方案：

Dave Hanson 的 C 接口和实现使用void * 指针列表，这对我来说已经足够了。
对于我的学生，我编写了一个awk 脚本来生成特定于类型的列表函数。与预处理器宏相比，它需要额外的构建步骤，但系统的操作对于没有太多经验的程序员来说要透明得多。这确实有助于论证参数多态性，他们稍后会在课程中看到这一点。
这是一组函数的样子：
```
int lengthEL (Explist *l); 
  Exp* nthEL (Explist *l, 无符号n); 
  Explist *mkEL (Exp *hd, Explist *tl); 
  
```
awk 脚本有 150 行，令人恐怖；它在 C 代码中搜索 typedef 并为每个类型生成一组列表函数。它很旧了；我现在可能可以做得更好:-)

我不会在一天中的某个时间（或硬盘上的空间）提供工会列表。它不安全，并且不可扩展，因此您最好只使用 void * 并完成它。

This is a good problem. There are two solutions I like:

Dave Hanson's C Interfaces and Implementations uses a list of void * pointers, which is good enough for me.
For my students, I wrote an awk script to generate type-specific list functions. Compared to preprocessor macros, it requires an extra build step, but the operation of the system is much more transparent to programmers without a lot of experience. And it really helps make the case for parametric polymorphism, which they see later in their curriculum.
Here's what one set of functions looks like:
```
int      lengthEL (Explist *l);
Exp*     nthEL    (Explist *l, unsigned n);
Explist *mkEL     (Exp *hd, Explist *tl);
```
The awk script is a 150-line horror; it searches C code for typedefs and generates a set of list functions for each one. It's very old; I could probably do better now :-)

I wouldn't give a list of unions the time of day (or space on my hard drive). It's not safe, and it's not extensible, so you may as well just use void * and be done with it.

回复收藏 0 原文