Naming one-off identifiers with global scope

Posted 2024-09-13 17:49:11

Sometimes, when using macros to generate code, it is necessary to create identifiers that have global scope but which aren't really useful for anything outside the immediate context where they are created. For example, suppose it's necessary to compile-time-allocate an array or other indexed resource into chunks of various sizes.

/* Produce an enumeration of some story-book characters, and allocate some
   arbitrary index resource to them.  Both enumeration and resource indices
   will start at zero.

   For each name, defines HPID_xxxx to be the enumeration of that name.
   Also defines HP_ID_COUNT to be the total number of names, and
   HP_TOTAL_SIZE to be the total resource requirement, and creates an
   array hp_starts[HP_ID_COUNT+1].  Each character n is allocated resources
   from hp_starts[n] through (but not including) hp_starts[n+1].
*/

/* Give the names and their respective lengths */
#define HP_LIST \
  HP_ITEM(FRED, 4) \
  HP_ITEM(GEORGE, 6) \
  HP_ITEM(HARRY, 5) \
  HP_ITEM(RON, 3) \
  HP_ITEM(HERMIONE, 8) \
  /* BLANK LINE REQUIRED TO ABSORB LAST BACKSLASH */

#define HP_ITEM(name, length) HPID_##name,
typedef enum { HP_LIST HP_ID_COUNT} HP_ID;
#undef HP_ITEM

#define HP_ITEM(name, length) ZZQ_##name}; enum {ZZQX_##name=ZZQ_##name+(length)-1,
enum { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#define HP_ITEM(name, length) ZZQ_##name,
const unsigned char hp_starts[] = { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#include <stdio.h>

int main(void)
{
  int i;

  printf("ID count=%d Total size=%d\n",HP_ID_COUNT,HP_TOTAL_SIZE);
  for (i=0; HP_ID_COUNT > i; i++) /* Reverse conditional to avoid lt sign */
    printf("  %2d=%3d/%3d\n", i, hp_starts[i], hp_starts[i+1]-hp_starts[i]);
  printf("IDs are: \n");
#define HP_ITEM(name, length) printf("  %2d=%s\n",HPID_##name, #name);
  HP_LIST
#undef HP_ITEM

  return 0;
}

Is there any normal convention for naming such identifiers to minimize the likelihood of conflicts, and also to minimize any confusion they might generate? In the above scenario, identifiers ZZQ_xxx will be the same as hp_starts[HPID_xxx], and might in some contexts be useful, though their primary purpose is to build the array and serve as placeholders in computing other ZZQ values and HP_TOTAL_SIZE. Identifiers ZZQX_xxx are useless, however; their sole purpose is to serve as placeholders when setting the enumeration values for the succeeding items. Is there any good way to name such things?

Incidentally, I develop for small microcontrollers where RAM is at a greater premium than code space. Code is simulated by compiling on Microsoft VC++, but for production is compiled using a cross-compiler in straight C; code must thus compile in both C and C++.

Are there any other preprocessor tricks people can recommend for similar tasks?


苦行僧 2024-09-20 17:49:11

"Is there any normal convention for naming such identifiers to minimize the likelihood of conflicts, and also to minimize any confusion they might generate?"

It all boils down to the prefix you want to use. Ideally, one would want all the symbols to be easily associated with the list (HP_LIST) they are related to.

So why not put the symbols under the same HP_ prefix? E.g. prefix HP__ZZQX_, to differentiate between the useful and the useless symbols.

N.B. I have worked on a project where one of the shared libraries was already using the zzqx_ prefix internally; it always showed up at the end of the application's symbol table. In the race for unlikely-to-be-used names, apparently many people take the same route (the end of the Latin alphabet) and end up with precisely the same names, the opposite of the desired result. That is why I think that namespaces (or, in C, symbol prefixes) should not be hidden or buried in the defines, but rather defined explicitly (e.g. easy to find and extract).

And as something concrete, here is your source enhanced with the hack around ## to generate the names using the prefix given as a preprocessor define:

/* the hack is needed to force the LIST_NAME to be expanded. 
   automatically adds underscores. yes, it's ugly */
#define LIST_SYMBOL_1(n1,n2,n3) n1##_##n2##_##n3
#define LIST_SYMBOL_0(n1,n2,n3) LIST_SYMBOL_1(n1,n2,n3)
#define LIST_SYMBOL(pref,name)  LIST_SYMBOL_0(LIST_NAME,pref,name)

/* give the name to the list. used by the LIST_SYMBOL(). */
#define LIST_NAME   HP

/* Give the names and their respective lengths */
#define HP_LIST \
  HP_ITEM(FRED, 4) \
  HP_ITEM(GEORGE, 6) \
  HP_ITEM(HARRY, 5) \
  HP_ITEM(RON, 3) \
  HP_ITEM(HERMIONE, 8) \
  /* BLANK LINE REQUIRED TO ABSORB LAST BACKSLASH */

#define HP_ITEM(name, length) HPID_##name,
typedef enum { HP_LIST HP_ID_COUNT} HP_ID;
#undef HP_ITEM

#define HP_ITEM(name, length)   LIST_SYMBOL(ZZQ,name)}; \
    enum {LIST_SYMBOL(ZZQX,name)=LIST_SYMBOL(ZZQ,name)+(length)-1,
enum { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#define HP_ITEM(name, length) LIST_SYMBOL(ZZQ,name),
const unsigned char hp_starts[] = { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#include <stdio.h>

int main(void)
{
  int i;
  printf("ID count=%d Total size=%d\n",HP_ID_COUNT,HP_TOTAL_SIZE);
  for (i=0; i<HP_ID_COUNT ; i++) /* bring the < back, SO is smart enough */
    printf("  %2d=%3d/%3d\n", i, hp_starts[i], hp_starts[i+1]-hp_starts[i]);
  printf("IDs are: \n");
#define HP_ITEM(name, length) printf("  %2d=%s\n",HPID_##name, #name);
  HP_LIST
#undef HP_ITEM
  return 0;
}

Edit 1. My preferred approach is to put the data into a proper text file, e.g.:

FRED
GEORGE
HARRY
RON
HERMIONE

(note that you do not need length anymore) and write a script (or even a trivial C program) to generate source code from the text file, creating the necessary header (with the enum + declaration of the data) and source file (with the data). Modify the Makefile to run the script before compiling any sources and add the generated source files to the list of compiled sources.

That has the HUGE advantage that the generated code is plain code and can be indexed as such (unless you love the fun of "where did that darn id come from?"). The internal constants simply do not appear in the source code anymore, since the script handles them. And no more fugly preprocessor magic.
