Portable user-defined character classification via a lookup table in C89: would you do it this way?

Posted 2024-08-21 00:38:00

static const int class[UCHAR_MAX] =
{ [(unsigned char)'a'] = LOWER, /* macro value classifying the character */
  [(unsigned char)'b'] = LOWER,
  /* ... */
};

This is just an idea. Is it a bad one?

Comments (4)

孤独难免 2024-08-28 00:38:00

Designated initializers were introduced in C99; they are not part of C89. GCC also accepts them as an extension in C89 mode, but code that relies on that is not portable.

Other than that, a lookup table is a common way to classify a small set of values quickly.

Edit: One correction, though: the size of the array should be UCHAR_MAX+1, since the valid indices run from 0 through UCHAR_MAX inclusive.
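One way to get the same table using only C89 features is to fill it at run time instead of with designated initializers. The following is a minimal sketch under stated assumptions, not the asker's code: the names char_class, mark_chars, init_char_class and classify are hypothetical, the class codes are placeholders for the macros the question does not show, and the table gives up const so that it can be filled at startup.

```c
#include <limits.h>  /* UCHAR_MAX */

/* Hypothetical class codes; the question's real macro values are not shown. */
enum { UNCLASSIFIED = 0, LOWER = 1, UPPER = 2, DIGIT = 3 };

/* UCHAR_MAX + 1 entries, so every unsigned char value is a valid index.
   Objects with static storage start out zeroed, i.e. UNCLASSIFIED. */
static unsigned char char_class[UCHAR_MAX + 1];

/* Mark every character of the string s as belonging to class cls. */
static void mark_chars(const char *s, int cls)
{
    for (; *s != '\0'; ++s)
        char_class[(unsigned char)*s] = (unsigned char)cls;
}

/* Fill the table once at startup. Listing the characters explicitly
   avoids assuming that the letters are contiguous in the character set. */
static void init_char_class(void)
{
    mark_chars("abcdefghijklmnopqrstuvwxyz", LOWER);
    mark_chars("ABCDEFGHIJKLMNOPQRSTUVWXYZ", UPPER);
    mark_chars("0123456789", DIGIT);
}

/* Look up the class of c; the cast keeps a negative plain char in range. */
static int classify(int c)
{
    return char_class[(unsigned char)c];
}
```

Call init_char_class() once before the first lookup; after that, classify('a') yields LOWER and classify(' ') yields UNCLASSIFIED. If the table must stay const, the C89 alternative is a plain positional initializer with all UCHAR_MAX+1 values written out in order, which then depends on the target character set.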

美胚控场 2024-08-28 00:38:00

BTW, GCC's designated initializer extensions allow for

static const int class[] = {
    [0 ... UCHAR_MAX] = UNCLASSIFIED,
    [(unsigned)'0' ... (unsigned)'9'] = DIGIT,
    [(unsigned)'A' ... (unsigned)'Z'] = UPPER,
    [(unsigned)'a' ... (unsigned)'z'] = LOWER,
};

that is, initializers that apply to ranges of indices, with later initializers overriding earlier ones.

Very non-standard, though; this is in neither C89/C90 nor C99.

他是夢罘是命 2024-08-28 00:38:00

Unfortunately, that is not portable in C89/90.

$ gcc -std=c89 -pedantic test.c -o test
test.c:4: warning: ISO C90 forbids specifying subobject to initialize
test.c:5: warning: ISO C90 forbids specifying subobject to initialize

人间不值得 2024-08-28 00:38:00

Aside from using int rather than unsigned char for the element type (and thereby wasting 768 bytes, assuming a 4-byte int and 256 entries), I consider this a very good idea/implementation. Keep in mind that it depends on C99 features, so it won't work with old C89/C90 compilers.

On the other hand, simple conditionals should be about the same speed and much smaller in code size, but they can only represent certain natural classes efficiently.

#define is_ascii_letter(x) (((unsigned)(x)|32)-97<26)
#define is_digit(x) ((unsigned)(x)-'0'<10)

etc.
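To make the trick in those macros easier to follow, here is the same logic as a sketch written out as functions. The function names are hypothetical, and the letter test assumes an ASCII-compatible character set, as the macro name already suggests. Writing them as functions also avoids evaluating a macro argument in an unexpected context.

```c
/* Digit test. The C standard guarantees that '0'..'9' are contiguous.
   If c is below '0', the unsigned subtraction wraps around to a very
   large value, so a single comparison checks both bounds at once. */
static int is_digit_char(int c)
{
    return (unsigned)c - (unsigned)'0' < 10u;
}

/* Letter test (ASCII assumed). OR-ing with 32 sets the bit that
   distinguishes lower case from upper case, folding 'A'..'Z' (65..90)
   onto 'a'..'z' (97..122); the wraparound comparison then checks the
   range 97..122 in one step. */
static int is_ascii_letter_char(int c)
{
    return ((unsigned)c | 32u) - 97u < 26u;
}
```

Characters just outside each range, such as '/' and ':' for digits or '@', '[', '`' and '{' for letters, are rejected, as is a negative value such as EOF.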
