Some questions about a single-element array in a typedef

Posted 2024-10-15 09:15:12

I was reading some code that works with arbitrary-length integers through the GNU Multi-Precision (GMP) library. The type for an MP integer is mpz_t, as defined in the gmp.h header file.

I have some questions about the lower-level definition of this mpz_t type. In the header:

/* THIS IS FROM THE GNU MP LIBRARY gmp.h HEADER FILE */
typedef struct
{
    /* SOME OTHER STUFF HERE */
} __mpz_struct;

typedef __mpz_struct mpz_t[1];

First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining mpz_t as an array of __mpz_struct with exactly one element?

Second question: Why the array? (And why only one element?) Is this one of those struct hacks I've heard about?

Third question (perhaps indirectly related to the second): The GMP documentation for the mpz_init_set_ui(mpz_t, unsigned long int) function shows the mpz_t being passed with plain pass-by-value syntax, although the function must modify the object it is given (and so would seem to need pass-by-reference syntax). Refer to my code:

/* FROM MY CODE */
mpz_t fact_val;                /* declaration */
mpz_init_set_ui(fact_val, 1);  /* Initialize fact_val */

Does the single-element array enable pass-by-reference automatically (because arrays decay to pointers in C)? I freely admit I'm kinda over-analyzing this, but I'd certainly love any discussion on this. Thanks!

铜锣湾横着走 2024-10-22 09:15:12

This does not appear to be a struct hack in the sense described on C2. It appears that they want mpz_t to have pointer semantics (presumably, they want people to use it like an opaque pointer). Consider the syntactic difference between the following snippets:

__mpz_struct data[1];   /* __mpz_struct is a typedef name, so no struct keyword */

(&data[0])->access = 1;
gmp_func(data, ...);

And

mpz_t data;

data->access = 1;
gmp_func(data, ...);

Because C arrays decay into pointers, this also allows for automatic pass by reference for the mpz_t type.

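It also allows you to use a pointer-like type without needing to malloc or free the struct yourself: the struct lives wherever you declare the mpz_t, while mpz_init and mpz_clear still manage the storage for the digits.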
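The effect described above can be reproduced without GMP. The following is a minimal sketch, assuming a hypothetical stand-in struct (num_struct, num_t, num_set and num_get are illustrative names, not part of gmp.h), that shows a one-element array typedef being modified through an ordinary-looking call:

```c
/* Hypothetical stand-in for __mpz_struct; the field names are illustrative. */
typedef struct {
    int size;
    long value;
} num_struct;

/* Same shape as the gmp.h typedef: an array of exactly one struct. */
typedef num_struct num_t[1];

/* A parameter declared with array type is adjusted to a pointer
   (num_struct *), so these writes land in the caller's object. */
void num_set(num_t x, long v)
{
    x->value = v;
    x->size = (v != 0);
}

/* Read-only access takes a pointer to const. */
long num_get(const num_struct *x)
{
    return x->value;
}
```

A caller writes `num_t a; num_set(a, 42);` with no `&` and no allocation, and `num_get(a)` then returns 42. The object `a` occupies exactly `sizeof(num_struct)` bytes in the caller's scope.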

作业与我同在 2024-10-22 09:15:12

"First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?"

Yes.

"Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?"

Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation looking like

malloc(sizeof(struct whatever) + sizeof(mpz_t) * some_number)

would be a dead giveaway.
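For comparison, here is a minimal sketch of what the struct hack itself usually looks like, written with a C99 flexible array member. The names struct packet and packet_new are hypothetical and only illustrate the pattern; nothing here is taken from GMP:

```c
#include <stdlib.h>

/* Hypothetical record whose last member is a variable-length payload. */
struct packet {
    size_t count;
    int items[];   /* C99 flexible array member; older code wrote items[1] */
};

/* Allocate the header and `count` trailing items in a single block. */
struct packet *packet_new(size_t count)
{
    struct packet *p = malloc(sizeof *p + count * sizeof p->items[0]);
    if (p == NULL)
        return NULL;
    p->count = count;
    for (size_t i = 0; i < count; i++)
        p->items[i] = 0;
    return p;
}
```

The telltale signs are the array sitting last inside a struct and the over-sized malloc. A standalone typedef of a one-element array has neither.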

"Does the single-occurrence array enable pass-by-reference automatically...?"

Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference, at the expense of slightly more awkward access to the object itself.

I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.

千年*琉璃梦 2024-10-22 09:15:12

The reason for this comes from the implementation of mpn, GMP's low-level layer. Specifically, if you're mathematically inclined you'll realise N is the set of natural numbers (1,2,3,4...) whereas Z is the set of integers (...,-2,-1,0,1,2,...), which is where the mpn and mpz prefixes come from.

Implementing a bignum library for Z is equivalent to doing so for N and taking into account some special rules for sign operations, i.e. keeping track of whether you need to do an addition or a subtraction and what the result is.
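As a rough illustration of "N plus sign rules", the sketch below adds two signed values using only unsigned arithmetic on their magnitudes. It uses a single 64-bit magnitude instead of a real bignum, and signed_num and signed_add are hypothetical names, not GMP's representation:

```c
#include <stdint.h>

/* Hypothetical sign-and-magnitude value; a real bignum would hold
   an array of limbs instead of one 64-bit magnitude. */
typedef struct {
    int negative;        /* 1 if the value is below zero */
    uint64_t magnitude;  /* the natural-number part */
} signed_num;

signed_num signed_add(signed_num a, signed_num b)
{
    signed_num r;
    if (a.negative == b.negative) {
        /* Same sign: add the magnitudes, keep the sign. */
        r.negative = a.negative;
        r.magnitude = a.magnitude + b.magnitude;  /* overflow ignored here */
    } else if (a.magnitude >= b.magnitude) {
        /* Opposite signs: subtract the smaller magnitude from the larger. */
        r.negative = a.negative;
        r.magnitude = a.magnitude - b.magnitude;
    } else {
        r.negative = b.negative;
        r.magnitude = b.magnitude - a.magnitude;
    }
    if (r.magnitude == 0)
        r.negative = 0;  /* normalise zero to non-negative */
    return r;
}
```

The signed layer only decides which unsigned operation to run and which sign to keep; all of the arithmetic happens on natural numbers.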

Now, as for how a bignum library is implemented... here are two lines to give you a clue (the exact limb type depends on the platform and build):

typedef unsigned int        mp_limb_t;
typedef mp_limb_t *     mp_ptr;

And now let's look at a function signature operating on that:

__GMP_DECLSPEC mp_limb_t mpn_add __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr,mp_size_t));

Basically, what it comes down to is that a "limb" is a machine-sized unsigned integer holding one chunk of a number's bits, and the whole number is represented as an array of limbs. The clever part is that gmp does all this in a very efficient, well optimised manner.
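To make the limb idea concrete, here is a minimal sketch of adding two equal-length limb arrays with carry propagation. It assumes 32-bit limbs stored least significant first; limb_t and limbs_add_n are illustrative names, and this is not GMP's optimised code:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* assumed limb width for this sketch */

/* Compute rp = ap + bp over n limbs, least significant limb first.
   Returns the carry out of the top limb (0 or 1). The inputs are
   pointers to const, so the source numbers cannot be modified. */
limb_t limbs_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = (uint64_t)ap[i] + bp[i] + carry;
        rp[i] = (limb_t)sum;          /* low 32 bits stay in this limb */
        carry = (limb_t)(sum >> 32);  /* high bit moves to the next limb */
    }
    return carry;
}
```

Both the output and the inputs are passed as pointers, which is the only way to hand an array to a function in C.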

Anyway, back to the discussion. Basically, the only way to pass arrays around in C is, as you know, to pass pointers to those arrays, which effectively enables pass by reference. Now, in order to keep track of what's going on, two types are defined: mp_ptr, which points to an array of mp_limb_t big enough to store your number, and mp_srcptr, which is a const version of that, so that you cannot accidentally alter the bits of the source bignums you are operating on. The basic idea is that most of the functions follow this pattern: