Unions and Type** with the Haskell FFI?

Posted on 2024-11-17 18:38:43

I need to know how to handle unions and Type** (e.g. int**) with the FFI.
I know that I need a Storable instance for structs; can I use one for unions too?

Take a union like this:

typedef union {
     int i;
     char c;
} my_union;

This would typically be represented in Haskell as:

data MyUnion = I CInt | C CChar

My question is how you would marshal MyUnion into my_union, i.e. define
a Storable instance for it. It's my understanding that a my_union value
takes up sizeof(int) bytes in memory, i.e. the size of its largest
member. So to store this we would write something along the lines of:

instance Storable MyUnion where
     sizeOf _ = #{size my_union} -- <-- hsc2hs shortcut
     alignment _ = alignment (undefined :: CInt) -- <-- What should this really be?
     peek ptr = do -- <-- How are you supposed to know which element to extract?
     poke ptr (I i) = poke ptr i -- <-- Or should this be #{poke my_union, i} ptr i ?
     poke ptr (C c) = poke ptr c

Also, how can you represent an int** with the FFI?
When I have a function like int foo(int i1, int* i2);
the signature would be: foo :: CInt -> Ptr CInt -> CInt

But what if there is: int foo(int i1, int** i2);

Comments (3)

薆情海 2024-11-24 18:38:43

Even in C you wouldn't know which member to use (unless it is clear from the context) if you were handed a:

typedef union {
     int i;
     char c;
} my_union;

The C solution is to add an extra member that carries the type.

typedef struct {
     int type;
     union {
          int i;
          char c;
     } my_union;
} my_tagged_union;
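
On the Haskell side, the tagged struct above could be marshalled roughly as follows. This is only a sketch under stated assumptions: the tag values 0 and 1 (tagI, tagC) are made up for illustration, and the offsets are computed from sizeOf CInt on the assumption that the union sits directly after the int tag with no extra padding. In real code hsc2hs (#{size ...}, #{alignment ...}, #{peek ...}, #{poke ...}) should supply the layout from the C header instead.

```haskell
module Main where

import Foreign.C.Types (CChar, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

data MyUnion = I CInt | C CChar
  deriving (Eq, Show)

-- Hypothetical tag values; the C side must agree on them.
tagI, tagC :: CInt
tagI = 0
tagC = 1

-- Assumed layout of my_tagged_union: tag at offset 0, union right after it.
intSize :: Int
intSize = sizeOf (undefined :: CInt)

instance Storable MyUnion where
  sizeOf _ = 2 * intSize
  alignment _ = alignment (undefined :: CInt)
  -- Read the tag first, then read the member it names.
  peek ptr = do
    tag <- peekByteOff ptr 0 :: IO CInt
    if tag == tagI
      then I <$> peekByteOff ptr intSize
      else if tag == tagC
        then C <$> peekByteOff ptr intSize
        else ioError (userError ("my_tagged_union: unknown tag " ++ show tag))
  -- Write the tag and the matching member together.
  poke ptr (I i) = pokeByteOff ptr 0 tagI >> pokeByteOff ptr intSize i
  poke ptr (C c) = pokeByteOff ptr 0 tagC >> pokeByteOff ptr intSize c

-- Write a value into temporary memory and read it back.
roundTrip :: MyUnion -> IO MyUnion
roundTrip v = alloca $ \p -> poke p v >> peek p

main :: IO ()
main = mapM_ (\v -> roundTrip v >>= print) [I 42, C 120]
```

With the tag in place, peek no longer has to guess which constructor to build, which is exactly what the untagged union cannot offer.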

芯好空 2024-11-24 18:38:43

C unions are not tagged unions; see the Wikipedia article on tagged unions. In Haskell, a MyUnion value takes up more memory than a single raw (unboxed) 64-bit int. In GHC it is a pointer to either a thunk or a value: a thunk while the lazy MyUnion has not been evaluated yet, a value once it has been, and the size of the pointed-to memory can vary (unlike a union in C). GHC also uses the normally-zero low bits of the 64-bit pointer as a tag recording whether the value is already known to be a C or an I, so the tag travels with the pointer.

A less lazy declaration in Haskell can be made with:

-- Constructor names are suffixed so both declarations can share one module.
data MyUnion1 = I1 !Int | C1 !Char
data MyUnion2 = I2 {-# UNPACK #-} !Int | C2 {-# UNPACK #-} !Char

Here the "!" is a strictness annotation: the field is never stored as an unevaluated thunk. The UNPACK pragma asks GHC to store the raw unboxed value directly in the constructor, next to the tag, instead of storing a pointer to the Int or Char. So MyUnion2 may take up less memory, and its fields are strict instead of lazy.

Also, I should emphasize that "char" in C is a single byte (whether it is signed is implementation-defined), while "Char" in Haskell is a full Unicode code point (value 0 to 1114111). To store a C "char" in Haskell you would use a CChar.

You have unions in use in C and need to serialize and deserialize them? Do you already have a binary format in use by C? If you need to invent a binary format, then you do need to design a tag to make Haskell happy. Your C example has no way to tell whether the value was "constructed" with an int or a char, while MyUnion in Haskell can tell whether the value was constructed by an I or a C.

The C type you wrote is also quite dangerous: if I write to the single-byte "char" and then read the multi-byte "int", the remaining bytes of the "int" hold unspecified values.
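
If the C side really is an untagged my_union, one option is to skip the Storable instance and let the caller say which member is live. The sketch below is illustrative only: CMyUnion, Member, pokeUnion, peekUnion and withUnion are hypothetical names, and the size and alignment are taken from CInt on the assumption that int is the widest member (hsc2hs's #{size my_union} would be the safer source).

```haskell
module Main where

import Foreign.C.Types (CChar, CInt)
import Foreign.Marshal.Alloc (allocaBytesAligned)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (Storable (..))

data MyUnion = I CInt | C CChar
  deriving (Eq, Show)

-- Opaque stand-in for the C type my_union (hypothetical name).
data CMyUnion

-- Assumption: int is the widest and most strictly aligned member.
unionSize, unionAlign :: Int
unionSize = sizeOf (undefined :: CInt)
unionAlign = alignment (undefined :: CInt)

-- The caller supplies the knowledge that C keeps "in context".
data Member = AsInt | AsChar

-- Every member of a union starts at offset 0, so a cast is enough.
pokeUnion :: Ptr CMyUnion -> MyUnion -> IO ()
pokeUnion p (I i) = poke (castPtr p) i
pokeUnion p (C c) = poke (castPtr p) c

peekUnion :: Member -> Ptr CMyUnion -> IO MyUnion
peekUnion AsInt p = I <$> peek (castPtr p)
peekUnion AsChar p = C <$> peek (castPtr p)

-- Allocate a temporary my_union, fill it, and hand it to an action.
withUnion :: MyUnion -> (Ptr CMyUnion -> IO a) -> IO a
withUnion v act =
  allocaBytesAligned unionSize unionAlign $ \p -> pokeUnion p v >> act p

main :: IO ()
main = do
  withUnion (I 7) (peekUnion AsInt) >>= print
  withUnion (C 65) (peekUnion AsChar) >>= print
```

Reading back with the wrong Member is the Haskell equivalent of the dangerous C access described above, so this only works when the surrounding protocol tells you which member was written.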

伊面 2024-11-24 18:38:43

You can get pointers to pointers easily (I use something similar to pass a (void*)&val parameter to a C library). In ghci:

> import Foreign
> a <- malloc :: IO (Ptr Int)
> dir_a <- malloc :: IO (Ptr (Ptr Int))
> poke dir_a a
> poke a 5

> b <- peek dir_a
> peek b
5
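
The same idea carries over to a foreign import: a C parameter of type T** becomes Ptr (Ptr T). Since the question's foo is not available to link against, the sketch below uses libc's strtol, whose second parameter is a char**, as a stand-in; parseLong is a hypothetical helper name. Under that pattern, the question's function would presumably be imported as c_foo :: CInt -> Ptr (Ptr CInt) -> IO CInt.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..), CLong (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, minusPtr)
import Foreign.Storable (peek)

-- C prototype: long strtol(const char *nptr, char **endptr, int base);
-- CString is Ptr CChar, so char** becomes Ptr CString.
foreign import ccall unsafe "stdlib.h strtol"
  c_strtol :: CString -> Ptr CString -> CInt -> IO CLong

-- Parse a leading base-10 integer and report how many bytes were consumed.
parseLong :: String -> IO (CLong, Int)
parseLong s =
  withCString s $ \cstr ->
    alloca $ \endPtr -> do        -- endPtr :: Ptr CString, i.e. char**
      n <- c_strtol cstr endPtr 10
      end <- peek endPtr          -- follow one level of indirection: char*
      pure (n, end `minusPtr` cstr)

main :: IO ()
main = parseLong "123abc" >>= print
```

alloca is used here instead of malloc so the temporary char* slot is released automatically when the action finishes.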