Implementing C enums and unions in Python
I'm trying to figure out some C code so that I can port it to Python. The code is for reading a proprietary binary data file format. It has been straightforward thus far: it's mainly been structs, and I have been using the struct library to read particular C types from the file. However, I just came upon this bit of code and I'm at a loss for how to implement it in Python. In particular, I'm not sure how to deal with the enum or the union.
#define BYTE char
#define UBYTE unsigned char
#define WORD short
#define UWORD unsigned short
typedef enum {
    TEEG_EVENT_TAB1=1,
    TEEG_EVENT_TAB2=2
} TEEG_TYPE;
typedef struct
{
    TEEG_TYPE Teeg;
    long Size;
    union
    {
        void *Ptr; // Memory pointer
        long Offset;
    };
} TEEG;
Secondly, in the struct definition below, I'm not sure what the colons after the variable names mean (e.g., KeyPad:4). Does it mean I'm supposed to read 4 bytes?
typedef struct
{
    UWORD StimType;
    UBYTE KeyBoard;
    UBYTE KeyPad:4;
    UBYTE Accept:4;
    long Offset;
} EVENT1;
In case it's useful, an abstract example of the way I've been accessing the file in python is as follows:
from struct import unpack, calcsize

def get(ctype, size=1):
    """Reads and unpacks binary data into the desired ctype."""
    if size == 1:
        size = ''
    else:
        size = str(size)
    chunk = file.read(calcsize(size + ctype))
    return unpack(size + ctype, chunk)[0]

file = open("file.bin", "rb")
file.seek(1234)
var1 = get('i')
var2 = get('4l')
var3 = get('10s')
Answers (4)
Enums: There is no enum syntax in the language itself (the standard library gained an enum module in Python 3.4). Various idioms have been proposed, but none was really widespread before that. The most straightforward (and in this case sufficient) solution is to define the values as plain integer constants.

Unions: ctypes has unions.

The fieldname : n syntax is called a bitfield and, yeah, does mean "this is n bits big". Again, ctypes has them.
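To make the ctypes route concrete, here is a minimal sketch of the two structs from the question. It assumes a 32-bit enum, a 32-bit long, a 32-bit pointer, a little-endian host, and that both 4-bit fields share one byte with KeyPad in the low bits. None of that is confirmed by the file format, and TeegPayload is a made-up name for the anonymous union.

```python
import ctypes

# Plain constants standing in for the C enum values.
TEEG_EVENT_TAB1 = 1
TEEG_EVENT_TAB2 = 2


class TeegPayload(ctypes.Union):
    """Stand-in for the anonymous C union (the name is hypothetical)."""
    _fields_ = [
        ("Ptr", ctypes.c_uint32),    # void *Ptr, kept as a raw 32-bit value (assumption)
        ("Offset", ctypes.c_int32),  # long Offset, assumed to be 32 bits
    ]


class TEEG(ctypes.Structure):
    _anonymous_ = ("payload",)       # expose Ptr and Offset directly on TEEG
    _fields_ = [
        ("Teeg", ctypes.c_int32),    # TEEG_TYPE, assumed to be a 32-bit int
        ("Size", ctypes.c_int32),
        ("payload", TeegPayload),
    ]


class EVENT1(ctypes.Structure):
    _fields_ = [
        ("StimType", ctypes.c_uint16),
        ("KeyBoard", ctypes.c_uint8),
        ("KeyPad", ctypes.c_uint8, 4),  # third tuple item is the width in bits
        ("Accept", ctypes.c_uint8, 4),
        ("Offset", ctypes.c_int32),
    ]


# Eight hand-written bytes standing in for one EVENT1 record from the file.
raw = bytes([7, 0, 65, 0x5A, 0xD2, 0x04, 0, 0])
event = EVENT1.from_buffer_copy(raw)
print(event.StimType, event.KeyBoard, event.KeyPad, event.Accept, event.Offset)
```

If the sizes reported by ctypes.sizeof(TEEG) and ctypes.sizeof(EVENT1) do not match the record sizes in the real file, the width assumptions above are the first thing to revisit.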
I don't know the answer to all of your questions, but for enums that you don't need lookup-by-value on (that is, you are just using them to avoid magic numbers), I like to use a small class. A regular dict is another option that works fine. If you need lookup-by-value, you may want another structure though.
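The three options mentioned here could be sketched as follows. The names TeegTypePlain, TEEG_TYPE, TEEG_NAME and TeegType are illustrative only, and the IntEnum variant assumes Python 3.4 or later.

```python
from enum import IntEnum


class TeegTypePlain:
    """Small class used purely as a namespace for the constants."""
    TEEG_EVENT_TAB1 = 1
    TEEG_EVENT_TAB2 = 2


# Dict variant, plus a reversed copy for lookup-by-value.
TEEG_TYPE = {"TEEG_EVENT_TAB1": 1, "TEEG_EVENT_TAB2": 2}
TEEG_NAME = {value: name for name, value in TEEG_TYPE.items()}


class TeegType(IntEnum):
    """Standard-library enum: members compare equal to ints and support lookup by value."""
    TEEG_EVENT_TAB1 = 1
    TEEG_EVENT_TAB2 = 2


print(TeegTypePlain.TEEG_EVENT_TAB1)  # name to value
print(TEEG_NAME[2])                   # value to name via the reversed dict
print(TeegType(2).name)               # value to name via IntEnum
```

The IntEnum form raises ValueError for a value that is not a member, which can double as a cheap sanity check that the file is being read at the right offset.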
What you really need to know is:
- The ctypes module. For what you are doing, it may be easier to work with than the struct module. In particular, it can work with pointers arriving via C.
- ctypes again (see the bullet above): the module has functions for performing the necessary casts.
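As a rough illustration of why ctypes can be more convenient than struct here, the sketch below reads a record straight from a binary stream into a ctypes structure. Header and read_record are hypothetical names, the three 32-bit fields are an assumption about the layout, and io.BytesIO stands in for the real file.

```python
import ctypes
import io

# Sizes of the native C types on this machine; these vary by platform and compiler.
NATIVE_SIZES = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}


class Header(ctypes.Structure):
    """Hypothetical fixed-width view of the TEEG record (all widths are assumptions)."""
    _fields_ = [
        ("Teeg", ctypes.c_int32),
        ("Size", ctypes.c_int32),
        ("Offset", ctypes.c_int32),
    ]


def read_record(stream, record_type):
    """Fill a new record_type instance directly from a binary stream."""
    record = record_type()
    count = stream.readinto(record)  # ctypes structures expose a writable buffer
    if count != ctypes.sizeof(record_type):
        raise EOFError("stream ended in the middle of a record")
    return record


# Build three sample records in memory instead of opening a real file.
sample = io.BytesIO(bytes(Header(2, 12, 4096)))
header = read_record(sample, Header)
print(NATIVE_SIZES)
print(header.Teeg, header.Size, header.Offset)
```

Using the fixed-width types (c_int32 and friends) rather than c_int or c_long keeps the record layout the same regardless of the machine the Python code runs on; NATIVE_SIZES is only there to show how much the native types can differ.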
The C enum declaration is a syntactic wrapper around some integer type. See "Is the sizeof(enum) == sizeof(int), always?". How big an int is will depend on the particular C compiler. I would probably start by trying 16 bits, then 32 bits if the fields that follow do not line up.

The union reserves a block of memory the size of the largest of the contained data types. Again, the exact size will depend on the C implementation, but I would expect 32 bits for a 32-bit architecture, or 64 bits if this is compiled as native 64-bit code. Generally speaking, you will be able to store the contents of the union in a Python integer (or long, on Python 2), regardless of whether what has been saved in it is a pointer or an offset.

A more interesting question is why a pointer would ever be written to a disk file. You may find that the union field is only treated as a pointer when the TEEG struct is in memory, but when written to disk, it is always an integer offset.

As for the :4 notation, as several people have noted, these are "bit fields," meaning a sequence of bits, several of which can be packed into a single space. Bit-fields in C are packed into a storage unit whose size the compiler chooses; because these two are declared as UBYTE, both 4-bit fields will most likely share a single byte. They can be unpacked with appropriate use of Python's "&" (bitwise and) and ">>" (right shift) operators. Again, exactly how the fields have been packed into the storage unit, and the size of that unit itself, will depend on the particular C implementation.
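As a rough sketch of the masking and shifting described above (this is not the answerer's original snippet), the code below parses both records with the struct module. It assumes little-endian data with no padding, a 32-bit enum, a 32-bit long, a 32-bit union, and KeyPad stored in the low four bits of the shared byte. The function names and format strings are illustrative.

```python
import struct

# "<" selects little-endian, standard sizes, and no padding (all assumptions).
TEEG_FORMAT = "<llL"    # Teeg (enum), Size, union read as an unsigned 32-bit value
EVENT1_FORMAT = "<HBBl"  # StimType, KeyBoard, packed bit-field byte, Offset


def parse_teeg(chunk):
    """Unpack one TEEG record; the union is returned as a plain integer offset."""
    teeg_type, size, offset = struct.unpack(TEEG_FORMAT, chunk)
    return {"Teeg": teeg_type, "Size": size, "Offset": offset}


def parse_event1(chunk):
    """Unpack one EVENT1 record and split the shared byte into its two 4-bit fields."""
    stim_type, keyboard, packed, offset = struct.unpack(EVENT1_FORMAT, chunk)
    return {
        "StimType": stim_type,
        "KeyBoard": keyboard,
        "KeyPad": packed & 0x0F,         # low four bits (assumed field order)
        "Accept": (packed >> 4) & 0x0F,  # high four bits
        "Offset": offset,
    }


# Hand-written sample bytes standing in for data read from the file.
sample_event = bytes([7, 0, 65, 0x5A, 0xD2, 0x04, 0, 0])
print(parse_event1(sample_event))
```

If KeyPad and Accept come out swapped against known-good data, the compiler filled the byte from the high bits first and the two mask expressions should be exchanged.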