Forcing bit-fields to be read as 32 bits

Posted on 2024-08-13 02:28:13

I am trying to perform a read of less than 32 bits over the PCI bus to a VME bridge chip (Tundra Universe II); the read then goes onto the VME bus and is picked up by the target.

The target VME application only accepts D32 (reads with a data width of 32 bits) and will ignore anything else.

If I use a bit-field structure mapped over a VME window (mmap'd into main memory), I CAN read bit-fields of 24 bits or more, but anything narrower fails, i.e.:

struct works {
    unsigned int a:24;
};

struct fails {
    unsigned int a:1;
    unsigned int b:1;
    unsigned int c:1;
};

struct main {
    struct works work;
    struct fails fail;
};
volatile struct main *reg = function_that_creates_and_maps_the_vme_windows_returns_address();

This shows that the works struct is read as 32 bits, but a read through the fails struct, e.g. reg->fail.a, is being narrowed to an X-bit read (where X might be 16 or 8?).

So the questions are :
a) Where is this scaled down? Compiler? OS? or the Tundra chip?
b) What is the actual size of the read operation performed?

I basically want to rule out everything but the chip. Documentation on that is on the web, but if it can be proved that the data width requested over the PCI bus is 32 bits, then the problem can be blamed on the Tundra chip!

edit:-
Concrete example, code was:-

struct SVersion
{
    unsigned title         : 8;
    unsigned pecversion    : 8;
    unsigned majorversion  : 8;
    unsigned minorversion  : 8;
} Version;

So now I have changed it to this :-

union UPECVersion
{
    struct SVersion
    {
        unsigned title         : 8;
        unsigned pecversion    : 8;
        unsigned majorversion  : 8;
        unsigned minorversion  : 8;
    } Version;
    unsigned int dummy;
};

And the base main struct :-

typedef struct SEPUMap
{
    ...
    ...
    UPECVersion PECVersion;

};

So I still have to change all my baseline code

// perform dummy 32bit read
pEpuMap->PECVersion.dummy;

// get the bits out
x = pEpuMap->PECVersion.Version.minorversion;

And how do I know that the second read won't actually do a real read again, as my original code did (instead of using the already-read bits via the union)?

爱你是孤单的心事 2024-08-20 02:28:14

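The size of a struct is not equal to the sum of the sizes of its fields, bit-fields included. The C and C++ language specifications allow the compiler to insert padding between the fields of a struct, usually for alignment.

A common approach in embedded systems programming is to read the data as an unsigned integer and then use bit masks to pick out the bits of interest. This follows from the rule above and from the fact that there is no standard compiler option for "packing" the fields of a struct.

I suggest creating an object (class or struct) to interface with the hardware. Have the object read the data, then extract the bits as bool members. This keeps the implementation as close to the hardware as possible. The rest of the software should not care how the bits are implemented.

When defining bit-field positions or named constants, I suggest this format:

#define VALUE (1 << BIT_POSITION) // or: const unsigned int VALUE = 1 << BIT_POSITION;

This format is more readable and lets the compiler do the arithmetic. The calculation happens at compile time and has no run-time cost.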
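The read-then-mask approach described above could look roughly like the following. This is a minimal sketch, not the poster's code: `reg_read32`, `reg_flag_is_set`, and the `REG_FLAG_*` constants are hypothetical names, and the bit positions are made up for illustration rather than taken from any data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag positions within one 32-bit register (illustrative only). */
#define REG_FLAG_A (UINT32_C(1) << 0)
#define REG_FLAG_B (UINT32_C(1) << 1)
#define REG_FLAG_C (UINT32_C(1) << 2)

/* Issue one full-width read through a volatile pointer. The volatile
 * qualifier stops the compiler from eliding the access; the assert checks
 * that the address is 4-byte aligned, as a 32-bit access expects. */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
    assert(((uintptr_t)reg & 3u) == 0);
    return *reg;
}

/* Test a flag in a value that has already been read; no bus access here. */
static inline int reg_flag_is_set(uint32_t value, uint32_t flag)
{
    return (value & flag) != 0;
}
```

A caller would read the word once with `reg_read32` and then test as many flags as needed on the returned value, so the number and width of bus accesses is explicit in the source.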

难如初 2024-08-20 02:28:14

As an example, the Linux kernel has inline functions that explicitly handle memory-mapped IO reads and writes. In newer kernels it's a big macro wrapper that boils down to an inline assembly movl instruction, but in older kernels it was defined like this:

#define readl(addr) (*(volatile unsigned int *) (addr))
#define writel(b,addr) ((*(volatile unsigned int *) (addr)) = (b))

紫南 2024-08-20 02:28:14

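Ian - if you want to be sure of the size of what you are reading or writing, I'd suggest not using structs like this. The sizeof of the fails struct could well be just 1 byte; the compiler is free to decide what it should be based on optimizations and so on. I'd suggest reading/writing explicitly with an int, or whatever type you need to guarantee the size, and then doing something else, such as converting to a union/struct where you don't have those limitations.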
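One way to sketch this "read an explicit 32-bit word, then convert" idea is shown below. It assumes the bit-field struct from the question occupies exactly one 32-bit word on the compiler in use (checked with `static_assert`); `version_bits` and `read_version` are hypothetical names. Because the fields are decoded from an ordinary local copy, later field accesses cannot trigger another device read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit-field view of the version register from the question. The order of
 * bit-fields within the word is implementation-defined, so it has to be
 * checked against the compiler and ABI actually used. */
struct version_bits {
    unsigned title        : 8;
    unsigned pecversion   : 8;
    unsigned majorversion : 8;
    unsigned minorversion : 8;
};

static_assert(sizeof(struct version_bits) == sizeof(uint32_t),
              "bit-field struct must cover exactly one 32-bit word");

/* One 32-bit read from the device, then decode from a local copy. */
static struct version_bits read_version(const volatile uint32_t *reg)
{
    uint32_t raw = *reg;               /* the only access to device memory */
    struct version_bits bits;
    memcpy(&bits, &raw, sizeof bits);  /* copy avoids pointer-cast aliasing */
    return bits;
}
```

Usage would be along the lines of `struct version_bits v = read_version(addr);` followed by `v.minorversion`, where `addr` is the register's address within the mapped window.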

送君千里 2024-08-20 02:28:14

It is the compiler that decides what size read to issue. To force a 32 bit read, you could use a union:

union dev_word {
    struct dev_reg {
        unsigned int a:1;
        unsigned int b:1;
        unsigned int c:1;
    } fail;
    uint32_t dummy;
};

volatile union dev_word *vme_map_window();

If reading the union through a volatile-qualified pointer isn't enough to force a read of the whole union (I would think it would be - but that could be compiler-dependent), then you could use a function to provide the required indirection:

volatile union dev_word *real_reg; /* Initialised with vme_map_window() */

union dev_word * const *reg_func(void)
{
    static union dev_word local_copy;
    static union dev_word * const static_ptr = &local_copy;

    local_copy = *real_reg;
    return &static_ptr;
}

#define reg (*reg_func())

...then (for compatibility with the existing code) your accesses are done as:

reg->fail.a

阳光①夏 2024-08-20 02:28:14

The method described in another answer here, using the gcc flag -fstrict-volatile-bitfields and defining bit-field variables as volatile u32, works, but the total number of bits defined must be greater than 16.

For example:

typedef     union{
    vu32    Word;
    struct{
        vu32    LATENCY     :3;
        vu32    HLFCYA      :1;
        vu32    PRFTBE      :1;
        vu32    PRFTBS      :1;  
    };
}tFlashACR;
.
tFLASH* const pFLASH    =   (tFLASH*)FLASH_BASE;
#define FLASH_LATENCY       pFLASH->ACR.LATENCY
.
FLASH_LATENCY = Latency;

causes gcc to generate code

.
ldrb r1, [r3, #0]
.

which is a byte read. However, changing the typedef to

typedef     union{
    vu32    Word;
    struct{
        vu32    LATENCY     :3;
        vu32    HLFCYA      :1;
        vu32    PRFTBE      :1;
        vu32    PRFTBS      :1;
        vu32                :2;

        vu32    DUMMY1      :8;

        vu32    DUMMY2      :8;
    };
}tFlashACR;

changes the resultant code to

.
ldr r3, [r2, #0]
.

战皆罪 2024-08-20 02:28:14

I believe the only solution is to
~~1) edit/create my main struct as all 32bit ints (unsigned longs)~~
2) keep my original bit-field structs
3) each access I require,
3.1) I have to read the struct member as a 32bit word, and cast it into the bit-field struct,
3.2) read the bit-field element I require. (and for writes, set this bit-field, and write the word back!)

~~(1) Which is a shame, because then I lose the intrinsic types of each member of the "main/SEPUMap" struct.~~

End solution :-
Instead of :-

printf("FirmwareVersionMinor: 0x%x\n", pEpuMap->PECVersion);

This :-

SPECVersion ver = *(SPECVersion*)&pEpuMap->PECVersion;

printf("FirmwareVersionMinor: 0x%x\n", ver.minorversion);

The only problem I have is writing! (Writes are now Read/Modify/Writes!)

// Read - Get current
_HVPSUControl temp = *(_HVPSUControl*)&pEpuMap->HVPSUControl;

// Modify - set to new value
temp.OperationalRequestPort = true;

// Write
volatile unsigned int *addr = reinterpret_cast<volatile unsigned int*>(&pEpuMap->HVPSUControl);

*addr = *reinterpret_cast<volatile unsigned int*>(&temp);

~~Just have to tidy that code up into a method!~~

#define writel(addr, data) ( *(volatile unsigned long*)(&addr) = (*(volatile unsigned long*)(&data)) )
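The read/modify/write sequence above could be tidied into a function along these lines. This is a sketch under assumptions: only `OperationalRequestPort` is a name from the post, the rest of the `hvpsu_control_bits` layout and the function name are hypothetical. It uses `uint32_t` rather than `unsigned long`, since `unsigned long` is 64 bits wide on LP64 platforms such as 64-bit Linux.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control register layout (illustrative only). */
struct hvpsu_control_bits {
    unsigned OperationalRequestPort : 1;
    unsigned reserved               : 31;
};

static_assert(sizeof(struct hvpsu_control_bits) == sizeof(uint32_t),
              "bit-field struct must cover exactly one 32-bit word");

/* Read-modify-write using exactly one 32-bit read and one 32-bit write. */
static void set_operational_request(volatile uint32_t *reg, int on)
{
    uint32_t raw = *reg;                         /* read   */
    struct hvpsu_control_bits bits;
    memcpy(&bits, &raw, sizeof bits);
    bits.OperationalRequestPort = on ? 1u : 0u;  /* modify */
    memcpy(&raw, &bits, sizeof raw);
    *reg = raw;                                  /* write  */
}
```

Keeping the device accesses in one small function makes it easy to inspect the generated assembly and confirm that both are full-word operations.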

飞烟轻若梦 2024-08-20 02:28:14

I had the same problem on ARM with the GCC compiler, where writes to memory were done byte by byte rather than as a 32-bit word.

The solution is to define the bit-fields using volatile uint32_t (or whatever access size is required):

union {
    volatile uint32_t XY;
    struct {
        volatile uint32_t XY_A : 4;
        volatile uint32_t XY_B : 12;
    };
};

When compiling, you also need to pass this option to gcc or g++:

-fstrict-volatile-bitfields

See the GCC documentation for details.

时间你老了 2024-08-20 02:28:13

Your compiler is adjusting the size of your struct to a multiple of its memory alignment setting. Almost all modern compilers do this. On some processors, variables and instructions have to begin on memory addresses that are multiples of some memory alignment value (often 32-bits or 64-bits, but the alignment depends on the processor architecture). Most modern processors don't require memory alignment anymore - but almost all of them see substantial performance benefit from it. So the compilers align your data for you for the performance boost.

However, in many cases (such as yours) this isn't the behavior you want. The size of your structure, for various reasons, can turn out to be extremely important. In those cases, there are various ways around the problem.

One option is to force the compiler to use different alignment settings. The options for doing this vary from compiler to compiler, so you'll have to check your documentation. It's usually a #pragma of some sort. On some compilers (the Microsoft compilers, for instance) it's possible to change the memory alignment for only a very small section of code. For example (in VC++):

#pragma pack(push)      // save the current alignment
#pragma pack(1)         // set the alignment to one byte
// Define variables that are alignment sensitive
#pragma pack(pop)       // restore the alignment

Another option is to define your variables in other ways. Intrinsic types are not resized based on alignment, so instead of your 24-bit bitfield, another approach is to define your variable as an array of bytes.

Finally, you can just let the compilers make the structs whatever size they want and manually record the size that you need to read/write. As long as you're not concatenating structures together, this should work fine. Remember, however, that the compiler is giving you padded structs under the hood, so if you make a larger struct that includes, say, a works and a fails struct, there will be padded bits in between them that could cause you problems.

On most compilers, it's going to be darn near impossible to create a data type smaller than 8 bits. Most architectures just don't think that way. This shouldn't be a huge problem because most hardware devices that use datatypes of smaller than 8-bits end up arranging their packets in such a way that they still come in 8-bit multiples, so you can do the bit manipulations to extract or encode the values on the data stream as it leaves or comes in.

For all of the reasons listed above, a lot of code that works with hardware devices like this uses raw byte arrays and just encodes the data within the arrays. Despite losing a lot of the conveniences of modern language constructs, it ends up just being easier.
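As a rough illustration of the byte-array style, the sketch below encodes and decodes a 32-bit value in an ordinary buffer. It assumes big-endian byte order and that the data has already been copied out of the device window; `get_be32` and `put_be32` are hypothetical helper names. Byte-wise access like this would not by itself produce 32-bit bus cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit big-endian value from four bytes of an ordinary buffer. */
static uint32_t get_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Encode a 32-bit value into four bytes, most significant byte first. */
static void put_be32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Because the shifts work on values rather than on memory layout, the result does not depend on the host's endianness or on how the compiler orders bit-fields.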
