Cross-platform programming question (file I/O)

Posted 2024-08-09 08:00:12

I have a C++ class that looks a bit like this:

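#include <iostream>
#include <string>
#include <vector>

// Header and Row are user-defined types declared elsewhere.
class BinaryStream : private std::iostream
{
    public:
        explicit BinaryStream(const std::string& file_name);
        bool read();
        bool write();

    private:
        Header m_hdr;
        std::vector<Row> m_rows;
};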

This class reads and writes data to disk in a binary format. I am not using any platform-specific code, relying instead on the STL. I have successfully compiled it on XP. I am wondering whether I can FTP the files written on the XP machine and read them on my Linux machine (once I recompile the binary stream library on Linux).

Summary:

1. Files are created on an XP machine using a cross-platform library compiled for XP.
2. The same library (used in 1 above) is compiled on a Linux machine.

Question: Can the files created in 1 above be read on the Linux machine (2)?

If not, please explain why not, and how I can get around this issue.

少女的英雄梦 2024-08-16 08:00:12

Derive from std::basic_streambuf. That's what it is there for. Note that most STL classes are not designed to be derived from; the one I mention is an exception.
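To make that suggestion concrete, here is a minimal sketch of a class derived from std::streambuf (the char specialization of std::basic_streambuf). The class name VectorBuf and its behavior (collecting written bytes in a vector) are made up for illustration; the question does not show how BinaryStream actually stores or moves its bytes.

```cpp
#include <ostream>
#include <streambuf>
#include <vector>

// Hypothetical example: a write-only stream buffer that appends every
// byte it receives to an in-memory vector.
class VectorBuf : public std::streambuf {
public:
    const std::vector<char>& data() const { return m_data; }

protected:
    // Called for single characters, because no put area is set up.
    int_type overflow(int_type ch = traits_type::eof()) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            m_data.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);
    }

    // Called for bulk writes such as std::ostream::write().
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        m_data.insert(m_data.end(), s, s + n);
        return n;
    }

private:
    std::vector<char> m_data;
};
```

A plain std::ostream can then be layered on top with `VectorBuf buf; std::ostream out(&buf);`, so the stream class itself is used as-is and only the buffer is customized.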


醉酒的小男人 2024-08-16 08:00:12

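This depends entirely on the specifics of your binary encoding. One difference between Linux and XP is that on Linux you are much more likely to find yourself on a big-endian platform, and if your binary encoding is endian-specific you will run into problems.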

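You may also run into problems with end-of-line characters. There isn't enough information here about how you are using ::std::iostream to give you a good answer to this question.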
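On the end-of-line point: on Windows, a stream opened in text mode translates '\n' to "\r\n" on output and back on input, which corrupts binary data. The sketch below assumes the data goes through std::ofstream and std::ifstream (the question does not say) and opens both with std::ios::binary. The helper names write_raw and read_raw are hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Writes the bytes of `payload` to `path` with no newline translation.
bool write_raw(const std::string& path, const std::string& payload) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    return static_cast<bool>(out);
}

// Reads the whole file back as raw bytes; returns an empty string
// if the file cannot be opened.
std::string read_raw(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With the binary flag, a payload containing '\n' has the same length on disk on both platforms.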

I would strongly suggest looking at the protobuf library. It is an excellent library for creating fast, cross-platform binary encodings.


时光与爱终年不遇 2024-08-16 08:00:12

If you want your code to be portable across machines with different endianness, you need to stick to one endianness in your files. Whenever you read or write files, you convert between the host byte order and the file byte order. It's common to use what is called network byte order when you want to write files that are portable across all machines. Network byte order is defined to be big-endian, and there are ready-made functions for these conversions (although they are very easy to write yourself).

For example, before writing a 32-bit integer to a file, you should convert it to network byte order using htonl(), and when reading from a file you should convert it back to host byte order with ntohl(). (These functions operate on 32-bit values; a long is 32 bits on Windows but 64 bits on 64-bit Linux, so prefer fixed-width types such as uint32_t in a file format.) On a big-endian system htonl() and ntohl() simply return the number passed to them, but on a little-endian system they reverse the order of the bytes.
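Since htonl() and ntohl() live in platform-specific headers (<arpa/inet.h> on POSIX systems, <winsock2.h> on Windows), one way to avoid the dependency is to write the conversion by hand. This is only a sketch; the function names store_be32 and load_be32 are made up for this example. Because it works with shifts rather than by reinterpreting memory, it gives the same bytes regardless of the host's endianness.

```cpp
#include <cstdint>

// Encode a 32-bit value as 4 bytes, most significant byte first
// (big-endian, i.e. network byte order).
void store_be32(std::uint32_t value, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(value >> 24);
    out[1] = static_cast<unsigned char>(value >> 16);
    out[2] = static_cast<unsigned char>(value >> 8);
    out[3] = static_cast<unsigned char>(value);
}

// Decode 4 big-endian bytes back into a host-order 32-bit value.
std::uint32_t load_be32(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
           static_cast<std::uint32_t>(in[3]);
}
```

The four bytes produced by store_be32 are what would be written to the file, and load_be32 is applied to the four bytes read back.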

If you don't care about supporting big-endian systems, none of this is an issue, although it's still good practice.

Another important thing to pay attention to is the padding of the structs/classes that you write, if you write them directly to the file (e.g. Header and Row). Different compilers on different platforms can use different padding, which means that members end up at different offsets in memory. This can break things badly if the compilers you use on the different platforms pad differently. So for structs that you intend to write directly to files or other streams, you should always specify the padding. You can tell the compiler to pack your structs like this:

#pragma pack(push, 1)
struct Header {
  // Members of this struct are packed with 1-byte alignment (no padding)
  ...
};
#pragma pack(pop)

Remember that packing makes the struct less efficient to use in your application, because access to unaligned memory addresses means more work for the system. This is why it's generally a good idea to have separate types: packed structs that you write to streams, and a type that you actually use in the application (you just copy the members from one to the other).
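A rough sketch of that two-type arrangement follows. The field names (version, row_count) and the type names PackedHeader and AppHeader are assumptions for illustration, since the original post does not show what Header contains. Note that packing only fixes the layout; multi-byte fields would still need a byte-order conversion before being written.

```cpp
#include <cstdint>

// On-disk layout: packed, so the compiler inserts no padding.
#pragma pack(push, 1)
struct PackedHeader {
    std::uint8_t  version;
    std::uint32_t row_count;
};
#pragma pack(pop)

// In-memory type: naturally aligned, used by the rest of the program.
struct AppHeader {
    std::uint8_t  version;
    std::uint32_t row_count;
};

static_assert(sizeof(PackedHeader) == 5, "packed layout should have no padding");

// Copy member by member; the compiler handles any unaligned access.
PackedHeader to_packed(const AppHeader& h) {
    PackedHeader p;
    p.version = h.version;
    p.row_count = h.row_count;
    return p;
}

AppHeader from_packed(const PackedHeader& p) {
    AppHeader h;
    h.version = p.version;
    h.row_count = p.row_count;
    return h;
}
```

Only PackedHeader is ever written to or read from the stream; everything else in the application works with AppHeader.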

EDIT: Another way to deal with the issue, of course, is to serialize those structs yourself, which doesn't require #pragma (pragmas are a compiler-dependent feature, although to my knowledge all major compilers support #pragma pack).
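As a sketch of what serializing the struct yourself could look like, the code below writes each field byte by byte in big-endian order, so neither struct padding nor host endianness affects the file. The FileHeader fields and all function names are hypothetical, not taken from the original post.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical header; the real Header fields are not shown in the post.
struct FileHeader {
    std::uint16_t version;
    std::uint32_t row_count;
};

// Write the low `bytes` bytes of `value`, most significant byte first.
void put_be(std::ostream& out, std::uint32_t value, int bytes) {
    for (int shift = (bytes - 1) * 8; shift >= 0; shift -= 8) {
        out.put(static_cast<char>((value >> shift) & 0xFFu));
    }
}

// Read `bytes` big-endian bytes; the caller checks the stream state.
std::uint32_t get_be(std::istream& in, int bytes) {
    std::uint32_t value = 0;
    for (int i = 0; i < bytes; ++i) {
        value = (value << 8) | static_cast<std::uint32_t>(in.get() & 0xFF);
    }
    return value;
}

bool write_header(std::ostream& out, const FileHeader& h) {
    put_be(out, h.version, 2);
    put_be(out, h.row_count, 4);
    return static_cast<bool>(out);
}

bool read_header(std::istream& in, FileHeader& h) {
    h.version = static_cast<std::uint16_t>(get_be(in, 2));
    h.row_count = get_be(in, 4);
    return static_cast<bool>(in);
}

// Round trip through an in-memory stream, standing in for a file.
bool round_trips(const FileHeader& original) {
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    FileHeader copy{};
    return write_header(ss, original) && read_header(ss, copy) &&
           copy.version == original.version &&
           copy.row_count == original.row_count;
}
```

The same write_header and read_header functions work on a std::fstream opened in binary mode, and the on-disk size of this header is always 6 bytes regardless of compiler or platform.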
