Visual Studio C++ 2008: manipulating bytes?

Posted 2024-08-09 22:43:09

I'm trying to write strictly binary data to files (no encoding). The problem is that when I hex dump the files, I see rather weird behavior. Constructing the writer with either of the methods below gives the same result. I also tried passing System::Text::Encoding::Default to the streams as a test.

StreamWriter^ binWriter = gcnew StreamWriter(gcnew FileStream("test.bin",FileMode::Create));

(Also used this method)
FileStream^ tempBin = gcnew FileStream("test.bin",FileMode::Create);
BinaryWriter^ binWriter = gcnew BinaryWriter(tempBin);


binWriter->Write(0x80);
binWriter->Write(0x81);
.
.
binWriter->Write(0x8F);
binWriter->Write(0x90);
binWriter->Write(0x91);
.
.
binWriter->Write(0x9F);

After writing that sequence of bytes, I noticed that the only bytes that weren't converted to 0x3F in the hex dump were 0x81, 0x8D, 0x90, 0x9D, ... and I have no idea why.

I also tried making character arrays, and a similar situation happens. i.e.,

array<wchar_t,1>^ OT_Random_Delta_Limits = {0x00,0x00,0x03,0x79,0x00,0x00,0x04,0x88};
binWriter->Write(OT_Random_Delta_Limits);

0x88 would be written as 0x3F.


Comments (3)

愚人国度 2024-08-16 22:43:09

If you want to stick to binary files, then don't use StreamWriter. Just use a FileStream and Write/WriteByte. StreamWriters (and TextWriters in general) are expressly designed for text. Whether you want an encoding or not, one will be applied, because when you call StreamWriter.Write you are writing a char, not a byte.

Don't create arrays of wchar_t values either - again, those are for characters, i.e. text.

BinaryWriter.Write should have worked for you unless it was promoting the values to char, in which case you'd have exactly the same problem.

By the way, without specifying any encoding, I'd expect you not to get 0x3F values at all, but rather the bytes representing the UTF-8 encoded values for those characters.

When you specified Encoding.Default, you'd have seen 0x3F for any Unicode values not in that encoding.

Anyway, the basic lesson is to stick to Stream when you want to deal with binary data rather than text.

EDIT: Okay, it would be something like:

public static void ConvertHex(TextReader input, Stream output)
{
    while (true)
    {
        int firstNybble = input.Read();
        if (firstNybble == -1)
        {
            return;
        }
        int secondNybble = input.Read();
        if (secondNybble == -1)
        {
            throw new IOException("Reader finished half way through a byte");
        }
        int value = (ParseNybble(firstNybble) << 4) + ParseNybble(secondNybble);
        output.WriteByte((byte) value);
    }
}

// value would actually be a char, but as we've got an int in the above code,
// it just makes things a bit easier
private static int ParseNybble(int value)
{
    if (value >= '0' && value <= '9') return value - '0';
    if (value >= 'A' && value <= 'F') return value - 'A' + 10;
    if (value >= 'a' && value <= 'f') return value - 'a' + 10;
    throw new ArgumentException("Invalid nybble: " + (char) value);
}

This is very inefficient in terms of buffering etc, but should get you started.

救赎№ 2024-08-16 22:43:09

A BinaryWriter initialized with only a stream uses UTF-8 as its default encoding for any chars or strings that are written. I'm guessing that the

binWriter->Write(0x80);
binWriter->Write(0x81);
.
.
binWriter->Write(0x8F);
binWriter->Write(0x90);
binWriter->Write(0x91);

calls are binding to the Write(char) overload, so they're going through the character encoder. I'm not very familiar with C++/CLI, but it seems to me that these calls should be binding to Write(Int32), which shouldn't have this problem. Maybe your code is really calling Write() with a char variable that's set to the values in your example; that would account for this behavior.
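The overload question can be illustrated in plain C++. This is only a model: ByteSink is a hypothetical stand-in, not the .NET BinaryWriter, and it assumes a 32-bit int. It shows that an integer literal such as 0x80 has type int, so it selects the int overload and produces four bytes (least significant first, which is how BinaryWriter.Write(Int32) is documented to lay out its output), while a one-byte result needs an argument of a one-byte type.

```cpp
#include <vector>

// Hypothetical stand-in for a binary writer; not the .NET BinaryWriter.
struct ByteSink {
    std::vector<unsigned char> bytes;

    // Chosen for plain integer literals such as 0x80: emits four bytes,
    // least significant first (assumes int is 32 bits wide).
    void write(int value) {
        const unsigned int u = static_cast<unsigned int>(value);
        for (int shift = 0; shift < 32; shift += 8) {
            bytes.push_back(static_cast<unsigned char>((u >> shift) & 0xFFu));
        }
    }

    // Chosen only when the argument really has a one-byte type.
    void write(unsigned char value) {
        bytes.push_back(value);
    }
};
```

With this sink, write(0x80) appends 80 00 00 00, whereas write(static_cast<unsigned char>(0x80)) appends the single byte 80. If the same binding applies in the question's code, an explicit cast to a byte type would be the way to get one byte per call.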

落墨 2024-08-16 22:43:09

0x3F is the ASCII character '?'; the characters that map to it are control characters with no printable representation. As Jon points out, use a binary stream rather than a text-oriented output mechanism for raw binary data.

EDIT -- actually your results look like the inverse of what I would expect. In the default code page 1252, the non-printable characters (i.e. the ones likely to map to '?') in that range are 0x81, 0x8D, 0x8F, 0x90 and 0x9D.
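One possible way to reconcile this with the question, offered as an assumption rather than something the thread confirms: the values are written as Unicode characters U+0080 to U+009F, and the Windows mapping for code page 1252 passes its five unassigned bytes through as the control characters of the same value. Under that assumption, only those five characters exist in the encoding and survive, and every other character in the range is replaced by '?'. The function below, with the made-up name encodeToCp1252, models that rule for characters up to U+00FF only.

```cpp
// Model of encoding a character in U+0000..U+00FF to code page 1252,
// ASSUMING the Windows mapping in which the five unassigned bytes
// round-trip as the C1 control characters of the same value.
unsigned char encodeToCp1252(unsigned char codePoint) {
    if (codePoint < 0x80 || codePoint >= 0xA0) {
        return codePoint;  // same as Latin-1 in these ranges
    }
    switch (codePoint) {
        case 0x81: case 0x8D: case 0x8F: case 0x90: case 0x9D:
            return codePoint;  // unassigned in 1252: assumed to pass through
        default:
            return 0x3F;       // no such character in 1252: replaced by '?'
    }
}
```

This matches what the question reports: 0x81, 0x8D, 0x90 and 0x9D come through unchanged, while 0x88 is written as 0x3F.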
