How can (byte)Convert.ToChar(anyStringOfLengthOne) possibly throw an error?

Posted on 2024-12-19 03:39:35

We have this rather simple code in a project:

string input = "Any string";
for (int i = 0; i < input.Length; i++)
{
    string stringOfLengthOne = input.Substring(i, 1);
    byte value = (byte)Convert.ToChar(stringOfLengthOne);
    if (value == someValue)
    {
        // do something
    }
}

The input is a string with characters usually read from a file that need to be processed depending on their byte value.

Unfortunately, we do not have the chance to debug this process step by step; we just need to make an educated guess about what kind of string could cause

 (byte)Convert.ToChar(anyStringOfLengthOne)

in the code above to throw an "Arithmetic operation resulted in an overflow" error.

My thinking is that as soon as I have a string, it should always be possible to 1. pick a char and 2. convert it to a byte. Yet the error occurs.

Any ideas, hints? Or can someone even provide a string that throws this kind of error?

Comments (3)

迷荒 2024-12-26 03:39:35

Characters in .NET are 16 bits wide (a char holds an unsigned 16-bit value, like a ushort).

With the default C# project settings the cast succeeds and simply discards the high bits of any character greater than 255, i.e. it behaves like (byte) (c & 0xff).

However, if you are using checked arithmetic, casting a char whose value is greater than 255 throws a System.OverflowException ("Arithmetic operation resulted in an overflow").

Whether arithmetic is checked or unchecked by default can be set in the project's build settings.

Example

char c = (char) 300;
byte b = unchecked ((byte) c);
Console.WriteLine (b);

// Result: 44

char c = (char) 300;
byte b = checked ((byte) c);
Console.WriteLine (b);

// Result: throws System.OverflowException
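
To answer the request for a concrete failing string, here is a minimal sketch that runs the loop from the question inside an explicit checked expression. The class name, the method names and the sample strings are made up for illustration, and the sketch assumes the real project is compiled with overflow checking enabled (which the checked expression stands in for).

```csharp
using System;

public static class OverflowRepro
{
    // Same conversion as in the question, forced into a checked context.
    public static byte ToByteChecked(string stringOfLengthOne)
    {
        return checked((byte)Convert.ToChar(stringOfLengthOne));
    }

    // Returns the index of the first character that does not fit in a byte,
    // or -1 if every character converts without overflow.
    public static int FirstOverflowIndex(string input)
    {
        for (int i = 0; i < input.Length; i++)
        {
            try
            {
                ToByteChecked(input.Substring(i, 1));
            }
            catch (OverflowException)
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(FirstOverflowIndex("Any string")); // -1
        Console.WriteLine(FirstOverflowIndex("5 \u20AC"));   // 2 (the euro sign is U+20AC = 8364)
    }
}
```

Any string containing a character above U+00FF (a euro sign, a typographic quote, a byte order mark read as text) would be a plausible candidate for the failure described in the question.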

Alternative

Alternatively, you could compare the characters directly.

For example, to test whether a character is in the range 0-9:

char c = input[i];
if (c >= '0' && c <= '9') {
    // do something
}

You can even compare a char to an int:

char c = input[i];
if (c >= 48 && c <= 57) {
    // do something
}
秋凉 2024-12-26 03:39:35

Why not access input[i] instead of using a Substring and Convert?

EDIT:

Oh, oh, sorry, I missed it. Characters are 16 bits in .NET (Unicode), so it's very reasonable that you can't convert a char to a byte if you're using non-English characters whose value is above 255. Try any Hebrew letter, for instance.
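
As a small illustration of the Hebrew-letter suggestion, the sketch below reads a char directly (as with input[i]) and checks whether it fits in a byte before casting. The class and method names are hypothetical, chosen only for this example.

```csharp
using System;

public static class CharRangeDemo
{
    // True when the UTF-16 code unit fits in a single byte (0..255).
    public static bool FitsInByte(char c)
    {
        return c <= byte.MaxValue;
    }

    public static void Main()
    {
        char alef = '\u05D0'; // Hebrew letter alef

        Console.WriteLine((int)alef);        // 1488
        Console.WriteLine(FitsInByte(alef)); // False
        Console.WriteLine(FitsInByte('A'));  // True
    }
}
```

Guarding the cast this way avoids the exception regardless of whether the project is compiled checked or unchecked.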

氛圍 2024-12-26 03:39:35

From the docs:

Each character in a string is defined by a Unicode scalar value, also
called a Unicode code point or the ordinal (numeric) value of the
Unicode character. Each code point is encoded by using UTF-16
encoding, and the numeric value of each element of the encoding is
represented by a Char object.

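A byte is 8 bits, while each UTF-16 code unit (a Char) is 16 bits, so any char with a value above 255 does not fit in a byte. This is why you get the error when the cast is checked.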
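
To make the quoted passage concrete, here is a short sketch that lists the numeric value of each Char in a string. The helper name CodeUnitValues and the sample string are assumptions made for this example.

```csharp
using System;

public static class CodeUnitDemo
{
    // Returns the numeric value of every UTF-16 code unit in the string.
    public static int[] CodeUnitValues(string s)
    {
        int[] values = new int[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            values[i] = s[i];
        }
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(sizeof(char));       // 2 (bytes, i.e. 16 bits)
        Console.WriteLine((int)char.MaxValue); // 65535
        Console.WriteLine(byte.MaxValue);      // 255

        // 'A' fits in a byte; the euro sign (U+20AC) does not.
        Console.WriteLine(string.Join(", ", CodeUnitValues("A\u20AC"))); // 65, 8364
    }
}
```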
