C++ extended ASCII characters

Posted on 2024-07-14 05:23:07

How can I detect whether a C++ character array contains extended ASCII values (128 to 255)?

Comments (8)

拿命拼未来 2024-07-21 05:23:07

Please remember that there is no such thing as extended ASCII. ASCII was and is only defined between 0 and 127. Everything above that is either invalid or needs to be in a defined encoding other than ASCII (for example ISO-8859-1).

Please read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Other than that: what's wrong with iterating over the array and checking for any value > 127 (or < 0 when char is signed)?
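The loop suggested above might look like the following minimal sketch. The function name `contains_non_ascii` and its pointer-plus-length signature are illustrative assumptions, not something from the question; the cast to `unsigned char` is one way to avoid caring whether plain `char` is signed on a given platform.

```cpp
#include <cstddef>

// Returns true if any of the first `len` bytes of `data` lies outside
// the 7-bit ASCII range (0 to 127).
// Hypothetical helper: the name and signature are assumptions for illustration.
bool contains_non_ascii(const char* data, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        // Convert to unsigned char so the comparison behaves the same
        // whether plain char is signed or unsigned on this platform.
        if (static_cast<unsigned char>(data[i]) > 127) {
            return true;
        }
    }
    return false;
}
```

Passing an explicit length, rather than relying on a terminating NUL, lets the same check work on binary buffers that contain zero bytes.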

一袭水袖舞倾城 2024-07-21 05:23:07

char can be signed or unsigned. This doesn't really matter, though. You actually want to check whether each character is valid ASCII. This is a positive, unambiguous check: you simply check that each char is both >= 0 and <= 127. Anything else (whether positive or negative, "Extended ASCII" or UTF-8) is invalid.
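As a rough sketch of that positive check, the range test can be applied to every character with `std::all_of`. The helper name `is_all_ascii` is an assumption made for this example; widening each `char` to `int` first keeps the two comparisons meaningful on both signed-char and unsigned-char platforms.

```cpp
#include <algorithm>
#include <string>

// Returns true only if every character of `text` is valid 7-bit ASCII.
// Hypothetical helper: the name is an assumption for illustration.
bool is_all_ascii(const std::string& text)
{
    return std::all_of(text.begin(), text.end(), [](char c) {
        // Widen to int: where char is signed, bytes 128..255 appear as
        // negative values; where it is unsigned, they appear as > 127.
        const int value = c;
        return value >= 0 && value <= 127;
    });
}
```

An empty string passes the check, since it contains no invalid character.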

小苏打饼 2024-07-21 05:23:07

Doesn't anyone use isascii anymore?

#include <ctype.h>   // isascii() is a POSIX/BSD extension, not standard C++
#include <iostream>
using namespace std;

int main()
{
    char c = (char) 200;

    if (isascii(c))
    {
        cout << "it's ascii!" << endl;
    }
    else
    {
        cout << "it's not ascii!" << endl;
    }
}

病毒体 2024-07-21 05:23:07

Just check the highest bit with a bitwise AND mask. (Endianness is not a concern here, because the test looks at one byte at a time.)

if (ch & 128) {
  // high bit is set
} else {
  // looks like a 7-bit value
}

But there are probably locale functions you should be using for this. Better yet, KNOW what character encoding the data is coming in as. Trying to guess it is like trying to guess the format of data going into your database fields. It might go in, but garbage in, garbage out.
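When the goal is to report where the offending byte is, not only whether one exists, the same mask can be combined with `std::find_if`. This is a sketch only: `first_high_bit_byte` is a hypothetical name, and returning `len` for "not found" is a convention chosen for this example.

```cpp
#include <algorithm>
#include <cstddef>

// Returns the index of the first byte whose high bit (0x80) is set,
// or `len` if every byte is a 7-bit value.
// Hypothetical helper: name and "not found" convention are assumptions.
std::size_t first_high_bit_byte(const char* data, std::size_t len)
{
    const char* end = data + len;
    const char* hit = std::find_if(data, end, [](char c) {
        // Mask the top bit of the byte; nonzero means a value of 128..255.
        return (static_cast<unsigned char>(c) & 0x80u) != 0;
    });
    return static_cast<std::size_t>(hit - data);
}
```

Knowing the position makes it easier to inspect the surrounding bytes and decide which encoding the data is likely to be in.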

呆萌少年 2024-07-21 05:23:07

Iterate over the array and check that no character falls in the 128 to 255 range?

夏九 2024-07-21 05:23:07

Check that the values are not negative (this works when char is signed).

巴黎盛开的樱花 2024-07-21 05:23:07
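// Expects a NUL-terminated string; returns true if it contains a byte in 128..255.
bool detect(const signed char* x) {
  while (*x++ > 0);  // skip every byte in 1..127
  return x[-1];      // stopped on 0 (end of string): false; on a negative byte: true
}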
冷了相思 2024-07-21 05:23:07
// Needs <ctype.h> (isascii is a POSIX extension) and <iostream>.
char c = (char) 200;

if (isascii(c))
{
    cout << "it's ascii!" << endl;
}
else
{
    cout << "it's not ascii!" << endl;
}

Try this code.
