How do I find extended ASCII characters in a file using Perl?

Posted 2024-07-20 06:19:39

How do I find extended ASCII characters in a file using Perl? Can anyone provide a script?

Thanks in advance.

寄人书 2024-07-27 06:19:39

Extended ASCII characters have values of 128 and above, so you can call ord on each character and handle those whose value is >= 128. The following code reads from the files named on the command line (or from stdin if none are given) and prints only the extended ASCII characters:

while (<>) {
  while (/(.)/g) {
    print($1) if (ord($1) >= 128);
  }
}

Alternatively, unpack together with chr will also work. Example:

while (<>) {
  foreach (unpack("C*", $_)) {
    print(chr($_)) if ($_ >= 128);
  }
}

(I'm sure some Perl guru can condense each of these into a one-liner...)

To print the line numbers instead, you can use the following (it does not remove duplicates, so a line number is printed once per matching character, and it behaves oddly when Unicode input is passed):

while (<>) {
  while (/(.)/g) {
    print($. . "\n") if (ord($1) >= 128);
  }
}

(Thanks Yaakov Belch for the $. tip.)
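The duplicate line numbers mentioned above can be avoided by counting the high bytes per line instead of looping over every character. The following is only a sketch under some assumptions: the sub name high_byte_lines and the in-memory sample data are made up for illustration, and the input is treated as raw bytes rather than decoded text.

```perl
use strict;
use warnings;

# Return [line_number, count] pairs for every line that holds at least one
# byte in the range 0x80-0xFF. Each line is reported once.
# (high_byte_lines is a hypothetical helper name, not from the answer above.)
sub high_byte_lines {
    my ($fh) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # tr/// with an empty replacement only counts; $line is not modified
        my $count = ($line =~ tr/\x80-\xFF//);
        push @hits, [$., $count] if $count;
    }
    return @hits;
}

# Demo on an in-memory file handle; open a real file here as needed.
my $sample = "plain ascii\ncaf\xE9\nna\xEFve r\xE9sum\xE9\n";
open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";
printf "line %d: %d extended byte(s)\n", @$_ for high_byte_lines($fh);
close($fh);
```

With the sample data above, only lines 2 and 3 are reported, each exactly once.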

攒眉千度 2024-07-27 06:19:39

The first printable ASCII character is space (32). The last printable ASCII character is ~ (126). So I'd probably use

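while (<>) {
  # \n is excluded so the line terminator itself does not trigger a match
  print "$.\n" if /[^ -~\n]/;
}

although, admittedly, it also reports lines containing control characters (such as tabs or carriage returns) as well as extended ASCII.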

Edit: Changed to print the line number rather than the line itself.

往日情怀 2024-07-27 06:19:39

One-liner:

perl -nE'say$.if/[\xE0-\xFF]/'

For older Perl versions (before 5.10, which lack say):

perl -lne'print$.if/[\xE0-\xFF]/'

蓝戈者 2024-07-27 06:19:39

A crucial question is whether the

use bytes;

pragma should be in effect. The poster should decide that. For picking characters with codes greater than 127, the following will suffice:

print grep 127 < ord, split // while <>;

or

print grep /[^[:ascii:]]/, split // while <>;
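Why the bytes-versus-characters decision matters can be seen by running the same ord test on raw bytes and on decoded text. This is a minimal sketch that assumes the input is UTF-8 and uses the core Encode module's decode function instead of the bytes pragma; the sample string is invented for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "é" encoded as UTF-8 is the two bytes 0xC3 0xA9 (assumed sample input).
my $raw = "caf\xC3\xA9";

# Byte semantics: every element is one byte of the file.
my @high_bytes = grep { ord($_) > 127 } split //, $raw;

# Character semantics: decode first, then each element is one code point.
my $text = decode('UTF-8', $raw);
my @high_chars = grep { ord($_) > 127 } split //, $text;

printf "bytes > 127: %d, characters > 127: %d\n",
    scalar(@high_bytes), scalar(@high_chars);
```

Treated as bytes, the sample has two values above 127; once decoded, it has a single character (U+00E9). Which answer is the right one depends on what the poster means by "extended ASCII".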

ㄟ。诗瑗 2024-07-27 06:19:39

Hynek -Pichi- Vychodil's answer:

perl -nE'say$.if/[\xE0-\xFF]/'

only tests a limited part of the extended range (0xE0-0xFF); it should presumably be

perl -nE'say$.if/[\x80-\xFF]/'

instead.
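The difference between the two character classes can be checked by running both against every byte value from 0x80 to 0xFF. This is just an illustrative sketch; the variable names are made up.

```perl
use strict;
use warnings;

# All 128 byte values in the extended range, one per string.
my @samples = map { chr($_) } 0x80 .. 0xFF;

# Count how many of them each character class accepts.
my @narrow = grep { /[\xE0-\xFF]/ } @samples;
my @full   = grep { /[\x80-\xFF]/ } @samples;

printf "[\\xE0-\\xFF] matches %d of %d values\n", scalar(@narrow), scalar(@samples);
printf "[\\x80-\\xFF] matches %d of %d values\n", scalar(@full),   scalar(@samples);
```

The narrower class accepts 32 of the 128 values, so bytes 0x80 through 0xDF would go unreported.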

━╋う一瞬間旳綻放 2024-07-27 06:19:39

What about grep?

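grep -P '[\x00-\x1F\x7F-\xFF]+' *

(The pattern is quoted so the shell does not interpret the brackets and backslashes. The \x escapes need -P, which is a GNU grep option; setting LC_ALL=C makes it compare raw bytes.)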
