How can I split only a leading part of a string in Perl?

Posted 2024-12-11 17:15:56

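I am parsing a file with long lines whose tokens are separated by whitespace. Before processing most of a line, I want to check whether the n-th token (for small n) has a particular value. I will skip most lines, so there is no need to split most of these very long lines in full. Is there a quick way to do a lazy split in Perl, or do I need to roll my own?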

太阳哥哥 2024-12-18 17:15:56

You can provide a limit argument to the split operator to make Perl stop splitting after a certain number of tokens have been generated.

@fields = split /\s+/, $expression, 4

for example, will put everything after the 3rd whitespace-separated field into the 4th element of @fields. This is more efficient than doing a complete split when the expression has more than four fields.
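As a small illustration of what the limit does, here is a minimal sketch; the sample string is made up for the example and is not from the original question.

```perl
use strict;
use warnings;

# Hypothetical input with six whitespace-separated tokens.
my $expression = "alpha beta gamma delta epsilon zeta";

# With a limit of 4, split stops after producing three fields and
# leaves the rest of the string, unsplit, in the fourth element.
my @fields = split /\s+/, $expression, 4;

print scalar(@fields), "\n";   # number of elements returned
print "$fields[3]\n";          # the unsplit remainder
```

The fourth element here holds "delta epsilon zeta" as a single string, so the work of scanning the tail of a long line is avoided.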

If you do this lazy split and decide that you need to process the line further, you will need to split the line again. Depending on how long the lines are and how frequently you need to reprocess them, you could still come out ahead.
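The filter-then-resplit workflow described above could look roughly like the following sketch. The sample lines, the choice of the second token as the filter field, and the value 'keep' are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: the second token is the one we filter on.
my @lines = (
    "id1 keep 10 20 30 40",
    "id2 skip 11 21 31 41",
    "id3 keep 12 22 32 42",
);

my @kept;
for my $line (@lines) {
    # Cheap pass: split off only the first two tokens; the rest stays unsplit.
    my @head = split /\s+/, $line, 3;
    next unless defined $head[1] && $head[1] eq 'keep';

    # Expensive pass: only reached for lines that passed the filter.
    my @fields = split /\s+/, $line;
    push @kept, \@fields;
}

printf "%d lines kept\n", scalar @kept;
```

If the second full split turns out to matter, one variation is to split only the unsplit remainder ($head[2] in this sketch) and append the result to the tokens already extracted, rather than splitting the whole line again.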

Another approach may be to split a portion of the line you are interested in. For example, if the line contains many fields but you want to filter on the 4th field AND you are sure that the 4th field always occurs before the 100th byte on the line, saying

# Split only the first 100 bytes; assumes the 4th field ends before byte 100.
@fields = split /\s+/, substr($expression, 0, 100);
if (matches_some_condition($fields[3])) {
    # process the whole line
    @fields = split /\s+/, $expression;
    ...
}

and occasionally splitting the expression twice may be more efficient than always splitting the full expression one time.


泼猴你往哪里跑 2024-12-18 17:15:56

perldoc -f split:

If LIMIT is specified and positive, it represents the maximum number of fields the EXPR will be split into, though the actual number of fields returned depends on the number of times PATTERN matches within EXPR.

my $nth = (split ' ', $line, $n + 1)[$n - 1];
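To make the one-liner easier to reuse, it can be wrapped in a small function. This is only a sketch: the name nth_token and the sample log line are hypothetical. Note that the special pattern ' ' skips leading whitespace, whereas /\s+/ would produce an empty first field for a line that starts with spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $n-th (1-based) whitespace-separated token.
sub nth_token {
    my ($line, $n) = @_;
    # A LIMIT of $n + 1 yields at most $n tokens plus one unsplit remainder.
    return (split ' ', $line, $n + 1)[$n - 1];
}

# Made-up sample line with leading whitespace.
my $line = "  GET /index.html 200 1534 extra fields here";
my $status = nth_token($line, 3);
print "$status\n";
```

The limit is $n + 1 rather than $n because with a limit of $n the n-th element would be the whole unsplit remainder of the line instead of a single token.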
