SED command to get the nth tab-separated value between lines x and y

Posted on 2024-10-04 16:44:42

I have been able to extract certain lines from a large tab-separated text file and write them to another file:

sed -n 100,200p file.tsv >> output.txt

However, what I actually want is to take the 8th tab-separated value from each of those lines and write the values to a file, comma-separated. Despite reading dozens of online articles, I cannot find the right syntax for the pattern matching.

Each time, I have basically been trying to capture

$2 in /([^\t]*\t){7}([0-9]*).*/

with no luck.

The lines within the text file file.tsv resemble:

01  name1   title1  summary1    desc1   image1  url1    120019  time1
02  name2   title2  summary2    desc2   image2  url2    576689  time2

Please can anyone help me with this query?

一张白纸 2024-10-11 16:44:42

I think I would rather use awk for this:

$ awk -F'\t' '{ print "col 8 : " $8 }' file

Any further processing will be easier that way, I guess.

梦旅人picnic 2024-10-11 16:44:42

A Perl one-liner:

perl -F'\t' -lane 'push @csv, $F[7] if $. >= 100 && $. <= 200; END { print join ",", @csv if @csv }' /path/to/input/file > /path/to/output/file

聆听风音 2024-10-11 16:44:42

Here it is using GNU sed with extended regular expressions:

sed -nre '100,200{s/^(\S+\s+){7}(\S+).*$/\2/;p}' file.tsv

Here it is using only POSIX syntax:

sed -n '100,200{s/^\([^[:space:]]\{1,\}[[:space:]]\{1,\}\)\{7\}\([^[:space:]]\{1,\}\).*$/\2/;p;}' file.tsv

Both commands treat any run of whitespace as a separator, so they assume that no field is empty and that no field contains spaces.

I do agree with Alf that awk would be a better fit for this.

Here is the awk solution with line limits:

awk -F'\t' 'NR==100,NR==200{print $8}' file.tsv
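
The question also asks for the values to end up comma-separated. The awk approach above can be extended to do the joining itself; the following is only a minimal sketch, assuming a strictly tab-delimited input. The function name `extract_csv` and its argument order are made up for illustration and do not come from the thread.

```shell
# Hypothetical helper (name and argument order are assumptions):
# print field <col> of lines <first>..<last> as one comma-separated line.
extract_csv() {
  # $1 = file, $2 = first line, $3 = last line, $4 = column number
  awk -F'\t' -v first="$2" -v last="$3" -v col="$4" '
    NR > last { exit }                 # past the range: stop reading, END still runs
    NR >= first {                      # inside the requested range
      out = out sep $col               # append the field; sep is empty the first time
      sep = ","
    }
    END { if (sep != "") print out }   # print only if at least one line was in range
  ' "$1"
}
```

It could then be called as `extract_csv file.tsv 100 200 8 >> output.txt`. Setting the separator with `-F'\t'` keeps fields that contain spaces intact, and exiting once the range has passed avoids reading the rest of a large file.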

悲凉≈ 2024-10-11 16:44:42

This version (GNU sed) also works when some fields are empty:

sed -nre '100,200{s/^(([^\t]*)\t){7}([^\t]*)(\t.*|$)/\3/;p}' file.tsv
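
As a possible alternative that also tolerates empty fields, the line selection can stay in sed while `cut` and `paste` handle the column and the commas. This is a sketch rather than anything proposed in the thread, and the function name `extract_csv_cut` is hypothetical.

```shell
# Hypothetical helper (name and argument order are assumptions):
# select a line range with sed, take one tab-separated column with cut,
# then join the resulting lines with commas using paste.
extract_csv_cut() {
  # $1 = file, $2 = first line, $3 = last line, $4 = column number
  sed -n "${2},${3}p" "$1" |   # keep only the requested line range
    cut -f"$4" |               # cut splits on tabs by default and keeps empty fields
    paste -sd, -               # merge all lines into one, separated by commas
}
```

For the case in the question this would be `extract_csv_cut file.tsv 100 200 8 >> output.txt`. Because `cut` counts every tab, an empty field earlier in the line does not shift the column that gets picked.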
