How can I make grep work iteratively inside a "while read line" loop in bash?

Posted on 2025-01-12 19:32:15

I have a text file (capital_names.txt) containing lines like these:

Warsaw_  
London_  
Oslo_  
...

In another file (capital_info.txt) I have the following lines:

London_1_  
London_2  
cityLondon_3  
capitalWarsaw_1  
Warsaw_2  
...

I wanted to write a shell script that greps capital names only if they are in the following format "Name_".
Desired outputs are multiple files like these:

$ cat Warsaw_output.txt  
Warsaw_2

$ cat London_output.txt   
London_1  
London_2  

Here is the key part of the script:

$outp=$"output"
while read line; do
grep ^$line capital_info.txt > $line$outp
done < capital_names.txt

However, the output files are empty (0 bytes) and have the following names:

'Warsaw_'$'\r''output'
'London_'$'\r''output'

When I run individual commands (grep -f ^"London_" capital_info.txt) everything works but I cannot do it for 50000 entries in capital_names.txt manually. How can I solve this issue?

Comments (1)

剩一世无双 2025-01-19 19:32:15

Based on your shown samples and attempts, please try the following awk code. It was written and tested in GNU awk and should work in any awk.

awk '
BEGIN  { FS=OFS="_" }
FNR==NR{
  arr[$1]
  next
}
($1 in arr) && $2~/^[0-9]+$/{
  outFile=($1"_output.txt")
  if(prev!=outFile){ close(prev) }
  print ( $1,$2 ) > (outFile)
  prev=outFile
}
' capital_names.txt capital_info.txt

Explanation: a detailed, line-by-line explanation of the code above.

awk '                                  ##Starting awk program from here.
BEGIN  { FS=OFS="_" }                  ##In BEGIN section of awk setting FS and OFS as _ here.
FNR==NR{                               ##Checking condition FNR==NR then do following.
  arr[$1]                              ##Creating array arr with index of $1.
  next                                 ##next will skip all further statements from here.
}
($1 in arr) && $2~/^[0-9]+$/{          ##Checking if $1 is in arr AND 2nd field is digits.
  outFile=($1"_output.txt")            ##Creating outFile which has output file name in it.
  if(prev!=outFile){ close(prev) }     ##Checking if previous output file name is NOT same as current output file name then close previous one, to avoid too many open files error.
  print ( $1,$2 ) > (outFile)          ##printing 1st and 2nd field to outFile here.
  prev=outFile                         ##Setting prev to outFile value here.
}
' capital_names.txt capital_info.txt   ##Mentioning Input_file names here.
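
The original loop can also be repaired directly. The file names in the question ('Warsaw_'$'\r''output') suggest that capital_names.txt has Windows (CRLF) line endings, so every name read by the loop ends in a carriage return: the grep pattern then never matches, and the \r leaks into the output file name. Separately, $outp=$"output" is not an assignment in bash; it would have to be outp="output". Below is a minimal sketch of a repaired loop, not a definitive implementation. It assumes the names contain no regex metacharacters, and split_by_capital is a hypothetical helper name that does not appear in the original post.

```shell
#!/usr/bin/env bash
# Sketch: write one "<Name_>output.txt" file per line of the names file,
# holding the lines of the info file that start with that name.
# split_by_capital is a hypothetical name chosen for this example.
split_by_capital() {
  local names_file=$1 info_file=$2 suffix="output.txt" name

  # IFS= and -r keep each line intact; the || clause also processes
  # a final line that has no trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    name=${name%$'\r'}           # drop a trailing carriage return, if present
    [ -n "$name" ] || continue   # skip blank lines

    # Strip carriage returns from the info file as well, then keep only
    # lines that begin with the name. Assumption: the name is regex-safe.
    # grep exits with status 1 when nothing matches; that is not an error here.
    tr -d '\r' < "$info_file" | grep -- "^${name}" > "${name}${suffix}" || true
  done < "$names_file"
}
```

Call it as: split_by_capital capital_names.txt capital_info.txt. Matching lines are written unchanged, so London_1_ keeps its trailing underscore. Because this runs grep once per name, the single-pass awk program above is likely to be much faster for 50000 names. If capital_info.txt also has CRLF line endings, removing the carriage returns first (for example with tr -d '\r') may be needed for the awk version too, since its digit test on the second field would otherwise fail.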