How to write a sed script to grep information from a text file
I'm trying to do my homework, which is restricted to using only sed to filter an input file into a certain output format. Here is the input file (named stocks):
Symbol;Name;Volume
================================================
BAC;Bank of America Corporation Com;238,059,612
CSCO;Cisco Systems, Inc.;28,159,455
INTC;Intel Corporation;22,501,784
MSFT;Microsoft Corporation;23,363,118
VZ;Verizon Communications Inc. Com;5,744,385
KO;Coca-Cola Company (The) Common;3,752,569
MMM;3M Company Common Stock;1,660,453
================================================
And the output needs to be:
BAC, CSCO, INTC, MSFT, VZ, KO, MMM
I did come up with a solution, but it's not efficient. Here is my sed script (named try.sed):
/.*;.*;[0-9].*/ { N
N
N
N
N
N
s/\(.*\);.*;.*\n\(.*\);.*;.*\n\(.*\);.*;.*\n\(.*\);.*;.*\n\(.*\);.*;.*\n\(.*\);.*;.*\n\(.*\);.*;.*/\1, \2, \3, \4, \5, \6, \7/gp
}
The command that I run on shell is:
$ sed -nf try.sed stocks
My question is: is there a better way of using sed to get the same result? The script I wrote only works with 7 lines of data. If the data is longer, I have to modify my script again. I'm not sure how I can make it any better, so I'm here asking for help!
Thanks for any recommendations.
Comments (4)
One more way using sed:
Output:
Explanation:
Edit: I've edited my algorithm, since I had neglected to consider the header and footer (I thought they were just for our benefit).
sed, by its design, reads every line of an input file and then runs commands on the lines that match some specification (or on every line if none is given). If you're tailoring your script to a certain number of lines, you're definitely doing something wrong! I won't write you a script since this is homework, but the general idea for one way to go about it is to write a script that does the following. Think of the ordering as the order things should be in a script.
1. Skip the first three lines using d, which deletes the pattern space and immediately moves on to the next line.
2. Replace everything from the first semicolon (;) onward with a comma-and-space (", ") using the s (substitute) command.
3. [...] (H).
That being said, that's just one way to go about it. sed often offers several ways of varying complexity to accomplish a task. A solution I wrote with this method is 10 lines long.
As a note, I don't bother suppressing printing (with -n) or manually printing (with p); each line is printed by default. My script runs like this:
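For the step-by-step outline above (skip the header with d, cut each line at the first semicolon with s, collect the pieces with H), here is a minimal sketch of one possible reading. It is not the answerer's own 10-line script. The file name outline.sed, the scratch directory, and the 1,2d range are assumptions; the range is based on the two header lines in the sample input as shown. The sketch also assumes the last line of the file is the footer, and it relies on default printing rather than -n.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/stocks" <<'EOF'
Symbol;Name;Volume
================================================
BAC;Bank of America Corporation Com;238,059,612
CSCO;Cisco Systems, Inc.;28,159,455
INTC;Intel Corporation;22,501,784
MSFT;Microsoft Corporation;23,363,118
VZ;Verizon Communications Inc. Com;5,744,385
KO;Coca-Cola Company (The) Common;3,752,569
MMM;3M Company Common Stock;1,660,453
================================================
EOF

# outline.sed is a hypothetical file name.
cat > "$workdir/outline.sed" <<'EOF'
# Assumption: two header lines, as in the sample; adjust the range if yours differs.
1,2d
# Every line except the last: keep the symbol plus ", ", append it to the
# hold space, then delete it so nothing is printed yet.
$!{
s/;.*/, /
H
d
}
# Only the last line (the footer) reaches this point: replace it with the
# hold space, join the pieces, and drop the trailing ", ".
g
s/\n//g
s/, $//
EOF

result=$(sed -f "$workdir/outline.sed" "$workdir/stocks")
echo "$result"
rm -rf "$workdir"
```

Because nothing in the script counts data lines, the same script handles any number of rows between the header and the footer.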
This sed command should produce your required output:
OR on Mac:
This might work for you:
Delete the first line: 1d
The data lines contain ;'s, so let's concentrate on those lines: /;/
Delete everything from the first ; to the end of line and then stuff it away in the hold space (HS): {s/;.*//;H}
On the last line, overwrite the pattern space with the HS using the g command, delete the first newline (generated by the H command), replace all subsequent newlines with a comma and a space and print out what's left: ${g;s/.//;s/\n/, /g;q}
Delete everything else: d
Here's a terminal session showing the incremental refinement of building a sed command:
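As a rough illustration of that kind of refinement, the sketch below first prints the symbols one per line and then joins the fragments 1d, /;/, {s/;.*//;H}, ${g;s/.//;s/\n/, /g;q} and d into a single command. The assembled one-liner is an assumption pieced together from those fragments, not a copy of the original session, and it assumes GNU sed; other sed implementations may insist on a ; before each closing }.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/stocks" <<'EOF'
Symbol;Name;Volume
================================================
BAC;Bank of America Corporation Com;238,059,612
CSCO;Cisco Systems, Inc.;28,159,455
INTC;Intel Corporation;22,501,784
MSFT;Microsoft Corporation;23,363,118
VZ;Verizon Communications Inc. Com;5,744,385
KO;Coca-Cola Company (The) Common;3,752,569
MMM;3M Company Common Stock;1,660,453
================================================
EOF

# Step 1: drop line 1, and on lines containing ';' strip everything from
# the first ';' onward and print what is left (one symbol per line).
symbols=$(sed -n '1d;/;/{s/;.*//;p}' "$workdir/stocks")
echo "$symbols"

# Step 2: instead of printing each symbol, append it to the hold space (H).
# On the last line ($) fetch the hold space (g), remove the leading newline,
# turn the remaining newlines into ", ", and quit (q), which prints the result.
# The trailing d discards every other line.
result=$(sed '1d;/;/{s/;.*//;H};${g;s/.//;s/\n/, /g;q};d' "$workdir/stocks")
echo "$result"
rm -rf "$workdir"
```

The address /;/ selects data lines by content instead of by position, so the command does not depend on how many rows the file has.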