How do I delete all lines containing a specific string from a text file?

Posted on 2024-10-26 01:47:11

How would I use sed to delete all lines in a text file that contain a specific string?

Comments (21)

小伙你站住 2024-11-02 01:47:12

To remove the line and print the output to standard out:

sed '/pattern to match/d' ./infile

To directly modify the file – does not work with BSD sed:

sed -i '/pattern to match/d' ./infile

Same, but for BSD sed (Mac OS X and FreeBSD) – does not work with GNU sed:

sed -i '' '/pattern to match/d' ./infile

To directly modify the file (and create a backup) – works with BSD and GNU sed:

sed -i.bak '/pattern to match/d' ./infile
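
Because -i behaves differently across sed implementations, one way to sidestep it is to write the filtered output to a temporary file and copy it back. The following is a minimal sketch, not part of the original answer: delete_lines is a hypothetical helper name, and the sketch assumes that mktemp is available (common on GNU and BSD systems, but not required by POSIX) and that the pattern contains no / character.

```shell
# Hypothetical helper: delete lines matching a basic regular expression
# from a file without relying on the non-standard -i option.
delete_lines() {
    pattern=$1
    target=$2
    tmp=$(mktemp) || return 1
    # Write the filtered copy first; touch the original only on success.
    if sed "/$pattern/d" "$target" > "$tmp"; then
        cat "$tmp" > "$target"   # overwrite in place, keeping permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage sketch on a throwaway file
demo=$(mktemp)
printf '%s\n' 'keep me' 'pattern to match here' 'keep me too' > "$demo"
delete_lines 'pattern to match' "$demo"
cat "$demo"
```

Copying the result back with cat rather than mv keeps the original file's permissions and inode, at the cost of one extra write.
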
落花浅忆 2024-11-02 01:47:12

There are many other ways to delete lines containing a specific string besides sed:

AWK

awk '!/pattern/' file > temp && mv temp file

Ruby (1.9+)

ruby -i.bak -ne 'print if not /pattern/' file

Perl

perl -ni.bak -e "print unless /pattern/" file

Shell (bash 3.2 and later)

while IFS= read -r line
do
  [[ ! $line =~ pattern ]] && printf '%s\n' "$line"
done <file > o
mv o file

GNU grep

grep -v "pattern" file > temp && mv temp file

And of course sed (printing the inverse is faster than actual deletion):

sed -n '/pattern/!p' file
留蓝 2024-11-02 01:47:12

You can use sed to delete lines in place in a file. However, it seems to be much slower than using grep -v to write the non-matching lines to a second file and then moving that file over the original.

e.g.

sed -i '/pattern/d' filename

or

grep -v "pattern" filename > filename2; mv filename2 filename

The first command takes 3 times longer on my machine anyway.

箜明 2024-11-02 01:47:12

The easy way to do it, with GNU sed:

sed --in-place '/some string here/d' yourfile
清风夜微凉 2024-11-02 01:47:12

You may consider using ex (which is a standard Unix command-based editor):

ex +g/match/d -cwq file

where:

  • + executes the given Ex command (see man ex); it is equivalent to -c, which is used here to run wq (write and quit)
  • g/match/d - Ex command to delete lines with given match, see: Power of g

The above example is a POSIX-compliant method for in-place editing a file as per this post at Unix.SE and POSIX specifications for ex.


The difference with sed is that:

sed is a Stream EDitor, not a file editor. (BashFAQ)

Unless you enjoy unportable code, I/O overhead and some other bad side effects. So basically some parameters (such as in-place/-i) are non-standard extensions that GNU and BSD sed implement differently, and they may not be available on other operating systems.

醉南桥 2024-11-02 01:47:12

I was struggling with this on Mac. Plus, I needed to do it using variable replacement.

So I used:

sed -i '' "/$pattern/d" "$file"

where $file is the file where deletion is needed and $pattern is the pattern to be matched for deletion.

I picked the '' from this comment.

The thing to note here is the use of double quotes in "/$pattern/d". The shell does not expand variables inside single quotes.
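
One caveat with interpolating a variable this way is that a / inside $pattern ends the sed address early. A possible workaround, sketched below with hypothetical sample values, is the \cREGEXc address form, where the character after the backslash (here |) becomes the delimiter. The sketch prints to standard output instead of editing in place, and it assumes the pattern contains no | and no other regex metacharacters that should be taken literally.

```shell
# Hypothetical values for illustration
pattern='/usr/local'
file=$(mktemp)
printf '%s\n' 'PATH=/usr/local/bin' 'HOME=/home/me' > "$file"

# \| starts an address that uses | as the delimiter instead of /,
# so the slashes in $pattern need no escaping.
result=$(sed "\\|$pattern|d" "$file")
printf '%s\n' "$result"
rm -f "$file"
```

The same address form should work with the -i variants shown above.
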

望她远 2024-11-02 01:47:12

You can also use this:

 grep -v 'pattern' filename

Here -v inverts the match, so grep prints only the lines that do not match the pattern.

幻想少年梦 2024-11-02 01:47:12

To get an in-place-like result with grep, you can do this:

echo "$(grep -v "pattern" filename)" >filename
深海夜未眠 2024-11-02 01:47:12

Delete matching lines from all files under the current directory:

grep -rl 'text_to_search' . | xargs sed -i '/text_to_search/d'
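
File names containing spaces or newlines can trip up a plain grep | xargs pipeline. The sketch below is one hedged variant that passes NUL-terminated names; it assumes GNU grep (-Z), an xargs that supports -0, and GNU sed (-i without an argument). The directory and file names are made up for the demonstration.

```shell
# Throwaway directory with a file name that contains a space
workdir=$(mktemp -d)
printf '%s\n' 'text_to_search here' 'other line' > "$workdir/my file.txt"
printf '%s\n' 'nothing relevant' > "$workdir/untouched.txt"

# -Z makes grep end each file name with a NUL byte;
# -0 makes xargs split its input on NUL instead of whitespace.
grep -rlZ 'text_to_search' "$workdir" | xargs -0 sed -i '/text_to_search/d'

cat "$workdir/my file.txt"
```

On BSD sed the in-place flag would need an explicit empty suffix (-i ''), as noted in the earlier answers.
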
岛歌少女 2024-11-02 01:47:12

I have made a small benchmark with a file which contains approximately 345 000 lines. The way with grep seems to be around 15 times faster than the sed method in this case.

I have tried both with and without the setting LC_ALL=C; it does not seem to change the timings significantly. The search string (CDGA_00004.pdbqt.gz.tar) is somewhere in the middle of the file.

Here are the commands and the timings:

time sed -i "/CDGA_00004.pdbqt.gz.tar/d" /tmp/input.txt

real    0m0.711s
user    0m0.179s
sys     0m0.530s

time perl -ni -e 'print unless /CDGA_00004.pdbqt.gz.tar/' /tmp/input.txt

real    0m0.105s
user    0m0.088s
sys     0m0.016s

time (grep -v CDGA_00004.pdbqt.gz.tar /tmp/input.txt > /tmp/input.tmp; mv /tmp/input.tmp /tmp/input.txt )

real    0m0.046s
user    0m0.014s
sys     0m0.019s
夜清冷一曲。 2024-11-02 01:47:12

SED:

AWK:

GREP:

苏别ゝ 2024-11-02 01:47:12

You can also delete a range of lines in a file.
For example to delete stored procedures in a SQL file.

sed '/CREATE PROCEDURE.*/,/END ;/d' sqllines.sql

This will remove all lines from CREATE PROCEDURE through END ; (both inclusive).

I have cleaned up many SQL files with this sed command.
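
To make the range behavior concrete, here is a small sketch on made-up SQL content (the statements and the procedure name are illustrative only). It shows that the lines matching both the start and the end address are deleted along with everything between them.

```shell
# Made-up SQL content for the demonstration
sql=$(mktemp)
printf '%s\n' 'SELECT 1;' 'CREATE PROCEDURE p()' 'BEGIN' \
    '  SELECT 2;' 'END ;' 'SELECT 3;' > "$sql"

# Delete from the first address through the second, inclusive
remaining=$(sed '/CREATE PROCEDURE/,/END ;/d' "$sql")
printf '%s\n' "$remaining"
rm -f "$sql"
```

Only the two statements outside the procedure body survive.
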

嘿咻 2024-11-02 01:47:12
perl -i    -nle'/regexp/||print' file1 file2 file3
perl -i.bk -nle'/regexp/||print' file1 file2 file3

The first command edits the file(s) in place (-i).

The second command does the same thing but keeps a copy or backup of the original file(s) by adding .bk to the file names (.bk can be changed to anything).

圈圈圆圆圈圈 2024-11-02 01:47:12

Most of the answers were not useful for me. If you use vim, I found this very easy and straightforward:

:g/<pattern>/d

Source

阿楠 2024-11-02 01:47:12

echo -e "/thing_to_delete\ndd\033:x\n" | vim file_to_edit.txt

耳钉梦 2024-11-02 01:47:12

Curiously enough, the accepted answer does not actually answer the question directly. The question asks about using sed to delete lines containing a string, but the answer seems to presuppose knowledge of how to convert an arbitrary string into a regex.

Many programming language libraries have a function to perform such a transformation, e.g.

python: re.escape(STRING)
ruby: Regexp.escape(STRING)
java:  Pattern.quote(STRING)

But how to do it on the command line?

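Since this is a sed-oriented question, one approach would be to use sed itself. In a basic regular expression the characters that need escaping are \, /, ., *, ^, $ and [. The characters (, {, + and ? are literal there, and escaping them would instead make them special in GNU sed, so they are left alone:

sed 's|[\\/.*^$[]|\\&|g'

So given an arbitrary string $STRING we could write something like:

re=$(sed 's|[\\/.*^$[]|\\&|g' <<< "$STRING")
sed "/$re/d" FILE

or as a one-liner:

 sed "/$(sed 's|[\\/.*^$[]|\\&|g' <<< "$STRING")/d"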

with variations as described elsewhere on this page.
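
When the goal is only to drop lines containing a literal string, an alternative that avoids escaping altogether is grep with -F, which treats the pattern as a fixed string. The sketch below uses a made-up string and made-up input; it prints the surviving lines rather than editing the file.

```shell
STRING='a.b*c[1]'   # hypothetical string full of regex metacharacters
input=$(mktemp)
printf '%s\n' 'literal a.b*c[1] here' 'axbbc1 is not a literal match' > "$input"

# -F: treat the pattern as a fixed string; -v: print non-matching lines;
# --: end option parsing in case the string starts with a dash.
filtered=$(grep -vF -- "$STRING" "$input")
printf '%s\n' "$filtered"
rm -f "$input"
```

Note that grep -v exits with status 1 when every line matches (nothing is printed), which matters if the command is chained with &&.
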

哭泣的笑容 2024-11-02 01:47:12
cat filename | grep -v "pattern" > filename.1
mv filename.1 filename
摇划花蜜的午后 2024-11-02 01:47:12

Just in case someone wants to do it for exact matches of strings, you can use the -w flag in grep (w for whole word). For example, if you want to delete the lines that contain the number 11 but keep the lines with the number 111:

-bash-4.1$ head file
1
11
111

-bash-4.1$ grep -v "11" file
1

-bash-4.1$ grep -w -v "11" file
1
111

It also works with the -f flag if you want to exclude several exact patterns at once. If "blacklist" is a file with one pattern per line that you want to delete from "file":

grep -w -v -f blacklist file
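
A related option is -x, which matches only when the pattern covers the entire line. The sketch below, on made-up input, contrasts the two: -w also removes a line such as "item 11" because 11 appears there as a whole word, whereas -x removes only lines that consist of exactly 11.

```shell
list=$(mktemp)
printf '%s\n' '11' 'item 11' '111' > "$list"

# -w: drop every line that contains 11 as a whole word
word_match=$(grep -w -v '11' "$list")

# -x: drop only lines that are exactly 11
whole_line=$(grep -x -v '11' "$list")

printf 'with -w:\n%s\n' "$word_match"
printf 'with -x:\n%s\n' "$whole_line"
rm -f "$list"
```
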
几味少女 2024-11-02 01:47:12

to show the treated text in console

cat filename | sed '/text to remove/d' 

to save treated text into a file

cat filename | sed '/text to remove/d' > newfile

to append treated text to an existing file

cat filename | sed '/text to remove/d' >> newfile

to filter the already treated text further, in this case removing more lines

cat filename | sed '/text to remove/d' | sed '/remove this too/d' | more

the | more will show text in chunks of one page at a time.

无戏配角 2024-11-02 01:47:12

You can use good old ed to edit a file in a similar fashion to the answer that uses ex. The big difference in this case is that ed takes its commands via standard input, not as command line arguments like ex can. When using it in a script, the usual way to accommodate this is to use printf to pipe commands to it:

printf "%s\n" "g/pattern/d" w | ed -s filename

or with a heredoc:

ed -s filename <<EOF
g/pattern/d
w
EOF
晨曦慕雪 2024-11-02 01:47:12

This solution is for doing the same operation on multiple files.

for file in *.txt; do grep -v "Matching Text" "$file" > temp_file.txt; mv temp_file.txt "$file"; done