How can I use a file in a command and redirect the output to the same file without truncating it?

Posted on 2024-11-23 18:29:51

Basically, I want to read text from a file, remove a line from it, and write the output back to the same file. Something along these lines, if that makes it any clearer:

grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > file_name

However, when I do this I end up with a blank file. Any thoughts?


Comments (14)

我很OK 2024-11-30 18:29:52

Use sponge for this kind of task. It's part of moreutils.

Try this command:

 grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | sponge file_name
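
If sponge is not installed, its core idea (read everything first, write afterwards) can be approximated in plain sh. This is only a rough sketch: soak, demo_dir and demo_file are made-up names for this example, and real sponge does more, such as handling large inputs and preserving permissions.

```shell
#!/bin/sh
# soak is a hypothetical helper name, not a standard tool.
soak() {
    soak_tmp=$(mktemp) || return 1
    # cat only returns once stdin hits end-of-file, i.e. after the upstream
    # command has finished reading its input and closed the pipe.
    cat > "$soak_tmp" && cat "$soak_tmp" > "$1"
    soak_rc=$?
    rm -f "$soak_tmp"
    return "$soak_rc"
}

# Demo on a scratch file (hypothetical paths).
demo_dir=$(mktemp -d) || exit 1
demo_file="$demo_dir/file_name"
printf 'header\nseg10.2 payload\nfooter\n' > "$demo_file"

grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' "$demo_file" | soak "$demo_file"
soak_result=$(cat "$demo_file")
printf '%s\n' "$soak_result"
```

The target file is opened for writing only after the whole pipeline output has been collected, which is what avoids the truncation.
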
童话里做英雄 2024-11-30 18:29:52

You cannot do that because bash processes the redirections first, then executes the command. So by the time grep looks at file_name, it is already empty. You can use a temporary file though.

#!/bin/sh
tmpfile=$(mktemp)
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > "${tmpfile}"
cat "${tmpfile}" > file_name
rm -f "${tmpfile}"

Something like that. Consider using mktemp to create the temporary file, as above, but note that mktemp is not specified by POSIX.
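
A slightly more defensive variant of the temporary-file approach might look like the sketch below. It assumes a mktemp that accepts a template argument (GNU and BSD versions do); demo_dir and demo_file are hypothetical names used only to make the example runnable.

```shell
#!/bin/sh
# Scratch file for the demo (hypothetical paths).
demo_dir=$(mktemp -d) || exit 1
demo_file="$demo_dir/file_name"
printf 'keep me\nseg12.3 drop me\nkeep me too\n' > "$demo_file"

# Create the temporary file next to the target so that mv stays on one filesystem.
tmpfile=$(mktemp "$demo_file.XXXXXX") || exit 1

grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' "$demo_file" > "$tmpfile"
status=$?

# grep exits with 1 when it selects no lines, which is not an error here;
# a status of 2 or more means grep itself failed.
if [ "$status" -le 1 ]; then
    mv "$tmpfile" "$demo_file"
else
    rm -f "$tmpfile"
fi

result=$(cat "$demo_file")
printf '%s\n' "$result"
```

Note the trade-off: mv replaces the file with a new one, so the original permissions and hard links are not kept, whereas the cat approach above rewrites the existing file.
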

人生百味 2024-11-30 18:29:52

Use sed instead:

sed -i '/seg[0-9]\{1,\}\.[0-9]\{1\}/d' file_name
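
Since -i is an extension whose syntax differs between GNU and BSD sed, one form that both accept is to attach a backup suffix directly to the option. A small sketch, where demo_dir and f are hypothetical names for a scratch file:

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
f="$demo_dir/file_name"
printf 'alpha\nseg7.1 beta\ngamma\n' > "$f"

# -i.bak edits in place and keeps the original as file_name.bak.
sed -i.bak '/seg[0-9]\{1,\}\.[0-9]\{1\}/d' "$f"

sed_result=$(cat "$f")
backup_result=$(cat "$f.bak")
printf '%s\n' "$sed_result"
```

The backup file can be removed afterwards once the result has been checked.
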
挖鼻大婶 2024-11-30 18:29:52

Try this simple one:

grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name

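Your file will usually not be blank this time :) and the output is also printed to your terminal. Be aware that this is a race, though: tee truncates file_name when it starts, which may happen before grep has finished reading it, so the result is not guaranteed.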

故人爱我别走 2024-11-30 18:29:52

This is very much possible; you just have to make sure that by the time you write the output, you're writing it to a different file. This can be done by removing the file after opening a file descriptor to it, but before writing to it:

exec 3<file ; rm file; COMMAND <&3 >file ;  exec 3>&-

Or line by line, to understand it better:

exec 3<file       # open a file descriptor reading 'file'
rm file           # remove file (but fd3 will still point to the removed file)
COMMAND <&3 >file # run command, with the removed file as input
exec 3>&-         # close the file descriptor

It's still a risky thing to do, because if COMMAND fails to run properly, you'll lose the file contents. That can be mitigated by restoring the file if COMMAND returns a non-zero exit code (this only recovers whatever COMMAND had not already read from fd 3):

exec 3<file ; rm file; COMMAND <&3 >file || cat <&3 >file ; exec 3>&-

We can also define a shell function to make it easier to use:

# Usage: replace FILE COMMAND
replace() { exec 3<"$1"; rm -- "$1"; "${@:2}" <&3 >"$1" || cat <&3 >"$1"; exec 3>&-; }

Example:

$ echo aaa > test
$ replace test tr a b
$ cat test
bbb

Also, note that this keeps a full copy of the original file on disk (until the third file descriptor is closed). If you're using Linux, and the file you're processing is too big to fit twice on the disk, you can check out this script that will pipe the file to the specified command block-by-block while unallocating the already processed blocks. As always, read the warnings in the usage page.
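
The function above relies on the ${@:2} expansion, which plain POSIX sh does not have. A possible POSIX variant is sketched below; replace_posix, demo_dir and demo_file are hypothetical names, and the restore step is left out to keep the sketch short.

```shell
#!/bin/sh
# replace_posix FILE COMMAND [ARGS...]
# Hypothetical POSIX sh variant of the fd trick: read the old file through
# fd 3 after unlinking it, and write the output to a new file of the same name.
replace_posix() {
    target=$1
    shift
    [ -r "$target" ] || return 1
    exec 3<"$target"        # fd 3 keeps the old contents reachable
    rm -- "$target"         # unlink the name; the data stays until fd 3 is closed
    "$@" <&3 >"$target"     # the redirection creates a brand new file
    rc=$?
    exec 3<&-               # close fd 3, releasing the old data
    return "$rc"
}

demo_dir=$(mktemp -d) || exit 1
demo_file="$demo_dir/test"
echo aaa > "$demo_file"
replace_posix "$demo_file" tr a b
replace_result=$(cat "$demo_file")
printf '%s\n' "$replace_result"
```

The same risk applies as above: if the command fails, the old contents are gone once fd 3 is closed.
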

演出会有结束 2024-11-30 18:29:52

You can't use a redirection operator (> or >>) on the same file, because the shell sets up the redirection first and creates/truncates the file before the command is even invoked. To avoid that, you should use appropriate tools such as tee, sponge, sed -i or any other tool which can write results to the file (e.g. sort file -o file).

Basically, redirecting output back to the file being read doesn't make sense, and you should use an appropriate in-place editor for that, for example the Ex editor (part of Vim):

ex '+g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' -scwq file_name

where:

  • '+cmd'/-c - run any Ex/Vim command
  • g/pattern/d - remove lines matching a pattern using global (help :g)
  • -s - silent mode (man ex)
  • -c wq - execute :write and :quit commands

You can use sed to achieve the same (as already shown in other answers); however, in-place editing (-i) is a non-standard extension (it is not in POSIX and behaves differently between GNU and BSD sed), and sed is basically a stream editor, not a file editor. See: Does Ex mode have any practical use?
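
As a quick illustration of the sort file -o file case mentioned above: sort reads all of its input before it writes, so naming the input file as the -o output is safe. The names demo_dir and f below are hypothetical, and LC_ALL=C is only there to make the ordering predictable.

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
f="$demo_dir/file_name"
printf 'pear\napple\nfig\n' > "$f"

# Unlike "sort file > file", this does not truncate the input first.
LC_ALL=C sort -o "$f" "$f"

sort_result=$(cat "$f")
printf '%s\n' "$sort_result"
```
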

人生戏 2024-11-30 18:29:52

Since this question is the top result in search engines, here's a one-liner based on https://serverfault.com/a/547331 that uses a subshell instead of sponge (which often isn't part of a vanilla install like OS X):

echo "$(grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name)" > file_name

The general case is:

echo "$(cat file_name)" > file_name

Edit: the above solution has some caveats:

  • printf '%s' <string> should be used instead of echo <string> so that files containing -n don't cause undesired behavior.
  • Command substitution strips trailing newlines (this is a bug/feature of shells like bash) so we should append a postfix character like x to the output and remove it on the outside via parameter expansion of a temporary variable like ${v%x}.
  • Using a temporary variable $v stomps the value of any existing variable $v in the current shell environment, so we should nest the entire expression in parentheses to preserve the previous value.
  • Another bug/feature of shells like bash is that command substitution strips unprintable characters like NUL from the output. I verified this by calling dd if=/dev/zero bs=1 count=1 >> file_name and viewing it in hex with cat file_name | xxd -p. But in the output of echo $(cat file_name) | xxd -p the byte is gone. So this answer should not be used on binary files or anything using unprintable characters, as Lynch pointed out.

The general solution (albeit slightly slower, more memory intensive and still stripping unprintable characters) is:

(v=$(cat file_name; printf x); printf '%s' "${v%x}" > file_name)

Test from https://askubuntu.com/a/752451:

printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do (v=$(cat file_uniquely_named.txt; printf x); printf '%s' "${v%x}" > file_uniquely_named.txt); done; cat file_uniquely_named.txt; rm file_uniquely_named.txt

Should print:

hello
world

Whereas calling cat file_uniquely_named.txt > file_uniquely_named.txt in the current shell:

printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do cat file_uniquely_named.txt > file_uniquely_named.txt; done; cat file_uniquely_named.txt; rm file_uniquely_named.txt

Prints an empty string.

I haven't tested this on large files (probably over 2 or 4 GB).

I have borrowed this answer from Hart Simha and kos.
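
To see the trailing-newline caveat in isolation, here is a small sketch comparing a plain command substitution with the sentinel trick. The variable names and the scratch file are hypothetical, and the lengths assume plain ASCII content.

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
f="$demo_dir/file_name"
# 11 characters of text followed by three newlines.
printf 'hello\nworld\n\n\n' > "$f"

plain=$(cat "$f")               # all trailing newlines are stripped here
guarded=$(cat "$f"; printf x)   # the sentinel x protects the trailing newlines
guarded=${guarded%x}            # drop the sentinel again

plain_len=${#plain}
guarded_len=${#guarded}
printf 'plain=%s guarded=%s\n' "$plain_len" "$guarded_len"
```

The plain version loses three characters, which is why writing it back with echo or printf would not reproduce the original file exactly.
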

怪我闹别瞎闹 2024-11-30 18:29:52

A one-liner alternative: store the content of the file in a variable first:

VAR=$(cat file_name); echo "$VAR" | grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' > file_name
月下客 2024-11-30 18:29:52

The following will accomplish the same thing that sponge does, without requiring moreutils:

    shuf --output=file --random-source=/dev/zero 

The --random-source=/dev/zero part tricks shuf into doing its thing without doing any shuffling at all, so it will buffer your input without altering it.

However, it is true that using a temporary file is best, for performance reasons. So, here is a function that I have written that will do that for you in a generalized way:

# Pipes a file into a command, and pipes the output of that command
# back into the same file, ensuring that the file is not truncated.
# Parameters:
#    $1: the file.
#    $2: the command. (With $3... being its arguments.)
# See https://stackoverflow.com/a/55655338/773113

siphon()
{
    local tmp file rc=0
    [ "$#" -ge 2 ] || { echo "Usage: siphon filename [command...]" >&2; return 1; }
    file="$1"; shift
    tmp=$(mktemp -- "$file.XXXXXX") || return
    "$@" <"$file" >"$tmp" || rc=$?
    mv -- "$tmp" "$file" || rc=$(( rc | $? ))
    return "$rc"
}
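
A usage sketch for the function, applied to the grep command from the question. The function body is copied from above so that the snippet runs on its own; demo_dir and demo_file are hypothetical names, and a shell that supports local (bash, dash and most others) is assumed.

```shell
#!/bin/sh
# siphon is repeated here unchanged so that the example is self-contained.
siphon()
{
    local tmp file rc=0
    [ "$#" -ge 2 ] || { echo "Usage: siphon filename [command...]" >&2; return 1; }
    file="$1"; shift
    tmp=$(mktemp -- "$file.XXXXXX") || return
    "$@" <"$file" >"$tmp" || rc=$?
    mv -- "$tmp" "$file" || rc=$(( rc | $? ))
    return "$rc"
}

demo_dir=$(mktemp -d) || exit 1
demo_file="$demo_dir/file_name"
printf 'first\nseg3.5 second\nthird\n' > "$demo_file"

# grep reads the file on stdin; its output replaces the file afterwards.
siphon "$demo_file" grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}'
siphon_rc=$?
siphon_result=$(cat "$demo_file")
printf '%s\n' "$siphon_result"
```

Because the mv runs even when the command fails, it is worth checking the return status. Keep in mind that grep itself returns 1 when it selects no lines.
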
迷爱 2024-11-30 18:29:52

This does the trick pretty nicely in most of the cases I faced:

cat <<< "$(do_stuff_with f)" > f

Note that while $(…) strips trailing newlines, <<< ensures a final newline, so generally the result is magically satisfying.
(Look for “Here Strings” in man bash if you want to learn more.)

Full example:

#! /usr/bin/env bash

get_new_content() {
    sed 's/Initial/Final/g' "${1:?}"
}

echo 'Initial content.' > f
cat f

cat <<< "$(get_new_content f)" > f

cat f

This does not truncate the file and yields:

Initial content.
Final content.

Note that I used a function here for the sake of clarity and extensibility, but that’s not a requirement.

A common use case is JSON editing:

echo '{ "a": 12 }' > f
cat f
cat <<< "$(jq '.a = 24' f)" > f
cat f

This yields:

{ "a": 12 }
{
  "a": 24
}
很糊涂小朋友 2024-11-30 18:29:52

There's also ed (as an alternative to sed -i):

# cf. http://wiki.bash-hackers.org/howto/edit-ed
printf '%s\n' H 'g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' wq |  ed -s file_name
酷到爆炸 2024-11-30 18:29:52

You can use slurp with POSIX Awk:

!/seg[0-9]+\.[0-9]/ {
  q = q ? q RS $0 : $0
}
END {
  print q > ARGV[1]
}

Example
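
One way to run the Awk program is to pass it inline, as sketched below. Note that Awk uses extended regular expressions, where \{ matches a literal brace, so the pattern is written with + here instead of the grep-style \{1,\}. The names demo_dir and f are hypothetical.

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
f="$demo_dir/file_name"
printf 'one\nseg42.7 two\nthree\n' > "$f"

# All kept lines are collected in q; the file is only rewritten in END,
# after awk has finished reading it.
awk '
!/seg[0-9]+\.[0-9]/ {
  q = q ? q RS $0 : $0
}
END {
  print q > ARGV[1]
}
' "$f"

awk_result=$(cat "$f")
printf '%s\n' "$awk_result"
```
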

南冥有猫 2024-11-30 18:29:52

I usually use the tee program to do this:

grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name

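Note that tee does not create a temporary file: it truncates file_name as soon as it starts, so this only works when grep happens to read the file first, and it can leave the file empty.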

过度放纵 2024-11-30 18:29:52

Try this

echo -e "AAA\nBBB\nCCC" > testfile

cat testfile
AAA
BBB
CCC

echo "$(grep -v 'AAA' testfile)" > testfile
cat testfile
BBB
CCC