Replacing an entire paragraph with another paragraph from the Linux command line

Posted 2024-12-13 03:17:41

The problem I have is pretty straightforward (or so it seems). All I want to do is replace a paragraph of text (it's a header comment) with another paragraph. This needs to happen across a varying number of files in a directory hierarchy (a source code tree).

The paragraph to be replaced must be matched in its entirety, as there are similar text blocks in existence.

e.g.

To Replace

// ----------
// header
// comment
// to be replaced
// ----------

With

// **********
// some replacement
// text
// that could have any
// format
// **********

I have looked at using sed, and from what I can tell the most lines it can work on at once is 2 (with the N command).

My question is: how can I do this from the Linux command line?

EDIT:

Solution obtained: the best solution was Ikegami's; it is fully command-line and the best fit for what I wanted to do.

My final solution required some tweaking: the input data contained a lot of special characters, as did the replacement data. To deal with this, the data needs to be preprocessed to insert appropriate \n's and escape characters. The end product is a shell script that takes 3 arguments: a file containing the text to search for, a file containing the text to replace it with, and a folder to search recursively for files with .cc and .h extensions. It's fairly easy to customise from here.

SCRIPT:

#!/bin/bash
# Arguments: <search-file> <replace-file> <folder>
if [ -z "$1" ]; then
    echo 'First parameter is the path to a file that contains the excerpt to be replaced; it must be supplied'
    exit 1
fi

if [ -z "$2" ]; then
    echo 'Second parameter is the path to a file containing the text to replace with; it must be supplied'
    exit 1
fi

if [ -z "$3" ]; then
    echo 'Third parameter is the path to the folder to recursively parse and replace in'
    exit 1
fi

# Escape ] ( ) | \ * $ / & [ in the search text, then join its lines with a literal \n.
# Other regex metacharacters (such as . + ? { }) are not escaped here.
sed 's!\([]()|\*\$\/&[]\)!\\\1!g' "$1" > temp.out
sed ':a;N;$!ba;s/\n/\\n/g' temp.out > final.out
searchString=$(cat final.out)
# Escape ] | \ [ in the replacement text.
sed 's!\([]|\[]\)!\\\1!g' "$2" > replace.out
replaceString=$(cat replace.out)

# GNU find: edit every .cc and .h file below the folder in place.
find "$3" -regex ".*\.\(cc\|h\)" -execdir perl -i -0777pe "s{$searchString}{$replaceString}" {} +

温柔少女心 2024-12-20 03:17:41

find -name '*.pm' -exec perl -i~ -0777pe'
    s{// ----------\n// header\n// comment\n// to be replaced\n// ----------\n}
     {// **********\n// some replacement\n// text\n// that could have any\n// format\n// **********\n};
' {} +

谁许谁一生繁华 2024-12-20 03:17:41

Using perl:

#!/usr/bin/env perl
# script.pl
use strict;
use warnings;
use Inline::Files;

my $lines = join '', <STDIN>; # read stdin
my $repl = join '', <REPL>; # read replacement
my $src = join '', <SRC>; # read source
chomp $repl; # remove trailing \n from $repl
chomp $src; # id. for $src
$lines =~ s@\Q$src\E@$repl@g; # global literal replace; \Q...\E keeps characters such as * in $src from acting as regex syntax
print $lines; # print output

__SRC__
// ----------
// header
// comment
// to be replaced
// ----------
__REPL__
// **********
// some replacement
// text
// that could have any
// format
// **********

Usage: ./script.pl < yourfile.cpp > output.cpp

Requirements: Inline::Files (install from cpan)

Tested on: perl v5.12.4, Linux _ 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

山田美奈子 2024-12-20 03:17:41

This might work:

# cat <<! | sed ':a;N;s/this\nand\nthis\n/something\nelse\n/;ba'
> a
> b
> c
> this
> and
> this
> d
> e
> this
> not
> this
> f
> g
> !
a
b
c 
something
else
d
e
this
not
this 
f
g

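The trick is to slurp everything into the pattern space using N and the loop :a;...;ba.
This is probably more efficient, because it collects the lines in the hold space and runs the substitution only once, on the last line: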

sed '1{h;d};H;$!d;x;s/this\nand\nthis\n/something\nelse\n/g;p;d'

A more general purpose solution may use files for match and substitute data like so:

match=$(sed ':a;N;${s/\n/\\n/g};ba;' match_file)
substitute=$(sed ':a;N;${s/\n/\\n/g};ba;' substitute_file)
sed '1{h;d};H;$!d;x;s/'"$match"'/'"$substitute"'/g;p;d' source_file

Another way (probably less efficient) but cleaner looking:

sed -s '$s/$/\n@@@/' match_file substitute_file | 
sed -r '1{h;d};H;${x;:a;s/^((.*)@@@\n(.*)@@@\n(.*))\2/\1\3/;ta;s/(.*@@@\n){2}//;p};d' - source_file

The last uses the GNU sed --separate option to treat each file as a separate entity. The second sed command uses a loop for the substitute to obviate .* greediness.
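
To make the hold-space variant easier to follow, here is a small annotated sketch. It assumes GNU sed (for \n in the pattern and replacement) and uses made-up sample input; the trailing p;d of the command above is dropped because sed's automatic print does the same job.

```shell
# Collect the whole input in the hold space, then substitute once at the end.
#   1{h;d}  first line: copy it to the hold space, print nothing
#   H       every later line: append it to the hold space
#   $!d     not yet the last line: print nothing, start the next cycle
#   x       last line: swap the collected text into the pattern space
result=$(printf '%s\n' a this and this b this not this c |
    sed '1{h;d};H;$!d;x;s/this\nand\nthis\n/something\nelse\n/g')
printf '%s\n' "$result"
```

Two limits are worth knowing. The pattern ends in \n, so a block that forms the very last lines of the input is not matched, because the collected text has no final newline. And a one-line input prints nothing at all, since 1{h;d} deletes the only line.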

半城柳色半声笛 2024-12-20 03:17:41

As long as the header comments are delimited uniquely (i.e., no other header comment starts with // ----------), and the replacement text is constant, the following awk script should do what you need:

BEGIN { normal = 1 }

/\/\/ ----------/ {
    if (normal) {
        normal = 0;
        print "// **********";
        print "// some replacement";
        print "// text";
        print "// that could have any";
        print "// format";
        print "// **********";
    } else {
        normal = 1;
        next;
    }
}

{
    if (normal) print;
}

This prints everything it sees until it runs into the paragraph delimiter. When it sees the first one, it prints out the replacement paragraph. Until it sees the 2nd paragraph delimiter, it will print nothing. When it sees the 2nd paragraph delimiter, it will start printing lines normally again with the next line.

While you can technically do this from the command line, you may run into tricky shell quoting issues, especially if the replacement text has any single quotes. It may be easier to put the script in a file. Just put #!/usr/bin/awk -f (or whatever path "which awk" returns) at the top.

EDIT

To match multiple lines in awk, you'll need to use getline. Perhaps something like this:

/\/\/ ----------/ {
    lines[0] = "// header";
    lines[1] = "// comment";
    lines[2] = "// to be replaced";
    lines[3] = "// ----------";

    linesRead = $0;
    for (i = 0; i < 4; i++) {
         # stop at end of input, keeping what was read so far
         if ((getline line) <= 0) {
             print linesRead;
             exit;
         }
         linesRead = linesRead "\n" line;
         if (line != lines[i]) {
             print linesRead; # print partial matches
             next;
         }
    }

    # print the replacement paragraph here
    next;
}

{ print } # every other line passes through unchanged
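
A complete, runnable version of the getline idea might look like the sketch below. It is not the answerer's script: the names want, buf and prog, the shortened replacement text and the sample input are all illustrative. One assumption is added on top of the original idea: when the line that breaks a partial match is itself the opening delimiter, matching restarts from that line, so a similar block sitting directly before the real header does not hide it.

```shell
# Keep the awk program in a file to avoid shell quoting problems.
prog=$(mktemp)
cat > "$prog" <<'EOF'
BEGIN {
    want[1] = "// ----------"; want[2] = "// header"; want[3] = "// comment"
    want[4] = "// to be replaced"; want[5] = "// ----------"
    n = 5
}
$0 == want[1] {
    buf = $0
    i = 2
    while (i <= n) {
        if ((getline line) <= 0) { print buf; exit }   # input ended mid-block
        if (line == want[i]) { buf = buf "\n" line; i++; continue }
        print buf                                      # partial match: emit it unchanged
        if (line != want[1]) { print line; next }      # ordinary line: back to normal flow
        buf = line; i = 2                              # this line may open the real block
    }
    print "// **********"
    print "// some replacement"
    print "// text"
    print "// **********"
    next
}
{ print }
EOF

# A similar block (with "// other") followed directly by the block to replace.
result=$(printf '%s\n' 'int x;' \
    '// ----------' '// header' '// other' '// ----------' \
    '// ----------' '// header' '// comment' '// to be replaced' '// ----------' \
    'int y;' | awk -f "$prog")
rm -f "$prog"
printf '%s\n' "$result"
```

The restart rule is sufficient for this particular header because the delimiter line is the only line that can begin a match. A pattern with more internal repetition would need a more general fallback.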
