Using sed in a shell script to take a substring at a fixed position and replace the field with it

Posted on 2025-01-27 09:48:00

I'm processing data in a text file, and I can't find a way with sed to select a substring at a fixed position and replace the original field with it.

This is what I have:

X|001200000000000000000098765432|1234567890|TQ

This is what I need:

'X','00000098765432','1234567890','TQ'

The sed command below is what I have so far. It adds the quotes and commas, but it does not put the substring I need (00000098765432) in place of the full second field:

echo " X|001200000000000000000098765432|1234567890|TQ" | sed "s/ *//g;s/|/','/g;s/^/'/;s/$/'/"

Could you help me?

Comments (6)

你是我的挚爱i 2025-02-03 09:48:00

Rather than sed, I would use awk for this.

echo "X|001200000000000000000098765432|1234567890|TQ" | awk 'BEGIN {FS="|";OFS=","} {print $1,substr($2,17,14),$3,$4}'

Gives output:

X,00000098765432,1234567890,TQ

Here is how it works:

FS = Field separator (in the input)

OFS = Output field separator (the way you want output to be delimited)

BEGIN -> think of it as the place where configuration is set. It runs only once, before any input is read. Here it says that the input is pipe-delimited and the output should be comma-delimited.

substr($2,17,14) -> Take $2 (the second field; awk counts fields from 1) and apply a substring to it: 17 is the starting character position and 14 is the number of characters to take from that position onwards.

In my opinion, this is much more readable and maintainable than the sed version you have.
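The explanation above can be tried out as a small shell function. This is only a sketch: the name extract_fields and the idea of passing the start position and length as arguments are assumptions added for illustration, with defaults taken from the 17 and 14 used in the answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): keep `len`
# characters of field 2, beginning at character position `start`.
# Reads pipe-delimited records on stdin, writes comma-delimited records.
extract_fields() {
    awk -v start="${1:-17}" -v len="${2:-14}" '
        BEGIN { FS = "|"; OFS = "," }   # runs once, before any input is read
        { print $1, substr($2, start, len), $3, $4 }
    '
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | extract_fields
# prints X,00000098765432,1234567890,TQ
```

Calling it as extract_fields 2 3 would instead keep three characters of the second field starting at its second character.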

小姐丶请自重 2025-02-03 09:48:00

If you want to put the quotes in, I'd still use awk.

$: awk -F'|' 'BEGIN{q="\047"} {print q $1 q "," q substr($2,17,14) q "," q $3 q "," q $4 q}' <<< "X|001200000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'

If you just want to use sed, note that the second field has 16 characters in front of the part you want to keep, so the pattern has to skip 16 characters.

$: sed -E "s/^(.)[|].{16}([^|]+)[|]([^|]+)[|]([^|]+)/'\1','\2','\3','\4'/" <<< "X|001200000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
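
As a sketch of how the sed approach could be made a little stricter, the function below takes the number of characters to skip as an argument and uses [^|] instead of . so the skipped prefix can never run across a field boundary. The name quote_records and the default of 16 (the prefix length in the sample data) are assumptions for illustration, not part of the answer.

```shell
#!/bin/sh
# Hypothetical helper: quote a 4-field pipe-delimited record and drop the
# first `skip` characters of field 2 (default 16, from the sample data).
quote_records() {
    skip="${1:-16}"
    # [^|]{n} cannot cross a field boundary, unlike .{n}
    sed -E "s/^([^|]*)[|][^|]{${skip}}([^|]*)[|]([^|]*)[|]([^|]*)\$/'\\1','\\2','\\3','\\4'/"
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | quote_records
# prints 'X','00000098765432','1234567890','TQ'
```

Records that do not fit the pattern, for example because the second field is shorter than the skip length, pass through unchanged, which makes malformed lines easy to spot in the output.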
下壹個目標 2025-02-03 09:48:00

Using GNU sed (\? and \| are GNU extensions to basic regular expressions):

$ sed "s/|\(0[0-9]\{15\}\)\?/','/g;s/^\|$/'/g" input_file
'X','00000098765432','1234567890','TQ'
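
For a sed without the GNU extensions, the same idea can be sketched with POSIX basic regular expressions only, by splitting the work into separate substitutions. It assumes, as the command above does, that the part to drop is a 0 followed by 15 more digits right after the first pipe; the function name quote_posix is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper using only POSIX BRE syntax (no \? or \|).
quote_posix() {
    # 1) first pipe plus the 16-digit prefix becomes ','
    # 2) every remaining pipe becomes ','
    # 3) and 4) add the opening and closing quote
    sed "s/|0[0-9]\{15\}/','/;s/|/','/g;s/^/'/;s/\$/'/"
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | quote_posix
# prints 'X','00000098765432','1234567890','TQ'
```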
何止钟意 2025-02-03 09:48:00

Using any POSIX awk:

$ echo 'X|001200000000000000000098765432|1234567890|TQ' |
awk -F'|' -v OFS="','" -v q="'" '{sub(/.{16}/,"",$2); print q $0 q}'
'X','00000098765432','1234567890','TQ'
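
The command above relies on awk rebuilding $0 with OFS whenever a field is assigned to, which is why print q $0 q comes out with ',' between the fields. A minimal illustration of that behaviour, using made-up sample input:

```shell
#!/bin/sh
# $0 is untouched here, so the original "|" separators survive
printf 'a|b|c\n' | awk -F'|' -v OFS=',' '{ print $0 }'
# prints a|b|c

# assigning to any field (even $1 = $1) makes awk rejoin $0 using OFS
printf 'a|b|c\n' | awk -F'|' -v OFS=',' '{ $1 = $1; print $0 }'
# prints a,b,c
```

In the answer, the sub() call on $2 is the assignment that triggers the rebuild.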
阿楠 2025-02-03 09:48:00

Not as elegant as I hoped for, but it gets the job done:

'X','00000098765432','1234567890','TQ'

    # gawk profile, created Mon May  9 21:19:17 2022
    # BEGIN rule(s)

    'BEGIN {
     1     _ = sprintf("%*s", (__ = +2)^++__+--__*++__,__--)
     1            gsub(".", "[0-9]", _)
     1             sub("$",     "$", _)
     1    FS = "[|]"
     1   OFS = "\47,\47"
    }

    # Rule(s)

     1     (NF *= NF == __*__) * sub(_,  "|&",   $__) * \
        sub("^.*[|]", "", $__) * sub(".+", "\47&\47")    }'

Tested and confirmed working on GNU gawk 5.1.1, mawk 1.3.4, mawk 1.9.9.6, and macOS nawk.

The 4Chan Teller

离笑几人歌 2025-02-03 09:48:00
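awk -v del1="\047" \
    -v del2="," \
    -v start="3" \
    -v len="16" \
    '{
         # drop len characters beginning at position start (the prefix of field 2)
         $0 = substr($0, 1, start - 1) substr($0, start + len)
         # turn every pipe into quote, comma, quote
         gsub(/[|]/, del1 del2 del1)
         # wrap the whole record in quotes
         print del1 $0 del1
    }' input_file

'X','00000098765432','1234567890','TQ'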