sed: delete lines containing any alphabetical sequence

Posted 2025-01-31 02:17:39

I'm trying to remove all lines that contain any three characters in alphabetical order, using sed. Is there an easy way to do this instead of writing a long list of patterns?

sed -i '/abc/d 
       /bcd/d
       ....
      /xyz/d' file.txt

狼性发作 2025-02-07 02:17:40

This might work for you (GNU sed):

sed -En '1{x;s/^/abcdefghijklmnopqrstuvwxyz/;x};G;/(...).*\n.*\1/!P' file

On the first line, introduce a literal alphabet in the hold space.

On each line, append the alphabet and, using a three-character back reference, compare the line to the alphabet.

If there is a match, the line is deleted; otherwise, only the first line of the pattern space (the original input line) is printed.

N.B. The -n option turns off implicit printing, so a line is printed only when the match fails.
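
To see the command on concrete input, here is a small sketch. It assumes GNU sed; the wrapper name drop_alpha_runs and the sample lines are made up for illustration and are not part of the original answer.

```shell
# Hypothetical wrapper around the sed command from the answer (GNU sed assumed).
# Reads the named files, or standard input when no file is given.
drop_alpha_runs() {
  sed -En '1{x;s/^/abcdefghijklmnopqrstuvwxyz/;x};G;/(...).*\n.*\1/!P' "$@"
}

# Made-up sample input: "xabcx" and "uvw" each contain a run of three
# consecutive letters, the other three lines do not.
printf '%s\n' 'xabcx' 'hello' 'uvw' 'no' 'zyx' | drop_alpha_runs
```

With this input the command should print hello, no and zyx. Lines shorter than three characters can never match and are always kept, and only lowercase runs are detected because the alphabet in the hold space is lowercase.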

绅士风度i 2025-02-07 02:17:39

Based on your attempted code, please try the following awk code, which avoids writing out every combination of consecutive letters. IMHO awk will be much more efficient than sed here.

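awk '
BEGIN{
  # An empty FS makes every character its own field (gawk behavior; unspecified by POSIX).
  FS=""
  # Map each lowercase letter to its position in the alphabet: a=1 ... z=26.
  num=split("a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z",arr1,",")
  for(i=1;i<=num;i++){ letters[arr1[i]]=i }
}
{
  for(i=1;i<=NF;i++){
    # All three fields must be letters whose positions increase by exactly one.
    if(($i in letters) && ($(i+1) in letters) && ($(i+2) in letters)\
    && (letters[$i]+1==letters[$(i+1)]) && (letters[$i]+2==letters[$(i+2)])\
    && (letters[$(i+1)]+1==letters[$(i+2)])){
       # Print the matching three-letter run.
       print $i $(i+1) $(i+2)
    }
  }
}
'  Input_file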

Explanation: a simple, detailed walkthrough of the whole awk program:

Explanation of BEGIN block of awk program:

Set the field separator (FS) to the empty string for all lines, so that every character becomes its own field and can be compared to find three consecutive letters.
Then use awk's split function to create an array named arr1 holding all lowercase letters, with , as the delimiter.
Then run a for loop up to the value of num (which could also be written as 26, since the number of letters is fixed) to build an array named letters, indexed by letter, whose value is that letter's position in the alphabet (e.g. for a it will be 1).
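
To see what the BEGIN block builds, the lookup table can be printed on its own. This is a minimal sketch: the function name show_letter_positions is hypothetical, and the letters chosen for printing are arbitrary.

```shell
# Hypothetical helper: rebuild the letters table from the BEGIN block and
# print the letter count plus the positions of a, m and z.
show_letter_positions() {
  awk 'BEGIN{
    num=split("a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z",arr1,",")
    for(i=1;i<=num;i++){ letters[arr1[i]]=i }
    print num, letters["a"], letters["m"], letters["z"]
  }'
}

show_letter_positions
```

This should print 26 1 13 26: split returns the number of elements, and each letter maps to its 1-based position.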

Explanation of main block of awk program:

Run a for loop from the first field to NF, that is, over every field of the current line.
Then check the conditions: whether the current field and the next two fields are all in the letters array, and whether their positions are consecutive.
If all conditions are met, print the current field and the next two fields (which prints the three letters).
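
The awk program in this answer prints the three-letter runs it finds rather than removing lines. If the goal is the one in the question, dropping every line that contains such a run, a variant might look like the sketch below. It is not the answer's method: it swaps the per-field lookup for a substring test with index() against a literal alphabet, so it does not rely on FS="". The function name drop_alpha_runs_awk and the sample input are hypothetical, and only lowercase runs are considered.

```shell
# Hypothetical variant: print only the lines that contain no run of three
# consecutive lowercase letters (abc, bcd, ..., xyz).
drop_alpha_runs_awk() {
  awk '
  BEGIN { alphabet = "abcdefghijklmnopqrstuvwxyz" }
  {
    # Slide a three-character window over the line; a window that occurs
    # inside the alphabet string is a consecutive run, so skip the line.
    for (i = 1; i + 2 <= length($0); i++) {
      if (index(alphabet, substr($0, i, 3))) next
    }
    print
  }' "$@"
}

# Made-up sample input.
printf '%s\n' 'xabcx' 'hello' 'uvw' 'no' 'zyx' | drop_alpha_runs_awk
```

With this input the function should print hello, no and zyx. To rewrite a file, send the output to a temporary file and move it over the original, since plain awk has no counterpart to sed -i.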