使用 awk 忽略转义分隔符（逗号）？

发布于 2024-08-05 09:26:27 字数 146 浏览 1 评论 0原文

如果我有一个带有转义逗号的字符串，如下所示：

a,b,{c\,d\,e},f,g

我如何使用 awk 将其解析为以下项目？

a
b
{c\,d\,e}
f
g

原文

If I had a string with escaped commas like so:

a,b,{c\,d\,e},f,g

How might I use awk to parse that into the following items?

a
b
{c\,d\,e}
f
g

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

执手闯天涯 2024-08-12 09:26:27

{
   split($0, a, /,/)
   j=1
   for(i=1; i<=length(a); ++i) {
      if(match(b[j], /\\$/)) {
         b[j]=b[j] "," a[i]
      } else {
         b[++j] = a[i]
      }
   }
   for(k=2; k<=length(b); ++k) {
      print b[k]
   }
}

拆分为数组 a，使用 ',' 作为分隔符
从 a 构建数组 b，合并以结尾的行'\'
打印数组 b （注意：从 2 开始，因为第一项为空）

此解决方案假定（目前）','是唯一用 '\' 转义的字符 - 也就是说，不需要处理输入中的任何 \\，也不需要处理诸如 < 之类的奇怪组合代码>\\\,\\,\\\\,,\,。

{
   split($0, a, /,/)
   j=1
   for(i=1; i<=length(a); ++i) {
      if(match(b[j], /\\$/)) {
         b[j]=b[j] "," a[i]
      } else {
         b[++j] = a[i]
      }
   }
   for(k=2; k<=length(b); ++k) {
      print b[k]
   }
}

Split into array a, using ',' as delimiter
Build array b from a, merging lines that end in '\'
Print array b (Note: Starts at 2 since first item is blank)

This solution presumes (for now) that ',' is the only character that is ever escaped with '\'--that is, there is no need to handle any \\ in the input, nor weird combinations such as \\\,\\,\\\\,,\,.

回复收藏 0 原文

握住你手 2024-08-12 09:26:27

{
  gsub("\\\\,", "!Q!")
  n = split($0, a, ",")
  for (i = 1; i <= n; ++i) {
    gsub("!Q!", "\\,", a[i])
    print a[i]
  }
}

{
  gsub("\\\\,", "!Q!")
  n = split($0, a, ",")
  for (i = 1; i <= n; ++i) {
    gsub("!Q!", "\\,", a[i])
    print a[i]
  }
}

回复收藏 0 原文

烟花肆意 2024-08-12 09:26:27

我不认为 awk 对这样的事情有任何内置支持。这里有一个解决方案，它不像 DigitalRoss 的那么短，但应该不会有意外碰到你的编弦的危险（！Q！）。由于它使用 if 进行测试，因此您还可以扩展它以小心字符串末尾是否确实有 \\,，这应该是转义斜杠, 不是逗号。

BEGIN {
    FS = ","
}

{
    curfield=1
    for (i=1; i<=NF; i++) {
        if (substr($i,length($i)) == "\\") {
            fields[curfield] = fields[curfield] substr($i,1,length($i)-1) FS
        } else {
            fields[curfield] = fields[curfield] $i
            curfield++
        }
    }
    nf = curfield - 1
    for (i=1; i<=nf; i++) {
        printf("%d: %s   ",i,fields[i])
    }
    printf("\n")
}

I don't think awk has any built-in support for something like this. Here's a solution that's not nearly as short as DigitalRoss's, but should have no danger of ever accidentally hitting your made-up string (!Q!). Since it tests with an if, you could also extend it to be careful about whether you actually have \\, at the end of your string, which should be an escaped slash, not comma.

BEGIN {
    FS = ","
}

{
    curfield=1
    for (i=1; i<=NF; i++) {
        if (substr($i,length($i)) == "\\") {
            fields[curfield] = fields[curfield] substr($i,1,length($i)-1) FS
        } else {
            fields[curfield] = fields[curfield] $i
            curfield++
        }
    }
    nf = curfield - 1
    for (i=1; i<=nf; i++) {
        printf("%d: %s   ",i,fields[i])
    }
    printf("\n")
}

回复收藏 0 原文

~没有更多了~