What is the best way to parse this particular string using awk / sed?

Posted 2024-10-15 04:22:22

I need to get a particular version string from a file (call it version.lst) and compare it against another version string in a shell script. For example, the file contains lines that look like this:

V1.000 -- build date and other info here -- APP1
V1.000 -- build date and other info here -- APP2
V1.500 -- build date and other info here -- APP3

.. and so on. Let's say I am trying to grab the first version (in this case, V1.000) from APP1. Obviously, the versions can change and I want this to be dynamic. What I have right now works:

var=`cat version.lst | grep " -- APP1" | grep -Eo 'V[0-9]\.[0-9]{3}'`

Pipe to grep will get the line containing APP1 and the second pipe to grep will get the version string. However, I hear grep is not the way to do this so I'd like to learn the best way using awk or sed. Any ideas? I am new to both and haven't found a tutorial easy enough to learn the syntax of it. Do they support egrep? Thanks!

Comments (3)

司马昭之心 2024-10-22 04:22:22

Try this to get the complete version:

#!/bin/sh
app=APP1
var=$(awk -v "app=$app" '$NF == app {print $1}' version.lst)

or to get only the major version number, the last line could be:

var=$(awk -v "app=$app" '$NF == app {split($1,a,"."); print a[1]}' version.lst)

Using sed to get the complete version:

var=$(sed -n "/ $app\$/s/^\([^ ]*\).*/\1/p" version.lst)

or this to get only the major version number:

var=$(sed -n "/ $app\$/s/^\([^.]*\).*/\1/p" version.lst)

Explanations:

The second AWK command:

  • -v "app=$app" - set an AWK variable equal to a shell variable
  • $NF == app - if the last field is equal to the contents of the variable (NF is the number of fields, so $NF is the contents of the NFth field)
  • {split($1,a,".") - then split the first field at the dot
  • print a[1] - and print the first part of the result of the split
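
To see the two AWK commands run end to end, here is a minimal sketch that builds a throwaway copy of version.lst from the sample lines in the question. The scratch directory and the variable names `full` and `major` are assumptions made for illustration, and `mktemp -d` is assumed to be available on your system:

```shell
#!/bin/sh
# Work in a scratch directory so a real version.lst is not overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Sample data copied from the question.
cat > version.lst <<'EOF'
V1.000 -- build date and other info here -- APP1
V1.000 -- build date and other info here -- APP2
V1.500 -- build date and other info here -- APP3
EOF

app=APP3

# Full version: first field of the line whose last field equals $app.
full=$(awk -v "app=$app" '$NF == app {print $1}' version.lst)

# Major version: the part of the first field before the dot.
major=$(awk -v "app=$app" '$NF == app {split($1, a, "."); print a[1]}' version.lst)

echo "full=$full major=$major"   # prints: full=V1.500 major=V1
```

Because the comparison is `$NF == app`, the match is an exact string comparison on the last field, so characters in `$app` are never treated as regular expression syntax.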

The sed commands:

  • -n - don't print any output unless directed to
  • "/ $app\$/ - for any line that ends with (\$) a space followed by the contents of the shell variable $app (note that double quotes are used to allow the variable to be expanded, and it's a good idea to escape the second dollar sign)
  • s/^\([^ ]*\).*/\1/p" - starting at the beginning of the line (^), capture (\(\)) the sequence of zero or more (*) non-space characters ([^ ]) (or non-dots in the second version), then match but don't capture the rest of the characters on the line (.*); replace the matched text (the whole line in this case) with the captured string (the version number; \1 refers to the first, and here only, capture group) and print it (p)
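
The sed commands can be tried the same way. This is only a sketch: the scratch directory, the variable names `full` and `major`, and the extra `MYAPP1` line are hypothetical additions, the last one included to show why the leading space in the address matters. Unlike the AWK version, `$app` is pasted into the regular expression here, so it is assumed to contain no regex metacharacters or slashes:

```shell
#!/bin/sh
# Work in a scratch directory so a real version.lst is not overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Sample data from the question, plus one hypothetical line (MYAPP1)
# whose name merely ends in "APP1".
cat > version.lst <<'EOF'
V1.000 -- build date and other info here -- APP1
V1.000 -- build date and other info here -- APP2
V1.500 -- build date and other info here -- APP3
V2.000 -- build date and other info here -- MYAPP1
EOF

app=APP1

# Full version: everything before the first space on the matching line.
full=$(sed -n "/ $app\$/s/^\([^ ]*\).*/\1/p" version.lst)

# Major version: everything before the first dot on the matching line.
major=$(sed -n "/ $app\$/s/^\([^.]*\).*/\1/p" version.lst)

echo "full=$full major=$major"   # prints: full=V1.000 major=V1
```

The `MYAPP1` line is skipped because the address requires a space immediately before `APP1` at the end of the line.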

昨迟人 2024-10-22 04:22:22

If I understood correctly: egrep "APP1$" version.lst | awk '{print $1}'
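
As a side note, the same filtering can probably be done by awk alone, since an awk pattern is already an extended regular expression; `egrep` is also marked obsolescent in favor of `grep -E`. A minimal sketch, assuming the sample data from the question and a scratch directory created just for the demo:

```shell
#!/bin/sh
# Work in a scratch directory so a real version.lst is not overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Sample data copied from the question.
cat > version.lst <<'EOF'
V1.000 -- build date and other info here -- APP1
V1.000 -- build date and other info here -- APP2
V1.500 -- build date and other info here -- APP3
EOF

# awk selects lines ending in APP1 and prints their first field,
# so no separate grep process is needed.
var=$(awk '/APP1$/ {print $1}' version.lst)

echo "$var"   # prints: V1.000
```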

鸠书 2024-10-22 04:22:22
$ awk '/^V1\.00.* APP1$/{print $NF}' version.lst
APP1

That regular expression matches lines that start with "V1.00", followed by any number of any other characters, ending with " APP1". The backslash in the middle there might be really important--it matches only ".", and so it excludes (probably corrupt) lines that might begin with, say, "V1a00". The space before "APP1" excludes things like "APP2_APP1".

"NF" is an automatically generated variable that contains the number of fields in the input line. It's also the number of the last field, which happens to be the one you're interested in.

There are a couple of ways to prune off the "V1". Here's one way, although you and I might not be talking about quite the same thing.

$ awk '/^V1\.00.* APP1$/{print substr($1, 1, index($1, ".") - 1), $NF}' version.lst
V1 APP1
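
If the goal is the version rather than the application name, the same `substr`/`index` idea can be applied while matching only on the application, so the version is not hard-coded in the pattern. This is a sketch under that assumption; the variable name `parts` and the scratch directory are made up for the example:

```shell
#!/bin/sh
# Work in a scratch directory so a real version.lst is not overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Sample data copied from the question.
cat > version.lst <<'EOF'
V1.000 -- build date and other info here -- APP1
V1.000 -- build date and other info here -- APP2
V1.500 -- build date and other info here -- APP3
EOF

# Match lines ending in " APP1", find the dot in the first field,
# and print the text before and after it.
parts=$(awk '/ APP1$/ {d = index($1, "."); print substr($1, 1, d - 1), substr($1, d + 1)}' version.lst)

echo "$parts"   # prints: V1 000
```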