How to compare RPM versions in Python

Posted on 2024-09-09 04:14:35

I'm trying to compare two lists of RPMs (currently installed, and available in a local repository) to see which RPMs are out of date. I've been tinkering with regexes, but there are so many different naming conventions for RPMs that I can't get a good list to work with. I don't have the actual RPM files on my drive, so I can't do rpm -qif.

import re

pattern1 = re.compile(r'^([a-zA-Z0-9_\-\+]*)-([a-zA-Z0-9_\.]*)-([a-zA-Z0-9_\.]*)\.(.*)')
for rpm in listOfRpms:
    packageInfo = pattern1.search(rpm[0]).groups()
    print(packageInfo)

This works for the vast majority, but not all (about 2300 of 2400):

  yum-metadata-parser-1.1.2-2.el5
('yum-metadata-parser', '1.1.2', '2', 'el5')   <- what I need

But none of these, for instance, work unless I break others that worked before:

  • wvdial-1.54.0-3
  • xdelta-1.1.3-20
  • xdelta-1.1.3-20_2
  • xmlsec1-1.2.6-3
  • xmlsec1-1.2.6-3_2
  • ypbind-1.17.2-13
  • ypbind-1.17.2-8
  • ypserv-2.13-14
  • zip-2.3-27
  • zlib-1.2.3-3
  • zlib-1.2.3-3_2
  • zsh-4.2.6-1

Comments (6)

美人迟暮 2024-09-16 04:14:35

In RPM parlance, 2.el5 is the release field; 2 and el5 are not separate fields. However, as your examples show, the release need not contain a '.'. Drop the \.(.*) from the end of the pattern to capture the release field in one shot.
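
As a minimal sketch of that change, the pattern below keeps the question's character classes and only drops the trailing \.(.*), anchoring the end with $ instead. The names NVR and split_nvr are made up for illustration, and the sample labels are taken from the question.

```python
import re

# Same character classes as the question's pattern, minus the trailing \.(.*)
NVR = re.compile(r'^([a-zA-Z0-9_\-\+]*)-([a-zA-Z0-9_\.]*)-([a-zA-Z0-9_\.]*)$')

def split_nvr(label):
    """Return (name, version, release), or None if the label does not match."""
    m = NVR.match(label)
    return m.groups() if m else None

for label in ["yum-metadata-parser-1.1.2-2.el5", "xdelta-1.1.3-20_2", "zsh-4.2.6-1"]:
    print(split_nvr(label))
# ('yum-metadata-parser', '1.1.2', '2.el5')
# ('xdelta', '1.1.3', '20_2')
# ('zsh', '4.2.6', '1')
```

Returning None for a non-matching label, rather than calling .groups() directly, makes it easy to list the leftovers that still need attention.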

So now you have a package name, version, and release. The easiest way to compare them is to use rpm's Python module:

import rpm
# t1 and t2 are (version, release) tuples.
# The result is 1 if t1 is newer, 0 if they are equal, -1 if t2 is newer.
def compare(t1, t2):
    v1, r1 = t1
    v2, r2 = t2
    return rpm.labelCompare(('1', v1, r1), ('1', v2, r2))

What's that extra '1', you ask? That's the epoch, and it overrides all other version comparison considerations. Further, it's generally not available in the filename. Here we fake it as '1' on both sides for the purposes of this exercise, but that may not be accurate at all. This is one of two reasons your logic is going to be off if you're going by file names alone.

The other reason that your logic may be different from rpm's is the Obsoletes field, which allows a package to be upgraded to a package with an entirely different name. If you're OK with these limitations, then proceed.
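
With those limitations in mind, here is a rough sketch of the comparison the question asks for: walking the installed list and flagging packages whose repository copy is newer. The function find_outdated, the stand-in demo_compare, and the available data are all hypothetical; in real use you would pass rpm.labelCompare as the compare argument.

```python
def find_outdated(installed, available, compare):
    """Return names whose available (version, release) is newer than the installed one.

    installed / available: dicts mapping name -> (version, release)
    compare: function taking two (epoch, version, release) tuples and
             returning 1, 0 or -1, for example rpm.labelCompare
    """
    outdated = []
    for name, (version, release) in sorted(installed.items()):
        if name not in available:
            continue  # nothing to compare against
        new_version, new_release = available[name]
        # The epoch is unknown from the name alone, so the same
        # placeholder goes on both sides.
        if compare(('1', new_version, new_release), ('1', version, release)) > 0:
            outdated.append(name)
    return outdated


def demo_compare(a, b):
    # Stand-in for rpm.labelCompare so the sketch runs without the rpm module.
    # It only handles purely numeric, dot-separated fields.
    def key(evr):
        return [[int(p) for p in field.split('.')] for field in evr]
    ka, kb = key(a), key(b)
    return (ka > kb) - (ka < kb)


installed = {'zlib': ('1.2.3', '3'), 'zsh': ('4.2.6', '1')}
available = {'zlib': ('1.2.3', '4'), 'zsh': ('4.2.6', '1')}  # made-up repository data
print(find_outdated(installed, available, demo_compare))  # ['zlib']
```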

If you don't have the rpm Python library at hand, here's the logic for comparing each of release, version, and epoch as of rpm 4.4.2.3:

  • Search each string for alphabetic fields [a-zA-Z]+ and numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
  • Successive fields in each string are compared to each other.
  • Alphabetic sections are compared lexicographically, and the numeric sections are compared numerically.
  • In the case of a mismatch where one field is numeric and one is alphabetic, the numeric field is always considered greater (newer).
  • In the case where one string runs out of fields, the other is always considered greater (newer).

See lib/rpmvercmp.c in the RPM source for the gory details.
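
The rules listed above can be sketched in a few lines of Python. This is an approximation for illustration, not a port of rpmvercmp.c: the names _FIELD, _fields and vercmp_sketch are invented here, and anything beyond the listed rules is not handled.

```python
import re
from itertools import zip_longest

# Alphabetic or numeric runs; everything else is treated as separator junk.
_FIELD = re.compile(r'[a-zA-Z]+|[0-9]+')

def _fields(s):
    """Split a version or release string into (is_numeric, value) pairs."""
    return [(1, int(f)) if f.isdigit() else (0, f) for f in _FIELD.findall(s)]

def vercmp_sketch(a, b):
    """Return 1 if a is newer, -1 if b is newer, 0 if they compare equal."""
    for fa, fb in zip_longest(_fields(a), _fields(b)):
        if fa == fb:
            continue
        if fa is None:      # a ran out of fields first, so b is newer
            return -1
        if fb is None:      # b ran out of fields first, so a is newer
            return 1
        # (1, n) > (0, s): a numeric field beats an alphabetic one;
        # otherwise ints compare numerically and strings lexicographically.
        return 1 if fa > fb else -1
    return 0

print(vercmp_sketch("1.10", "1.9"))      # 1
print(vercmp_sketch("3.7.4", "3.7.4a"))  # -1
print(vercmp_sketch("20", "20_2"))       # -1
print(vercmp_sketch("2.el5", "2.el5"))   # 0
```

Tagging each field with 0 or 1 lets ordinary tuple comparison apply the "numeric beats alphabetic" rule without ever comparing an int against a str.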

何以笙箫默 2024-09-16 04:14:35

Here's a working program based off of rpmdev-vercmp from the rpmdevtools package. You shouldn't need anything special installed but yum (which provides the rpmUtils.miscutils python module) for it to work.

The advantage over the other answers is you don't need to parse anything out, just feed it full RPM name-version strings like:

$ ./rpmcmp.py bash-3.2-32.el5_9.1 bash-3.2-33.el5.1
0:bash-3.2-33.el5.1 is newer
$ echo $?
12

Exit status 11 means the first one is newer, 12 means the second one is newer.

#!/usr/bin/python
# rpmUtils.miscutils is provided by yum.

import rpm
import sys
from rpmUtils.miscutils import stringToVersion

if len(sys.argv) != 3:
    print("Usage: %s <rpm1> <rpm2>" % sys.argv[0])
    sys.exit(1)

def vercmp(evr1, evr2):
    # Each argument is an (epoch, version, release) tuple.
    return rpm.labelCompare(evr1, evr2)

(e1, v1, r1) = stringToVersion(sys.argv[1])
(e2, v2, r2) = stringToVersion(sys.argv[2])

rc = vercmp((e1, v1, r1), (e2, v2, r2))
if rc > 0:
    print("%s:%s-%s is newer" % (e1, v1, r1))
    sys.exit(11)
elif rc == 0:
    print("These are equal")
    sys.exit(0)
else:
    print("%s:%s-%s is newer" % (e2, v2, r2))
    sys.exit(12)

夕色琉璃 2024-09-16 04:14:35

Since the Python rpm package seems quite outdated and is not available via pip, I wrote a small implementation that works for most package versions, including the special handling of the ~ sign. It won't cover 100% of the real implementation, but it does the trick for most packages:

import re

def rpm_sort(elements):
    """ sort list elements using 'natural sorting': 1.10 > 1.9 etc...
        taking into account special characters for rpm (~) """

    # Sort order for non-digit characters; '~' comes first so it sorts lowest.
    # Any character missing from this string (such as '_' or '+') raises ValueError.
    alphabet = "~0123456789abcdefghijklmnopqrstuvwxyz-."

    def convert(text):
        return [int(text)] if text.isdigit() else ([alphabet.index(letter) for letter in text.lower()] if text else [1])

    def alphanum_key(key):
        return [convert(c) for c in re.split('([0-9]+)', key)]
    return sorted(elements, key=alphanum_key)

Tested with (inside a unittest.TestCase):

rpms = ['my-package-0.2.1-0.dev.20180810',
        'my-package-0.2.2-0~.dev.20181011',
        'my-package-0.2.2-0~.dev.20181012',
        'my-package-0.2.2-0',
        'my-package-0.2.2-0.dev.20181217']
self.assertEqual(rpms, rpm_sort(rpms))

Not covered:

So far I know of only one case that is not covered, though others may turn up: this code sorts word~ after word, whereas according to the rpm specification it should sort before it (this affects any string that ends in letters followed by a final ~).

秉烛思 2024-09-16 04:14:35

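RPM has Python bindings, and yum's rpmUtils.miscutils.compareEVR builds on them. Each argument is an (epoch, version, release) tuple: the middle element is the version and the third is the release (the packaging version). Note that the first element is the epoch, not the package name; the "foo" in the session below sits in the epoch slot and is identical on both sides, so it does not affect the result. In the example below, I'm trying to figure out where 3.7.4a gets sorted.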

[root@rhel56 ~]# python
Python 2.4.3 (#1, Dec 10 2010, 17:24:35) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpmUtils.miscutils
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4", "1"))
0
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4a", "1")) 
-1
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4a", "1"), ("foo", "3.7.4", "1")) 
1

清风不识月 2024-09-16 04:14:35

Based on Owen S's excellent answer, I put together a snippet that uses the system RPM bindings if available, but falls back to a regex based emulation otherwise:

import re

try:
    from rpm import labelCompare as _compare_rpm_labels
except ImportError:
    # Emulate RPM field comparisons
    #
    # * Search each string for alphabetic fields [a-zA-Z]+ and
    #   numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
    # * Successive fields in each string are compared to each other.
    # * Alphabetic sections are compared lexicographically, and the
    #   numeric sections are compared numerically.
    # * In the case of a mismatch where one field is numeric and one is
    #   alphabetic, the numeric field is always considered greater (newer).
    # * In the case where one string runs out of fields, the other is always
    #   considered greater (newer).

    import warnings
    warnings.warn("Failed to import 'rpm', emulating RPM label comparisons")

    try:
        from itertools import zip_longest
    except ImportError:
        from itertools import izip_longest as zip_longest

    _subfield_pattern = re.compile(
        r'(?P<junk>[^a-zA-Z0-9]*)((?P<text>[a-zA-Z]+)|(?P<num>[0-9]+))'
    )

    def _iter_rpm_subfields(field):
        """Yield subfields as 2-tuples that sort in the desired order

        Text subfields are yielded as (0, text_value)
        Numeric subfields are yielded as (1, int_value)
        """
        for subfield in _subfield_pattern.finditer(field):
            text = subfield.group('text')
            if text is not None:
                yield (0, text)
            else:
                yield (1, int(subfield.group('num')))

    def _compare_rpm_field(lhs, rhs):
        # Short circuit for exact matches (including both being None)
        if lhs == rhs:
            return 0
        # Otherwise assume both inputs are strings
        lhs_subfields = _iter_rpm_subfields(lhs)
        rhs_subfields = _iter_rpm_subfields(rhs)
        for lhs_sf, rhs_sf in zip_longest(lhs_subfields, rhs_subfields):
            if lhs_sf == rhs_sf:
                # When both subfields are the same, move to next subfield
                continue
            if lhs_sf is None:
                # Fewer subfields in LHS, so it's less than/older than RHS
                return -1
            if rhs_sf is None:
                # More subfields in LHS, so it's greater than/newer than RHS
                return 1
            # Found a differing subfield, so it determines the relative order
            return -1 if lhs_sf < rhs_sf else 1
        # No relevant differences found between LHS and RHS
        return 0


    def _compare_rpm_labels(lhs, rhs):
        lhs_epoch, lhs_version, lhs_release = lhs
        rhs_epoch, rhs_version, rhs_release = rhs
        result = _compare_rpm_field(lhs_epoch, rhs_epoch)
        if result:
            return result
        result = _compare_rpm_field(lhs_version, rhs_version)
        if result:
            return result
        return _compare_rpm_field(lhs_release, rhs_release)

Note that I haven't tested this extensively for consistency with the C level implementation - I only use it as a fallback implementation that's at least good enough to let Anitya's test suite pass in environments where system RPM bindings aren't available.

壹場煙雨 2024-09-16 04:14:35

A much simpler regex is /^(.+)-(.+)-(.+)\.(.+)\.rpm$/

I'm not aware of any restrictions on the package name (first capture). The only restriction on version and release is that they do not contain '-'. There is no need to encode this in the pattern: the uncaptured '-'s separate those fields, so a field containing a '-' would be split rather than captured whole, and the resulting captures therefore never contain a '-'. Only the first capture, the name, can contain '-', because it greedily consumes all the extra '-' characters first.

Then there's the architecture: this regex assumes no restriction on the architecture name, except that it does not contain a '.'.

The capture results are [name, version, release, arch]
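
For illustration, here is that pattern applied in Python. The helper name split_filename is invented, and the file names are hypothetical: they are built from package labels in the question with an architecture and .rpm suffix added.

```python
import re

# The pattern from this answer, written as a Python raw string.
RPM_FILENAME = re.compile(r'^(.+)-(.+)-(.+)\.(.+)\.rpm$')

def split_filename(filename):
    """Return (name, version, release, arch), or None if the filename does not match."""
    m = RPM_FILENAME.match(filename)
    return m.groups() if m else None

print(split_filename("yum-metadata-parser-1.1.2-2.el5.x86_64.rpm"))
# ('yum-metadata-parser', '1.1.2', '2.el5', 'x86_64')
print(split_filename("zlib-1.2.3-3_2.i386.rpm"))
# ('zlib', '1.2.3', '3_2', 'i386')
print(split_filename("zsh-4.2.6-1"))
# None: no architecture or .rpm suffix
```

Note that the labels listed in the question carry no architecture or .rpm suffix, so this pattern applies to full file names rather than to that list as given.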

Caveats from Owen's answer about relying on the rpm name alone still apply.

Now you have to compare the version strings, which is not straightforward. I don't believe that can be done with a regex. You'd need to implement the comparison algorithm.
