Differences between the operator and non-operator versions of Python set operations

Posted 2025-01-03 23:29:18

What is the difference between using the intersection() method and the & operator on Python sets? I read that in previous versions the arguments to & had to be sets and not just any iterable, although that seems to be no longer the case.

Is there a difference in terms of semantics, constraints, performance or simply Pythonic style?

Comments (4)

瘫痪情歌 2025-01-10 23:29:18

The methods can be bound to names for later use, whereas the operators can be replaced by the operations in the operator module for the purpose of larger abstraction.
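As a minimal sketch of the point above (the set contents and the variable names are illustrative, not from the answer), a bound method can be stored and called later, while `operator.and_` is the function form of `&` and can be handed to a higher-order function such as `functools.reduce`:

```python
import operator
from functools import reduce

# Illustrative sets, chosen only for this example.
a = {1, 2, 3, 4}
b = {3, 4, 5}
c = {4, 5, 6}

# A bound method is an ordinary callable that remembers the set it came from.
common_with_a = a.intersection
bound_result = common_with_a(b)  # same as a.intersection(b)

# operator.and_ is the function form of the & operator, so it can be
# passed to higher-order functions such as functools.reduce.
reduced_result = reduce(operator.and_, [a, b, c])  # (a & b) & c

print(sorted(bound_result))    # [3, 4]
print(sorted(reduced_result))  # [4]
```

Which form is more convenient likely depends on whether the code already holds a specific set (bound method) or works on an arbitrary sequence of sets (operator function).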

琉璃梦幻 2025-01-10 23:29:18

There is no difference in functionality, although using the operators is a little faster because Python special-cases access to these methods. The performance difference in most programs is not so great as to demand that the operators be used.

木槿暧夏七纪年 2025-01-10 23:29:18

Methods like intersection() will accept any iterable, whereas the operators will only accept set types.

The info is below the method description in the docs:

Note, the non-operator versions of union(), intersection(),
difference(), and symmetric_difference(), issubset(), and issuperset()
methods will accept any iterable as an argument. In contrast, their
operator based counterparts require their arguments to be sets.
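A short sketch of that constraint, using made-up values: the method takes a plain string as its iterable, while the operator with the same string raises `TypeError` until the operand is converted to a set.

```python
letters = set("abc")

# The method accepts any iterable, here a plain string.
via_method = letters.intersection("cbs")

# The operator requires both operands to be sets (or frozensets);
# with a string on the right-hand side Python raises TypeError.
try:
    letters & "cbs"
    operator_failed = False
except TypeError:
    operator_failed = True

# Converting explicitly makes the operator form work.
via_operator = letters & set("cbs")

print(sorted(via_method))    # ['b', 'c']
print(operator_failed)       # True
print(sorted(via_operator))  # ['b', 'c']
```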

逆光下的微笑 2025-01-10 23:29:18

Here are some timings in Python 3, measured on a 3.7 GHz CPU.

intersection is the only operation where I would say the difference in performance is negligible. The other operations seem faster in the "non-operator" version, which uses its flexibility to accept any iterable as an argument (not just an explicit set).

So if the choice is otherwise arbitrary, converting an existing non-set iterable into a set just to use the operator version may (obviously or not) cost some performance.

Setup

import random

nums = random.choices(range(1, 10000), k=10000) #this is a list
all_nums = set(range(1, 10000))

union |

%timeit all_nums.union(nums)
248 µs ± 4.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit all_nums | set(nums)
409 µs ± 5.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

difference -

%timeit all_nums.difference(nums)
387 µs ± 2.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit all_nums - set(nums)
451 µs ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

intersection &

%timeit all_nums.intersection(nums)
477 µs ± 4.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit all_nums & set(nums)
479 µs ± 2.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

symmetric_difference ^

%timeit all_nums.symmetric_difference(nums)
421 µs ± 840 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit all_nums ^ set(nums)
557 µs ± 1.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

I should add that the performance impact shown above is due to explicitly creating the set. If the argument is already a set, @kindall has the right answer (operator version may be faster).
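To check the already-a-set case on your own machine, something like the following sketch could be used. It reuses the setup above, converts `nums` once outside the timed code (the name `nums_set` and the repeat count are assumptions for illustration), and uses the standard `timeit` module instead of IPython's `%timeit`. No timings are claimed here; the numbers depend on the machine and the Python build.

```python
import random
import timeit

nums = random.choices(range(1, 10000), k=10000)  # a list, as in the setup above
all_nums = set(range(1, 10000))
nums_set = set(nums)  # conversion done once, outside the timed code

# Both forms compute the same result when the argument is already a set.
same_result = all_nums.union(nums_set) == (all_nums | nums_set)

# Time each form; absolute numbers depend on the machine and Python build.
method_time = timeit.timeit(lambda: all_nums.union(nums_set), number=200)
operator_time = timeit.timeit(lambda: all_nums | nums_set, number=200)

print(same_result)
print(f"union(): {method_time:.4f} s for 200 calls")
print(f"|:       {operator_time:.4f} s for 200 calls")
```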
