What is the purpose of the zip function (as in Python or C# 4.0)?

Posted 2024-08-24 19:35:02

Someone asked How to do Python’s zip in C#?...

...which leads me to ask, what good is zip? In what scenarios do I need this? Is it really so foundational that I need this in the base class library?

Comments (7)

雨夜星沙 2024-08-31 19:35:03

Someone actually asked a question here fairly recently that I answered with the Zip extension method, so it's obviously important for some people. ;)

Actually, it is a fairly important operation for mathematical algorithms - matrices, curve fitting, interpolation, pattern recognition, that sort of thing. Also very important in engineering applications like digital signal processing where much of what you do is combine multiple signals or apply linear transforms to them - both are based on the sample index, hence, zip it. Zipping two sequences is far, far faster than sorting and joining them based on some key, especially when you know in advance that the sequences have the same number of elements and are in the same order.
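To make the index-based combination concrete, here is a minimal sketch of mixing two signals sample by sample. The `mix` helper, the gains, and the sample values are invented for illustration and are not part of the original answer.

```python
def mix(signal_a, signal_b, gain_a=0.5, gain_b=0.5):
    """Combine two equally long sample sequences element by element."""
    # zip pairs sample i of signal_a with sample i of signal_b.
    return [gain_a * a + gain_b * b for a, b in zip(signal_a, signal_b)]


# Hypothetical sample data: two signals captured on the same sample index.
left = [0.0, 1.0, 2.0, 3.0]
right = [4.0, 3.0, 2.0, 1.0]

mixed = mix(left, right)
print(mixed)
```

No key lookup or sort is involved: the pairing relies purely on position, which is why it only makes sense when both sequences are already aligned.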

I can't get into tight specifics here on account of my current employment, but speaking generally, this is also valuable for telemetry data - industrial, scientific, that sort of thing. Often you'll have time sequences of data coming from hundreds or thousands of points - parallel sources - and you need to aggregate, but horizontally, over devices, not over time. At the end, you want another time sequence, but with the sum or average or some other aggregate of all the individual points.

It may sound like a simple sort/group/join in SQL Server (for example) but it's actually really hard to do efficiently this way. For one thing, the timestamps may not match exactly, but you don't care about differences of a few milliseconds, so you end up having to generate a surrogate key/row number and group on that - and of course, the surrogate row number is nothing more than the time index which you already had. Zipping is simple, fast, and infinitely parallelizable.
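The horizontal aggregation described above might look roughly like this in Python. This is only a sketch: the device names and readings are hypothetical, and it assumes every series has already been aligned to the same time index.

```python
# Hypothetical readings: one list per device, all sampled on the same time index.
device_a = [10, 12, 11]
device_b = [20, 18, 19]
device_c = [30, 33, 30]

series = [device_a, device_b, device_c]

# zip(*series) yields one tuple per time step, holding every device's sample.
totals = [sum(samples) for samples in zip(*series)]
averages = [sum(samples) / len(samples) for samples in zip(*series)]

print(totals)
print(averages)
```

The result is again a time sequence, one aggregate per time step, and no surrogate key is needed because the position in each list already acts as the time index.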

I don't know if I'd call it foundational, but it is important. I don't use the Reverse method very often either, but by the same token I'm glad I don't have to keep writing it myself on those rare occasions when I do find a need for it.

One of the reasons it might not seem that useful to you now is that .NET 3.5 does not have tuples. .NET 4 does have them (System.Tuple), and when you're working with tuples, zipping really is a fundamental operation because ordering is strictly enforced.

攒一口袋星星 2024-08-31 19:35:03

A use case:

>>> fields = ["id", "name", "location"]
>>> values = ["13", "bill", "redmond"]
>>> dict(zip(fields, values))
{'location': 'redmond', 'id': '13', 'name': 'bill'}

Try doing this without zip...
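For comparison, here is one way the same dictionary could be built by hand. This is just a sketch of the index-juggling that zip saves you; the names `record_manual` and `record_zipped` are made up for this example.

```python
fields = ["id", "name", "location"]
values = ["13", "bill", "redmond"]

# Without zip: walk the indices by hand and look up both lists.
record_manual = {}
for i in range(min(len(fields), len(values))):
    record_manual[fields[i]] = values[i]

# With zip: the pairing is expressed directly.
record_zipped = dict(zip(fields, values))

print(record_manual == record_zipped)
```

Both versions stop at the shorter of the two lists, but the manual one has to say so explicitly.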

勿挽旧人 2024-08-31 19:35:03

zip is useful if you'd like to iterate over multiple iterables simultaneously, which is a reasonably common scenario in Python.

One real-world scenario where zip has come in handy for me is if you have an M by N array, and you want to look at columns instead of rows. For example:

>>> five_by_two = ((0, 1), (1, 2), (2, 3), (3, 4), (4, 5))
>>> two_by_five = tuple(zip(*five_by_two))
>>> two_by_five
((0, 1, 2, 3, 4), (1, 2, 3, 4, 5))

明月夜 2024-08-31 19:35:03

It allows you to process sequences in parallel instead of sequentially or nested. There's... so many uses for it that they currently escape me.
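A small sketch of the three processing styles, using made-up sample lists, may help show the difference:

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3]

# Parallel: one pass, pairing items that share a position (3 pairs).
parallel = [(l, n) for l, n in zip(letters, numbers)]

# Nested: every combination of the two sequences (9 pairs).
nested = [(l, n) for l in letters for n in numbers]

# Sequential: one sequence after the other (6 items, no pairing at all).
sequential = letters + numbers

print(parallel)
print(len(nested), len(sequential))
```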

红衣飘飘貌似仙 2024-08-31 19:35:03

Here's a common use case for zip:

x = [1,2,3,4,5]
y = [6,7,8,9,0]

for a,b in zip(x,y):
    print(a, b)

Which would output:

1 6
2 7
3 8
4 9
5 0

酒解孤独 2024-08-31 19:35:03

It's handy in different places. My favorite, from http://norvig.com/python-iaq.html, is transposing a matrix:

>>> x = [ [1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]]
>>> list(zip(*x))
[(1, 6, 11), (2, 7, 12), (3, 8, 13), (4, 9, 14), (5, 10, 15)]

記憶穿過時間隧道 2024-08-31 19:35:03

Here's a case where I used zip() to useful effect, in a Python class for comparing version numbers:

class Version(object):

    # ... snip ...

    def get_tuple(self):
        return (self.major, self.minor, self.revision)

    def compare(self, other):
        def comp(a, b):
            if a == '*' or b == '*':
                return 0
            elif a == b:
                return 0
            elif a < b:
                return -1
            else:
                return 1
        return tuple(comp(a, b) for a, b in zip(self.get_tuple(), Version(other).get_tuple()))

    def is_compatible(self, other):
        tup = self.compare(other)
        return (tup[0] == 0 and tup[1] == 0)

    def __eq__(self, other):
        return all(x == 0 for x in self.compare(other))

    def __ne__(self, other):
        return any(x != 0 for x in self.compare(other))

    def __lt__(self, other):
        for x in self.compare(other):
            if x < 0:
                return True
            elif x > 0:
                return False
        return False

    def __gt__(self, other):
        for x in self.compare(other):
            if x > 0:
                return True
            elif x < 0:
                return False
        return False

I think zip(), coupled with all() and any(), makes the comparison operator implementations particularly clear and elegant. Sure, it could have been done without zip(), but then the same could be said about practically any language feature.
