What should itertools.product() yield when supplied an empty list?

Posted on 2024-09-07 22:14:50

I guess it's an academic question, but the second result does not make sense to me. Shouldn't it be as thoroughly empty as the first? What is the rationale for this behavior?

from itertools import product

one_empty = [ [1,2], [] ]
all_empty = []

print(list(product(*one_empty)))  # []
print(list(product(*all_empty)))  # [()]

Updates

Thanks for all of the answers -- very informative.

Wikipedia's discussion of the Nullary Cartesian Product provides a definitive statement:

The Cartesian product of no sets ... is the singleton set containing the empty tuple.

And here is some code you can use to work through the insightful answer from sth:

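from itertools import product

def tproduct(*xss):
    # Each argument must be an iterable of tuples; the tuples of every
    # combination are concatenated into one flat tuple.
    return ( sum(rs, ()) for rs in product(*xss) )

def tup(x):
    return (x,)

xs = [ [1, 2],     [3, 4, 5]       ]
ys = [ ['a', 'b'], ['c', 'd', 'e'] ]

# Build real lists: in Python 3, map() returns a one-shot iterator,
# which would already be exhausted when b is computed.
txs = [ [tup(v) for v in x] for x in xs ]  # [[(1,), (2,)], [(3,), (4,), (5,)]]
tys = [ [tup(v) for v in y] for y in ys ]  # [[('a',), ('b',)], [('c',), ('d',), ('e',)]]

a = list(tproduct( *(txs + tys) ))
b = list(tproduct( tproduct(*txs), tproduct(*tys) ))

assert a == b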

Comments (2)

电影里的梦 2024-09-14 22:14:50

From a mathematical point of view the product over no elements should yield the neutral element of the operation product, whatever that is.

For example, on integers the neutral element of multiplication is 1, since 1 ⋅ a = a for all integers a. So an empty product of integers should be 1. When implementing a Python function that returns the product of a list of numbers, this happens naturally:

def iproduct(lst):
  result = 1
  for i in lst:
    result *= i
  return result

For the correct result to be calculated with this algorithm, result needs to be initialized with 1. This leads to a return value of 1 when the function is called on an empty list.

This return value is also very reasonable for the purpose of the function. With a good product function it shouldn't matter whether you first concatenate two lists and then build the product of the elements, or first build the product of each individual list and then multiply the results:

iproduct(xs + ys) == iproduct(xs) * iproduct(ys)

If xs or ys is empty that only works if iproduct([]) == 1.
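
The identity can be checked directly, including the case where one of the lists is empty. This is only a minimal sketch: the sample lists are arbitrary, and math.prod (available from Python 3.8) is used purely as a cross-check on the empty product.

```python
import math

def iproduct(lst):
    # Start from 1, the neutral element of multiplication.
    result = 1
    for i in lst:
        result *= i
    return result

xs = [2, 3]
ys = []  # the empty case

# Concatenate-then-multiply agrees with multiply-then-combine.
assert iproduct(xs + ys) == iproduct(xs) * iproduct(ys)

# The empty product is 1, and the standard library agrees.
assert iproduct([]) == 1
assert math.prod([]) == 1

print(iproduct(xs + ys), iproduct(ys))  # 6 1
```

Had result been initialized with 0 instead, every product would collapse to 0 and the identity would hold only trivially.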

Now for the more complicated product() on iterators. Here too, from a mathematical point of view, product() called with no arguments should return the neutral element of the operation, whatever that is. It is not [], since list(product([], xs)) == [], whereas a neutral element e would have to satisfy product(e, xs) == xs. It turns out, though, that [()] isn't a neutral element either:

>>> list(product([()], [1,2,3]))
[((), 1), ((), 2), ((), 3)]

In fact, product() is not a very nice mathematical product at all, since the analogue of the equation above doesn't hold:

product(*(xs + ys)) != product(product(*xs), product(*ys))

Each application of product() generates an additional layer of tuples and there is no way around that, so there can't even be a real neutral element. [()] comes pretty close, though: it doesn't add or remove any elements, it just pairs each element with an empty tuple.
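
The extra tuple layer is easy to see by comparing the two sides of the inequality on small inputs. The lists below are made up for illustration only.

```python
from itertools import product

xs = [[1, 2], [3, 4]]
ys = [['a'], ['b', 'c']]

# One application of product(): flat 4-tuples.
flat = list(product(*(xs + ys)))

# Two nested applications: pairs of 2-tuples.
nested = list(product(product(*xs), product(*ys)))

print(flat[0])    # (1, 3, 'a', 'b')
print(nested[0])  # ((1, 3), ('a', 'b'))
assert flat != nested

# [()] keeps the number of elements unchanged, but each result
# gains an empty tuple as its first component.
with_unit = list(product([()], [1, 2, 3]))
assert len(with_unit) == 3
assert with_unit[0] == ((), 1)
```

Both lists have the same length and carry the same information; only the nesting differs.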

[()] would in fact be the neutral element of this slightly adapted product function, which only operates on lists of tuples but doesn't add an additional tuple layer on each application:

def tproduct(*xss):
  # the parameters have to be lists of tuples
  return (sum(rs, ()) for rs in product(*xss))

For this function the above product equation holds:

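def tup(x): return (x,)
# Build real lists: in Python 3, map() returns a one-shot iterator.
txs = [[tup(v) for v in x] for x in xs]
tys = [[tup(v) for v in y] for y in ys]
# tproduct() returns generators, so materialize both sides before comparing.
list(tproduct(*(txs + tys))) == list(tproduct(tproduct(*txs), tproduct(*tys)))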

With the additional preprocessing step of packing the elements of the input lists into tuples, tproduct() gives the same result as product(), but behaves more nicely from a mathematical point of view. Its neutral element is also [()].

So [()] makes some sense as the neutral element of this kind of list multiplication. Even if it doesn't exactly fit product(), it is a good choice for this function, since it allows tproduct() to be defined, for example, without introducing a special case for empty input.
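
That [()] really acts as a neutral element for tproduct() can be verified in a few lines. This is a small sketch with arbitrary sample inputs; tproduct() is repeated here so the snippet runs on its own.

```python
from itertools import product

def tproduct(*xss):
    # Every argument must be an iterable of tuples.
    return (sum(rs, ()) for rs in product(*xss))

txs = [(1,), (2,)]
tys = [('a',), ('b',)]

plain = list(tproduct(txs, tys))
left_unit = list(tproduct([()], txs, tys))   # [()] on the left
right_unit = list(tproduct(txs, tys, [()]))  # [()] on the right

# Multiplying by [()] changes nothing.
assert plain == left_unit == right_unit

# With no arguments, tproduct() yields exactly the neutral element.
assert list(tproduct()) == [()]

print(plain)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
```

The last assertion relies on product() with no arguments yielding a single empty tuple, which is exactly the behaviour the question asks about.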

双马尾 2024-09-14 22:14:50

As @sth already indicated, this behaviour is correct from a mathematical viewpoint. All you really need to convince yourself of is that list(itertools.product()) should have exactly one element, since once you know that it's clear what that element should be: it's got to be (for consistency) a tuple of length 0, and there's only one of those.

But the number of elements of itertools.product(l1, l2, l3, ...) should just be the product of the lengths of l1, l2, l3, and so on. So the number of elements of itertools.product() should be the size of the empty product, and there's no shortage of internet sources that should persuade you that the empty product is 1.
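
The counting argument can be checked mechanically. In this sketch, product_size is a hypothetical helper name introduced only for illustration, and math.prod assumes Python 3.8 or later.

```python
import math
from itertools import product

def product_size(*lists):
    # Hypothetical helper: count the tuples product() yields.
    return len(list(product(*lists)))

cases = [
    [[1, 2], [3, 4, 5]],  # 2 * 3 = 6 tuples
    [[1, 2], []],         # 2 * 0 = 0 tuples
    [],                   # empty product = 1 tuple
]

for lists in cases:
    expected = math.prod(len(l) for l in lists)
    assert product_size(*lists) == expected

print([product_size(*lists) for lists in cases])  # [6, 0, 1]
```

The second and third cases are the two from the question: one empty factor gives zero tuples, while no factors at all gives exactly one.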

I just wanted to point out that this is the correct practical definition as well as the correct mathematical one; that is, it's the definition that's most likely to 'just work' in boundary cases. For an example, suppose that you want to generate all strings of length n consisting of decimal digits, with the first digit nonzero. You might do something like:

import itertools

def decimal_strings(n):
    """Generate all digit strings of length n that don't start with 0."""
    for lead_digit in '123456789':
        for tail in itertools.product('0123456789', repeat=n-1):
            yield lead_digit + ''.join(tail)

What should this produce when n = 1? Well, in that case, you end up calling itertools.product with an empty product (repeat = 0). If it returned nothing, then the body of the inner for loop above would never be executed, so decimal_strings(1) would be an empty iterator; almost certainly not what you want. But since itertools.product('0123456789', repeat=0) returns a single tuple, you get the expected result:

>>> list(decimal_strings(1))
['1', '2', '3', '4', '5', '6', '7', '8', '9']

(When n = 0, of course, this function correctly raises a ValueError.)
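
One detail worth knowing, shown in the sketch below: because decimal_strings() is a generator function, the ValueError for n = 0 only surfaces once the generator is iterated, not when it is created. The function is repeated so the snippet is self-contained.

```python
import itertools

def decimal_strings(n):
    """Generate all digit strings of length n that don't start with 0."""
    for lead_digit in '123456789':
        for tail in itertools.product('0123456789', repeat=n-1):
            yield lead_digit + ''.join(tail)

# n = 2: nine leading digits times ten trailing digits.
assert len(list(decimal_strings(2))) == 90

# n = 0: creating the generator is fine; the negative repeat is
# rejected only when iteration starts.
gen = decimal_strings(0)
try:
    next(gen)
except ValueError:
    raised = True
else:
    raised = False

assert raised
```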

So in short, the definition is mathematically sound, and more often than not it's also what you want. It's definitely not a Python bug!
