按类别创建项目列表的受限排列

发布于 2024-09-18 10:22:34 字数 1467 浏览 6 评论 0原文

我正在尝试创建项目列表的一些受限排列。每个项目都有一个类别,我需要找到项目的组合,以便每个组合不包含同一类别的多个项目。为了说明这一点,这里有一些示例数据:

   Name      | Category
   ==========|==========
1. Orange    | fruit
2. Apple     | fruit
3. GI-Joe    | toy
4. VCR       | electronics
5. Racquet   | sporting goods

组合将限制为长度三,我不需要每种长度的每种组合。因此,上述列表的一组组合可能是:

(Orange, GI-Joe, VCR)
(Orange, GI-Joe, Racquet)
(Orange, VCR, Racquet)
(Apple,  GI-Joe, VCR)
(Apple,  GI-Joe, Racquet)
... and so on.

我经常在各种列表上这样做。这些列表的长度永远不会超过 40 个项目,但可以理解的是,这可能会创建数千种组合(尽管每个列表可能有大约 10 个独特的类别,这在一定程度上限制了它)

我想出了一些伪 python 来解释如何我会递归地实现它。我已经很久没有学习组合学了,但据我记得,这本质上是集合组合的子集,类似于 C(列表长度,所需大小)。可能有一些库模块可以使这个更干净(或者至少性能更高),

我想知道是否有比我所拥有的更好的方法(也许以某种方式使用 itertools.combinations ) :

# For the sake of this problem, let's assume the items are hashable so they
# can be added to a set.

def combinate(items, size=3):
    assert size >=2, "You jerk, don't try it."
    def _combinate(index, candidate):
        if len(candidate) == size:
            results.add(candidate)
            return
        candidate_cats = set(x.category for x in candidate)
        for i in range(index, len(items)):
            item = items[i]
            if item.category not in candidate_cats:
                _combinate(i, candidate + (item, ))

    results = set()
    for i, item in enumerate(items[:(1-size)]):
        _combinate(i, (item, ))

    return results

I am trying to create a number of restricted permutations of a list of items. Each item has a category, and I need to find combinations of items such that each combination does not have multiple items from the same category. To illustrate, here's some sample data:

   Name      | Category
   ==========|==========
1. Orange    | fruit
2. Apple     | fruit
3. GI-Joe    | toy
4. VCR       | electronics
5. Racquet   | sporting goods

The combinations would be restricted to length three, I don't need every combination of every length. So a set of combinations for the above list could be:

(Orange, GI-Joe, VCR)
(Orange, GI-Joe, Racquet)
(Orange, VCR, Racquet)
(Apple,  GI-Joe, VCR)
(Apple,  GI-Joe, Racquet)
... and so on.

I do this fairly often, on various lists. The lists will never be more than 40 items in length, but understandably that could create thousands of combinations (though there will likely be around 10 unique categories per each list, restricting it somewhat)

I've come up with some pseudo-python for how I would implement it recursively. It's been too long since I took combinatorics, but from what I recall this is essentially a subset of the combinations of the set, something like C(list length, desired size). There's likely some library modules which can make this cleaner (or at least more performant)

I was wondering if perhaps there was a better approach than what I've got (perhaps one which uses itertools.combinations somehow):

# For the sake of this problem, let's assume the items are hashable so they
# can be added to a set.

def combinate(items, size=3):
    assert size >=2, "You jerk, don't try it."
    def _combinate(index, candidate):
        if len(candidate) == size:
            results.add(candidate)
            return
        candidate_cats = set(x.category for x in candidate)
        for i in range(index, len(items)):
            item = items[i]
            if item.category not in candidate_cats:
                _combinate(i, candidate + (item, ))

    results = set()
    for i, item in enumerate(items[:(1-size)]):
        _combinate(i, (item, ))

    return results

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

情泪▽动烟 2024-09-25 10:22:34

天真的方法:

#!/usr/bin/env python

import itertools

items = {
    'fruits' : ('Orange', 'Apple'),
    'toys' : ('GI-Joe', ),
    'electronics' : ('VCR', ),
    'sporting_goods' : ('Racquet', )
}

def combinate(items, size=3):
    if size > len(items):
        raise Exception("Lower the `size` or add more products, dude!")

    for cats in itertools.combinations(items.keys(), size):
        cat_items = [[products for products in items[cat]] for cat in cats]
        for x in itertools.product(*cat_items):
            yield zip(cats, x)

if __name__ == '__main__':
    for x in combinate(items):
        print x

会产生:

# ==> 
#
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('sporting_goods', 'Racquet')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Apple')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]

Naive approach:

#!/usr/bin/env python

import itertools

items = {
    'fruits' : ('Orange', 'Apple'),
    'toys' : ('GI-Joe', ),
    'electronics' : ('VCR', ),
    'sporting_goods' : ('Racquet', )
}

def combinate(items, size=3):
    if size > len(items):
        raise Exception("Lower the `size` or add more products, dude!")

    for cats in itertools.combinations(items.keys(), size):
        cat_items = [[products for products in items[cat]] for cat in cats]
        for x in itertools.product(*cat_items):
            yield zip(cats, x)

if __name__ == '__main__':
    for x in combinate(items):
        print x

Will yield:

# ==> 
#
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('sporting_goods', 'Racquet')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Apple')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')]
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]
舂唻埖巳落 2024-09-25 10:22:34

您想要生成的是从集合中获取的元素的笛卡尔 乘积 类别

划分为多个集合相对容易:

item_set[category].append(item)

通过正确的实例化(例如) collections.defaultdict item_set[category] ,然后 itertools.product 将为您提供所需的输出。

What you seek to generate is the Cartesian product of elements taken from set of category.

The partitioning into multiple sets is relatively easy:

item_set[category].append(item)

With proper instantiation (e.g.) collections.defaultdict for item_set[category] and then itertools.product will give you the desired output.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文