Slicing a list into n nearly-equal-length partitions

Posted on 2024-08-29 10:17:27

I'm looking for a fast, clean, pythonic way to divide a list into exactly n nearly-equal partitions.

partition([1,2,3,4,5],5)->[[1],[2],[3],[4],[5]]
partition([1,2,3,4,5],2)->[[1,2],[3,4,5]] (or [[1,2,3],[4,5]])
partition([1,2,3,4,5],3)->[[1,2],[3,4],[5]] (there are other ways to slice this one too)

There are several answers in "Iteration over list slices" that come very close to what I want, except that they focus on the size of each sublist, whereas I care about the number of sublists (some of them also pad with None). These are trivially converted, obviously, but I'm looking for a best practice.

Similarly, people have pointed out great solutions in "How do you split a list into evenly sized chunks?" for a very similar problem, but I'm more interested in the number of partitions than in their specific size, as long as the sizes are within 1 of each other. Again, this is trivially convertible, but I'm looking for a best practice.
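
To make the requirement concrete, here is a minimal sketch of a checker for "exactly n partitions whose sizes are within 1 of each other". The name is_balanced_partition is hypothetical and not part of the question; it assumes n >= 1 and does not require the parts to be contiguous slices.

```python
def is_balanced_partition(parts, original, n):
    """Return True if parts is a split of original into exactly n pieces
    whose lengths differ by at most 1 (hypothetical helper, assumes n >= 1)."""
    if n < 1 or len(parts) != n:
        return False
    sizes = [len(p) for p in parts]
    # Every item must appear exactly as often as in the original; order is ignored.
    same_items = sorted(x for p in parts for x in p) == sorted(original)
    return same_items and max(sizes) - min(sizes) <= 1


print(is_balanced_partition([[1, 2], [3, 4], [5]], [1, 2, 3, 4, 5], 3))  # True
print(is_balanced_partition([[1, 2, 3, 4], [5]], [1, 2, 3, 4, 5], 2))    # False
```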

我不咬妳我踢妳 2024-09-05 10:17:27

Just a different take, which only works if [[1,3,5],[2,4]] is an acceptable partition in your example.

def partition ( lst, n ):
    return [ lst[i::n] for i in xrange(n) ]

This handles the case from @Daniel Stutzbach's example:

partition(range(105),10)
# [[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
# [1, 11, 21, 31, 41, 51, 61, 71, 81, 91, 101],
# [2, 12, 22, 32, 42, 52, 62, 72, 82, 92, 102],
# [3, 13, 23, 33, 43, 53, 63, 73, 83, 93, 103],
# [4, 14, 24, 34, 44, 54, 64, 74, 84, 94, 104],
# [5, 15, 25, 35, 45, 55, 65, 75, 85, 95],
# [6, 16, 26, 36, 46, 56, 66, 76, 86, 96],
# [7, 17, 27, 37, 47, 57, 67, 77, 87, 97],
# [8, 18, 28, 38, 48, 58, 68, 78, 88, 98],
# [9, 19, 29, 39, 49, 59, 69, 79, 89, 99]]
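
The code above is Python 2 (xrange, and range returning a list). A possible Python 3 equivalent is sketched below; the list(...) call is an addition of this sketch, made so that a range or other iterable argument still yields plain lists.

```python
def partition(lst, n):
    # Materialise the input so every part is a list, even for range objects.
    items = list(lst)
    # Part i takes every n-th item starting at index i (stride slicing).
    return [items[i::n] for i in range(n)]


print(partition([1, 2, 3, 4, 5], 2))
# [[1, 3, 5], [2, 4]]
print([len(p) for p in partition(range(105), 10)])
# [11, 11, 11, 11, 11, 10, 10, 10, 10, 10]
```

Note that the parts are interleaved rather than contiguous, which is the trade-off this answer already points out.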

寻找我们的幸福 2024-09-05 10:17:27

Here's a version that's similar to Daniel's: it divides as evenly as possible, but puts all the larger partitions at the start:

def partition(lst, n):
    q, r = divmod(len(lst), n)
    indices = [q*i + min(i, r) for i in xrange(n+1)]
    return [lst[indices[i]:indices[i+1]] for i in xrange(n)]

It also avoids the use of float arithmetic, since that always makes me uncomfortable. :)

Edit: an example, just to show the contrast with Daniel Stutzbach's solution

>>> print [len(x) for x in partition(range(105), 10)]
[11, 11, 11, 11, 11, 10, 10, 10, 10, 10]
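
For Python 3, the same divmod idea could be written with a running start index instead of a precomputed index list. This is only a sketch of the approach described above, not the answerer's own code.

```python
def partition(lst, n):
    # q is the base size of every part; the first r parts get one extra item.
    q, r = divmod(len(lst), n)
    parts, start = [], 0
    for i in range(n):
        end = start + q + (1 if i < r else 0)
        parts.append(lst[start:end])
        start = end
    return parts


print(partition([1, 2, 3, 4, 5], 3))
# [[1, 2], [3, 4], [5]]
print([len(p) for p in partition(list(range(105)), 10)])
# [11, 11, 11, 11, 11, 10, 10, 10, 10, 10]
```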

表情可笑 2024-09-05 10:17:27

def partition(lst, n):
    division = len(lst) / float(n)
    return [ lst[int(round(division * i)): int(round(division * (i + 1)))] for i in xrange(n) ]

>>> partition([1,2,3,4,5],5)
[[1], [2], [3], [4], [5]]
>>> partition([1,2,3,4,5],2)
[[1, 2, 3], [4, 5]]
>>> partition([1,2,3,4,5],3)
[[1, 2], [3], [4, 5]]
>>> partition(range(105), 10)
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52], [53, 54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73], [74, 75, 76, 77, 78, 79, 80, 81, 82, 83], [84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94], [95, 96, 97, 98, 99, 100, 101, 102, 103, 104]]

Python 3 version:

def partition(lst, n):
    division = len(lst) / n
    return [lst[round(division * i):round(division * (i + 1))] for i in range(n)]
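
One thing worth knowing when comparing the two versions: Python 3's round() rounds exact halves to the nearest even integer, while Python 2's rounds them away from zero, so the boundaries can land differently. The sketch below simply reruns the Python 3 version to show this; the sizes still stay within 1 of each other.

```python
def partition(lst, n):
    division = len(lst) / n
    return [lst[round(division * i):round(division * (i + 1))] for i in range(n)]


print(partition([1, 2, 3, 4, 5], 2))
# [[1, 2], [3, 4, 5]]  (round(2.5) is 2 in Python 3, but 3.0 in Python 2)
print([len(p) for p in partition(list(range(105)), 10)])
# [10, 11, 11, 10, 10, 11, 11, 10, 10, 11]
```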

挖个坑埋了你 2024-09-05 10:17:27

Below is one way.

def partition(lst, n):
    increment = len(lst) / float(n)
    last = 0
    i = 1
    results = []
    while last < len(lst):
        idx = int(round(increment * i))
        results.append(lst[last:idx])
        last = idx
        i += 1
    return results

If len(lst) cannot be evenly divided by n, this version will distribute the extra items at roughly equal intervals. For example:

>>> print [len(x) for x in partition(range(105), 10)]
[11, 10, 11, 10, 11, 10, 11, 10, 11, 10]

The code could be simpler if you don't mind all of the 11s being at the beginning or the end.
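
If the goal is to keep the "extras at roughly equal intervals" behaviour in Python 3 without floats, one option is to compute each boundary as len(lst) * i / n rounded half up, using integer arithmetic only. This is a sketch of an alternative, not the code above; unlike the while loop, it always returns exactly n parts, including empty ones when n exceeds len(lst).

```python
def partition(lst, n):
    size = len(lst)
    # Boundary i is size * i / n rounded half up, computed with integers only.
    bounds = [(size * i + n // 2) // n for i in range(n + 1)]
    return [lst[bounds[i]:bounds[i + 1]] for i in range(n)]


print([len(p) for p in partition(list(range(105)), 10)])
# [11, 10, 11, 10, 11, 10, 11, 10, 11, 10]
print(partition([1, 2, 3, 4, 5], 3))
# [[1, 2], [3], [4, 5]]
```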

寒尘 2024-09-05 10:17:27

This answer provides a function split(list_, n, max_ratio), for people
who want to split their list into n pieces whose lengths differ by at
most a factor of max_ratio. It allows for more variation than the
questioner's 'at most 1 difference in piece length'.

It works by sampling n piece lengths within the desired ratio range
[1, max_ratio), placing them one after another to form a 'broken
stick' with the right relative distances between the 'break points' but
the wrong total length. Scaling the broken stick to the desired length
gives us the approximate positions of the break points we want. Getting
integer break points requires subsequent rounding.

Unfortunately, the roundings can conspire to make pieces just too short,
and let you exceed the max_ratio. See the bottom of this answer for an
example.

import random

def splitting_points(length, n, max_ratio):
    """n+1 slice points [0, ..., length] for n random-sized slices.

    max_ratio is the largest allowable ratio between the largest and the
    smallest part.
    """
    ratios = [random.uniform(1, max_ratio) for _ in range(n)]
    normalized_ratios = [r / sum(ratios) for r in ratios]
    cumulative_ratios = [
        sum(normalized_ratios[0:i])
        for i in range(n+1)
    ]
    scaled_distances = [
        int(round(r * length))
        for r in cumulative_ratios
    ]

    return scaled_distances


def split(list_, n, max_ratio):
    """Slice a list into n randomly-sized parts.

    max_ratio is the largest allowable ratio between the largest and the
    smallest part.
    """

    points = splitting_points(len(list_), n, max_ratio)

    return [
        list_[ points[i] : points[i+1] ]
        for i in range(n)
    ]

You can try it out like so:

for _ in range(10):
    parts = split('abcdefghijklmnopqrstuvwxyz', 4, 2)
    print([(len(part), part) for part in parts])

Example of a bad result:

parts = split('abcdefghijklmnopqrstuvwxyz', 10, 2)

# lengths range from 1 to 4, not 2 to 4
[(3, 'abc'),  (3, 'def'), (1, 'g'),
 (4, 'hijk'), (3, 'lmn'), (2, 'op'),
 (2, 'qr'),  (3, 'stu'),  (2, 'vw'),
 (3, 'xyz')]
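
One possible way around the rounding problem, assuming it is acceptable to retry, is to reject any split whose rounded piece lengths violate max_ratio and sample again. The sketch below is a variation on the code above, not part of the original answer: split_checked, max_tries and rng are hypothetical names, and the cap on retries is there because some inputs (for example n larger than the list length) can never satisfy the constraint.

```python
import random
from itertools import accumulate


def splitting_points(length, n, max_ratio, rng=random):
    """n+1 slice points [0, ..., length] for n random-sized slices."""
    weights = [rng.uniform(1, max_ratio) for _ in range(n)]
    total = sum(weights)
    # Cumulative weights scaled to the target length, then rounded.
    points = [0] + [int(round(c / total * length)) for c in accumulate(weights)]
    points[-1] = length  # guard against float drift at the final point
    return points


def split_checked(seq, n, max_ratio, max_tries=1000, rng=random):
    """Like split(), but retries until the rounded sizes respect max_ratio."""
    for _ in range(max_tries):
        points = splitting_points(len(seq), n, max_ratio, rng)
        sizes = [b - a for a, b in zip(points, points[1:])]
        if min(sizes) > 0 and max(sizes) <= max_ratio * min(sizes):
            return [seq[a:b] for a, b in zip(points, points[1:])]
    raise ValueError("no split within max_ratio found in max_tries attempts")


parts = split_checked('abcdefghijklmnopqrstuvwxyz', 10, 2)
print([(len(part), part) for part in parts])
```

Rejection sampling changes the distribution of piece lengths slightly, since splits that round badly are discarded; whether that matters depends on the use case.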
