Python: Partition a list of names into equally sized sublists

Posted on 2024-10-16 08:44:15

I have a list of names, e.g. ['Agrajag', 'Colin', 'Deep Thought', ... , 'Zaphod Beeblebrox', 'Zarquon']. Now I want to partition this list into approximately equally sized sublists, so that the boundaries of the subgroups fall on the first letter of the names, e.g. A-F, G-L, M-P, Q-Z, not A-Fe, Fi-Mo, Mu-Pra, Pre-Z.

I could only come up with a statically sized partition that doesn't take the size of the subgroups into account:

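import itertools
import string

def _group_by_alphabet_key(elem):
    # Map a name to a fixed five-letter bucket label such as "A - E".
    char = elem[0].upper()
    i = string.ascii_uppercase.index(char)
    if i > 19:
        # U through Z form the final, six-letter bucket.
        from_c = string.ascii_uppercase[20]
        to_c = string.ascii_uppercase[-1]
    else:
        # Floor division; a plain "/" yields a float on Python 3.
        start = i // 5 * 5
        from_c = string.ascii_uppercase[start]
        to_c = string.ascii_uppercase[start + 4]
    return "%s - %s" % (from_c, to_c)

# name_list must already be sorted, or groupby will split a bucket into several groups.
subgroups = itertools.groupby(name_list, _group_by_alphabet_key)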

Any better ideas?

P.S.: this may sound somewhat like homework, but it actually is for a webpage where members should be displayed in 5-10 tabs of equally sized groups.
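
To make the problem with fixed buckets concrete, here is a small sketch that counts how many names land in each five-letter bucket. The helper `fixed_bucket_label` and the list `sample_names` are made up for illustration; the helper only mirrors the labelling logic of the key function in the question.

```python
from collections import Counter
import string

def fixed_bucket_label(name):
    """Return the fixed bucket label for a name; U-Z is the last bucket."""
    i = string.ascii_uppercase.index(name[0].upper())
    start = min(i // 5 * 5, 20)          # buckets start at A, F, K, P, U
    end = 25 if start == 20 else start + 4
    return "%s - %s" % (string.ascii_uppercase[start], string.ascii_uppercase[end])

# Hypothetical sample data, skewed towards early letters.
sample_names = ["Aaron", "Abel", "Cain", "Daniel", "Darius", "David",
                "Ellen", "Gary", "James", "Terry", "Zulu"]

bucket_sizes = Counter(fixed_bucket_label(name) for name in sample_names)
for label in sorted(bucket_sizes):
    print(label, bucket_sizes[label])
```

With this sample, "A - E" holds 7 of the 11 names while "K - O" is empty, which is the imbalance the question wants to avoid.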


Comments (2)

鱼忆七猫命九 2024-10-23 08:44:15

Here's something that might work. I feel certain there's a simpler way though... probably involving itertools. Note that num_pages only roughly determines how many pages you'll actually get.

EDIT: Whoops! There was a bug -- it was cutting off the last group! The below should be fixed, but note that the length of the last page will be slightly unpredictable. Also, I added .upper() to account for possible lowercase names.

EDIT2: The previous method of defining letter_groups was inefficient; the below dict-based code is more scalable:

from collections import defaultdict

names = ['Agrajag', 'Colin', 'Deep Thought', 'Ford Prefect', 'Zaphod Beeblebrox', 'Zarquon']
num_pages = 3

def group_names(names, num_pages):
    # Bucket the names by their uppercased first letter.
    letter_groups = defaultdict(list)
    for name in names:
        letter_groups[name[0].upper()].append(name)
    # Order the buckets alphabetically by letter.
    letter_groups = [letter_groups[key] for key in sorted(letter_groups)]
    current_group = []
    page_groups = []
    group_size = len(names) / num_pages
    for group in letter_groups:
        current_group.extend(group)
        # Close the page once it exceeds the target size.
        if len(current_group) > group_size:
            page_groups.append(current_group)
            current_group = []
    # Keep any leftover names as a final, possibly shorter page.
    if current_group:
        page_groups.append(current_group)

    return page_groups

print(group_names(names, num_pages))
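
The function above returns lists of names, while the question asks for tab labels such as A-F. Because each page is built from consecutive, alphabetically ordered letter buckets, a label can be read off the first and last name of each page. The helper `page_labels` below is a hypothetical addition, not part of the answer's code, and the `pages` value is hard-coded to what `group_names` yields for the six sample names with `num_pages = 3`.

```python
def page_labels(page_groups):
    """Build a label such as "A-D" for each page from its first and last name."""
    labels = []
    for page in page_groups:
        first = page[0][0].upper()
        last = page[-1][0].upper()
        # A page that covers a single letter gets a one-letter label.
        labels.append(first if first == last else "%s-%s" % (first, last))
    return labels

# Pages for the six sample names and num_pages = 3.
pages = [["Agrajag", "Colin", "Deep Thought"],
         ["Ford Prefect", "Zaphod Beeblebrox", "Zarquon"]]
print(page_labels(pages))  # ['A-D', 'F-Z']
```

Note that only two pages come back although three were requested, which illustrates the remark that num_pages only roughly determines the page count.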
孤独患者 2024-10-23 08:44:15

Since your name_list has to be sorted for groupby to work, can't you just check every Nth value and build your divisions that way?

right_endpoints = name_list[N-1::N]

And using "A" as your leftmost endpoint and "Z" as your rightmost endpoint, you can construct the N divisions accordingly and they should all have the same size.

  1. So, the first left endpoint would be "A", the first right endpoint would be right_endpoints[0].
  2. The next left endpoint would be the character after right_endpoints[0], the next right endpoint would be right_endpoints[1].
  3. Etc., until you hit the Nth range and that has a set endpoint of "Z".

The issue you may run into is what if two of these right_endpoints are the same...

edit: example

>>> names = ['Aaron', 'Abel', 'Cain', 'Daniel', 'Darius', 'David', 'Ellen', 'Gary', 'James', 'Jared', 'John', 'Joseph', 'Lawrence', 'Michael', 'Nicholas', 'Terry', 'Victor', 'Zulu']
>>> right_ends, left_ends = names[2::3], names[3::3]
>>> left_ends = ['A'] + left_ends
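>>> left_ends, right_ends
(['A', 'Daniel', 'Ellen', 'Jared', 'Lawrence', 'Terry'], ['Cain', 'David', 'James', 'Joseph', 'Nicholas', 'Zulu'])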
>>> ["%s - %s" % (left, right) for left, right in zip(left_ends, right_ends)]
['A - Cain', 'Daniel - David', 'Ellen - James', 'Jared - Joseph', 'Lawrence - Nicholas', 'Terry - Zulu']
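
The example above still cuts inside a letter (Daniel - David), and the duplicate-endpoint issue remains open. One possible way around both, sketched here as an assumption rather than as this answer's method, is to count names per first letter and close a range whenever the running total is as close as it will get to the next ideal boundary. The function name `letter_ranges` and its return shape are made up for illustration.

```python
from collections import Counter

def letter_ranges(names, num_groups):
    """Split names into at most num_groups first-letter ranges of similar size.

    Returns a list of (first_letter, last_letter, size) tuples.
    """
    counts = Counter(name[0].upper() for name in names)
    letters = sorted(counts)
    target = len(names) / num_groups   # ideal number of names per range
    ranges = []
    start = 0        # index in letters where the current range begins
    running = 0      # names consumed so far, over all ranges
    group_size = 0   # names in the current range
    for i, letter in enumerate(letters):
        running += counts[letter]
        group_size += counts[letter]
        # Ideal cumulative total at the end of the current range.
        boundary = target * (len(ranges) + 1)
        if i == len(letters) - 1:
            close = True
        else:
            # Close here unless taking the next letter lands nearer the boundary.
            with_next = running + counts[letters[i + 1]]
            close = abs(running - boundary) <= abs(with_next - boundary)
        if close:
            ranges.append((letters[start], letter, group_size))
            start = i + 1
            group_size = 0
    return ranges

names = ['Aaron', 'Abel', 'Cain', 'Daniel', 'Darius', 'David', 'Ellen', 'Gary',
         'James', 'Jared', 'John', 'Joseph', 'Lawrence', 'Michael', 'Nicholas',
         'Terry', 'Victor', 'Zulu']
print(letter_ranges(names, 6))
# [('A', 'C', 3), ('D', 'D', 3), ('E', 'G', 2), ('J', 'J', 4), ('L', 'N', 3), ('T', 'Z', 3)]
```

Because the boundaries are chosen between letters, two identical endpoints cannot occur; the price is that range sizes are only approximately equal, and a very popular letter can leave fewer ranges than requested.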