How do I split a list into evenly sized chunks in Python?

Posted on 2025-01-16 23:29:39

I am trying to add users to my group using Pyrogram. I have 200 user IDs in a Python list:

list_of_users = [user_id1, user_id2, user_id3, user_id4, ...]

I also have a list of 7 clients. What I want to do is distribute the list of user IDs among the 7 clients (approximately equally) and add them. The number of users sometimes does not divide evenly, so how do I distribute the list and add the users accordingly using Python?

By the way: it's okay if 2-3 users are not perfectly distributed. I only want an approximately even distribution, but no user should be missed.

I tried this function:

def divide_chunks(l, n):
    for i in range(0, len(l), n): 
        yield l[i:i + n]

But it doesn't distribute evenly: it produces chunks of a fixed size and puts whatever remains into the last chunk, which is not what I want.
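To make the problem concrete, here is a small sketch of how the function above behaves for 200 items and 7 clients. The integer stand-ins for user IDs and the two chunk sizes tried (rounding down and rounding up) are assumptions chosen for illustration only.

```python
def divide_chunks(l, n):
    """Yield successive slices of l, each holding at most n items."""
    for i in range(0, len(l), n):
        yield l[i:i + n]


users = list(range(200))  # stand-in for 200 user ids (assumption)
clients = 7

# Rounding the chunk size down (200 // 7 = 28) produces 8 chunks for 7 clients.
sizes_floor = [len(c) for c in divide_chunks(users, len(users) // clients)]

# Rounding the chunk size up (29) produces 7 chunks, but the last one is short.
sizes_ceil = [len(c) for c in divide_chunks(users, -(-len(users) // clients))]

print(sizes_floor)  # [28, 28, 28, 28, 28, 28, 28, 4]
print(sizes_ceil)   # [29, 29, 29, 29, 29, 29, 26]
```

Either way the leftover ends up concentrated in one chunk instead of being spread across the clients.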

In short: I want the chunk sizes to be decided automatically so that the user IDs are distributed evenly.

Most answers on Stack Overflow require me to decide the number of chunks, which I don't want. All I want to do is distribute x items into y equal parts.

Comments (2)

给妤﹃绝世温柔 2025-01-23 23:29:39

You can use:

import numpy as np

np.array_split(list_of_users, NUMBER_OF_CLIENTS)

More in the NumPy documentation for array_split().
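For readers without NumPy at hand, the sizing rule that the array_split() documentation describes (for length l and n sections, l % n parts of size l // n + 1 and the rest of size l // n) can be sketched in plain Python. The helper name split_like_array_split and the sample IDs are hypothetical, and this is an illustration of the rule rather than NumPy's implementation.

```python
def split_like_array_split(items, n_parts):
    """Split items into n_parts contiguous lists whose lengths differ by at most 1."""
    base, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # the first `extra` parts take one additional item
        size = base + 1 if i < extra else base
        parts.append(items[start:start + size])
        start += size
    return parts


users = [f"user_id{i}" for i in range(1, 27)]  # 26 sample ids (assumption)
parts = split_like_array_split(users, 7)
print([len(p) for p in parts])  # [4, 4, 4, 4, 4, 3, 3]
```

Note that np.array_split() itself returns NumPy arrays, so a call such as .tolist() on each part may be needed if plain lists are expected downstream.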

↙厌世 2025-01-23 23:29:39

DIY: Without external libraries

Here is one approach without external libraries. This implementation assigns an equal number of users to each client if possible. If not, it makes sure that the numbers of users assigned to any two clients differ by at most 1 (my definition of fair). Additionally, if you run it multiple times, the extra users are not always assigned to the same clients: it randomly chooses the set of clients that each take on one of the remaining users (those that could not be split in equal parts). This keeps the allocation of users to clients fair.

I'm posting a bit more code than usual, so here is a high-level explanation:

The relevant function is called assign_users_to_clients(). It does the job you intend to do. The two other functions, verify_all_users_assigned() and print_mapping(), are just utility functions for the sake of this demo. The first makes sure the assignment is correct, i.e. every user is assigned to exactly one client (no duplicate assignments, no unassigned users). The second prints the result a bit more nicely so you can verify that the distribution of users to clients is actually fair.

import random


def verify_all_users_assigned(users, client_user_dict):
    """
    Verify that every user has been assigned to exactly one client.
    Not necessary for the algorithm but used to check whether the implementation is correct.
    :param users: list of all users that have to be assigned
    :param client_user_dict: assignment of users to clients
    :return: None (the result is printed)
    """
    users_assigned_to_clients = set()
    duplicate_users = list()

    for users_for_client in client_user_dict.values():
        client_set = set(users_for_client)
        # if there is an intersection those users have been assigned twice (at least)
        inter = users_assigned_to_clients.intersection(client_set)
        if len(inter) != 0:
            duplicate_users.extend(list(inter))
        # now make union of users to know which users have already been seen
        users_assigned_to_clients = users_assigned_to_clients.union(client_set)
    # users that should have been assigned but do not appear in the mapping
    missing_users = set(users).difference(users_assigned_to_clients)
    if len(missing_users) != 0:
        print(f"Not all users have been assigned to clients. Missing are {missing_users}")
        return
    if len(duplicate_users) != 0:
        print(f"Some users have been assigned at least twice. Those are {duplicate_users}")
        return
    print("All users have successfully been assigned to clients.")


def assign_users_to_clients(users, clients):
    """
    Assign users to clients.
    :param users: list of users
    :param clients: list of clients
    :return: dictionary with mapping from clients to users
    """
    users_per_client = len(users) // len(clients)
    remaining_users = len(users) % len(clients)
    if remaining_users != 0:
        print(
            f"An equal split is not possible! {remaining_users} users would remain when each client takes on {users_per_client} users. Assigning remaining users to random clients.")

    # assign each client its fair share of users: client i gets one contiguous slice
    # (slicing also works when there are fewer users than clients, i.e. users_per_client == 0)
    client_user_registry = {
        client: list(users[i * users_per_client:(i + 1) * users_per_client])
        for i, client in enumerate(clients)
    }
    # now we need to take care of the remaining users
    # we could just go from back to front and assign one more user to each client but to make it fair, choose randomly without repetition
    start = users_per_client * len(clients)
    for i, client in enumerate(random.sample(clients, k=remaining_users)):
        client_user_registry[client].append(users[start + i])
    return client_user_registry


def print_mapping(mapping):
    print("""
+-------------------------
| Mapping: User -> Client
+-------------------------""")
    for client, users in mapping.items():
        # map(str, ...) keeps this working when the user ids are integers rather than strings
        print(f" - Client: {client}\t =>\t Users ({len(users)}): {', '.join(map(str, users))}")


# users that need to be assigned
list_of_users = ["user_id1", "user_id2", "user_id3", "user_id4", "user_id5", "user_id6", "user_id7", "user_id8",
                 "user_id9", "user_id10", "user_id11",
                 "user_id12", "user_id13", "user_id14", "user_id15", "user_id16", "user_id17", "user_id18",
                 "user_id19",
                 "user_id20", "user_id21", "user_id22", "user_id23", "user_id24", "user_id25", "user_id26"]
# clients to assign users to
list_of_clients = ["client_1", "client_2", "client_3", "client_4", "client_5", "client_6", "client_7"]

# do assignment of users to clients
client_user_assignment = assign_users_to_clients(list_of_users, list_of_clients)

# verify that the algorithm works (just for demo purposes)
verify_all_users_assigned(list_of_users, client_user_assignment)

# print assignment
print_mapping(client_user_assignment)

Expected output

An equal split is not possible! 5 users would remain when each client takes on 3 users. Assigning remaining users to random clients.
All users have successfully been assigned to clients.

+-------------------------
| Mapping: User -> Client
+-------------------------
 - Client: client_1  =>  Users (4): user_id1, user_id2, user_id3, user_id23
 - Client: client_2  =>  Users (4): user_id4, user_id5, user_id6, user_id26
 - Client: client_3  =>  Users (3): user_id7, user_id8, user_id9
 - Client: client_4  =>  Users (3): user_id10, user_id11, user_id12
 - Client: client_5  =>  Users (4): user_id13, user_id14, user_id15, user_id24
 - Client: client_6  =>  Users (4): user_id16, user_id17, user_id18, user_id25
 - Client: client_7  =>  Users (4): user_id19, user_id20, user_id21, user_id22

Please note: random.sample() randomly chooses the clients that take on one more user, so your result might differ, but it will always be fair (see the definition of fair above).
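The fairness claim can be checked over many random draws. The following is a compact, self-contained sketch of the same idea (contiguous base shares plus randomly placed leftovers). The function name assign_fairly, the rng parameter, and the seeds are assumptions made for this illustration and are not part of the code above.

```python
import random


def assign_fairly(users, clients, rng=random):
    """Give every client len(users) // len(clients) users, then hand the
    leftovers to distinct, randomly chosen clients (one extra user each)."""
    base, extra = divmod(len(users), len(clients))
    mapping = {c: list(users[i * base:(i + 1) * base]) for i, c in enumerate(clients)}
    leftovers = users[base * len(clients):]
    for client, user in zip(rng.sample(clients, k=extra), leftovers):
        mapping[client].append(user)
    return mapping


users = [f"user_id{i}" for i in range(1, 27)]
clients = [f"client_{i}" for i in range(1, 8)]

for seed in range(100):
    mapping = assign_fairly(users, clients, random.Random(seed))
    sizes = [len(v) for v in mapping.values()]
    # fair: no client has more than one user above any other client
    assert max(sizes) - min(sizes) <= 1
    # complete: every user appears exactly once
    assert sorted(u for v in mapping.values() for u in v) == sorted(users)
```

Passing a seeded random.Random instance only makes the runs reproducible. Which clients receive the extra users still varies from seed to seed.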

With external libraries

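When using external libraries there are many options. See e.g. the functions pandas.cut() or numpy.split(). They behave differently when a fair distribution of users to clients is not possible (numpy.split(), for example, raises an error when asked for a number of sections that does not divide the list evenly, whereas numpy.array_split() allows uneven parts), so you should read up on that in the documentation.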
