Calculating user importance or "betweenness centrality" from a user's followers?

Posted on 2024-12-27 08:09:55

I want to know how I can find interesting relationships between user accounts, such as the most connected or most valuable users based on their connections to others.

Below I have the two tables I use. One has all the users, the other has the keys of the users they follow.

User
{
    id,
    name
}

Follows {
    user_id -> user.id,
    following_id -> user.id
}

What type of algorithms am I looking for?

Assuming unimportant people have few or no followers, how can I find the people in the center of the graph? I would assume they are important because they have important people following them.

Update

As David and Steve point out, how close given nodes are, which nodes form sub-communities, and which users are the most connected are all examples of useful data that can be pulled from this schema.

Since this "follower" design is used by many sites now, I've started a bounty in the hopes of getting some solid SQL or programming language implementations that might be useful to a wide variety of people.

It's worth noting that while the results of some algorithms are simply fascinating, others (such as finding related nodes) would be valuable to the users of our sites, since we could use them to recommend things.

Comments (1)

二智少女 2025-01-03 08:09:55

If you only concentrate on the links, try these popular centrality measures (assume G is the graph):

  1. Degree: The degree centrality of node i is k_i/(N-1), where k_i is the number of links attached to node i and N is the total number of nodes. A higher degree suggests a more important node.
  2. Closeness: The closeness of node i is (N-1)/(Σ_(j∈G) d_ij), where d_ij is the shortest-path distance between node i and node j. It measures how near a node is to all other nodes in the social network.
  3. Betweenness: The betweenness of node i is (Σ_(j<k∈G) n_jk(i)/n_jk) / ((N-1)(N-2)), where n_jk is the number of shortest paths between nodes j and k, and n_jk(i) is the number of those paths that run through node i. A higher betweenness means node i may be a good hub, because many of the shortest connections between other pairs of nodes pass through it.
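
As a rough illustration of the three measures above, here is a minimal Python sketch. It assumes the rows of the Follows table have already been loaded as (user_id, following_id) pairs, that a follow is treated as an undirected link, and that the graph is connected with at least three users. The function names are made up for this example, and the betweenness part uses Brandes' accumulation scheme, which is one possible way to count shortest paths and is not something the formulas above prescribe.

```python
from collections import deque

def build_graph(follows):
    """Build an undirected adjacency map from (user_id, following_id) rows.
    Assumption: a follow in either direction counts as one link."""
    adj = {}
    for user_id, following_id in follows:
        adj.setdefault(user_id, set()).add(following_id)
        adj.setdefault(following_id, set()).add(user_id)
    return adj

def degree_centrality(adj):
    """k_i / (N-1) for every node i."""
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}

def closeness_centrality(adj):
    """(N-1) / sum_j d_ij, with d_ij found by breadth-first search."""
    n = len(adj)
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        # Assumption: nodes that cannot reach everyone get 0.0.
        result[s] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """(sum_{j<k} n_jk(i) / n_jk) / ((N-1)(N-2)), via Brandes' accumulation."""
    n = len(adj)
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of increasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair (j, k) was counted from both endpoints, so halve it.
    return {v: score[v] / 2 / ((n - 1) * (n - 2)) for v in adj}

# Example: users 2, 3 and 4 all follow user 1.
adj = build_graph([(2, 1), (3, 1), (4, 1)])
print(degree_centrality(adj)[1])       # 1.0
print(betweenness_centrality(adj)[1])  # 0.5
```

In this small star-shaped example, user 1 scores highest on all three measures. On a real follower table the directed variants (counting only followers, for instance) may match the notion of "important" better, and for large graphs a dedicated graph library is likely a more practical choice than this sketch.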

The measures above can be computed from the link information alone, and you can use one of them, or combine several, to find the important node(s) in the social network. Depending on how you define "important", though, you may need different measures.
