How can I count the common neighbors of two users and compute their similarity?
The monthly_connections table contains the columns calling_party, called_party, common_neighbors, and neighborhood_overlap.
So the table describes which users are connected. One measure of user similarity is neighborhood overlap, which is defined as follows:
neighborhood_overlap = (number of nodes that are neighbors of both calling_party and called_party) / (number of nodes that are neighbors of at least one of calling_party or called_party)
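As a small illustration of the definition above, here is a sketch in Python using sets. The user IDs and the helper name `neighborhood_overlap` are made up for the example; the function simply divides the size of the intersection by the size of the union.

```python
def neighborhood_overlap(neighbors_a, neighbors_b):
    """Return |A ∩ B| / |A ∪ B| for two neighbor sets (0.0 if both are empty)."""
    common = neighbors_a & neighbors_b   # neighbors of both users
    either = neighbors_a | neighbors_b   # neighbors of at least one user
    return len(common) / len(either) if either else 0.0


# Hypothetical neighbor sets for two users.
n_x = {"u1", "u2", "u3"}
n_y = {"u2", "u3", "u4", "u5"}

overlap = neighborhood_overlap(n_x, n_y)
print(overlap)  # 2 shared neighbors out of 5 distinct neighbors
```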
To calculate the number of common neighbors for two users, I wrote the following query:
SELECT COUNT(*)
FROM (
    SELECT t1.neighborA
    FROM (
        SELECT called_party AS neighborA FROM monthly_connections
        WHERE calling_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1'
        UNION
        SELECT calling_party AS neighborA FROM monthly_connections
        WHERE called_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1'
    ) t1
    INNER JOIN (
        SELECT called_party AS neighborB FROM monthly_connections
        WHERE calling_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
        UNION
        SELECT calling_party AS neighborB FROM monthly_connections
        WHERE called_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
    ) t2 ON t1.neighborA = t2.neighborB
) t3
The query above counts the common neighbors of users 10D149A4356E1AA3A8AF604BD992BBA141DB53D2 and 9F7334BCF9000CD68D40302DC4801E60C027A7D1.
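To check a query of this shape on a small data set, it can be run against an in-memory SQLite database from Python. This is only a sketch: the single-letter user IDs, the column types, and the helper name `count_common_neighbors` are assumptions made for the example, and the two hard-coded IDs are replaced by named parameters.

```python
import sqlite3

# Toy in-memory table; the user IDs and column types are made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_connections ("
    "calling_party TEXT, called_party TEXT, "
    "common_neighbors INTEGER, neighborhood_overlap REAL)"
)
conn.executemany(
    "INSERT INTO monthly_connections (calling_party, called_party) VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"), ("B", "E")],
)

# Same structure as the two-user query, with the IDs as named parameters.
COMMON_NEIGHBORS_SQL = """
SELECT COUNT(*)
FROM (
    SELECT called_party AS neighborA FROM monthly_connections
    WHERE calling_party = :user_a
    UNION
    SELECT calling_party AS neighborA FROM monthly_connections
    WHERE called_party = :user_a
) t1
INNER JOIN (
    SELECT called_party AS neighborB FROM monthly_connections
    WHERE calling_party = :user_b
    UNION
    SELECT calling_party AS neighborB FROM monthly_connections
    WHERE called_party = :user_b
) t2 ON t1.neighborA = t2.neighborB
"""


def count_common_neighbors(conn, user_a, user_b):
    """Count users connected (in either direction) to both user_a and user_b."""
    params = {"user_a": user_a, "user_b": user_b}
    return conn.execute(COMMON_NEIGHBORS_SQL, params).fetchone()[0]


print(count_common_neighbors(conn, "A", "B"))  # C is the only shared neighbor
```

Because each side uses UNION, a neighbor that appears as both caller and callee is counted once.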
The goal is to write a query that sets the common_neighbors and neighborhood_overlap columns for each connected pair in the table.
Does anyone know how to write the query to update the columns common_neighbors and neighborhood_overlap?
For common_neighbors I started writing the following query, but it is not correct:
UPDATE mc SET
    common_neighbors = (
        SELECT COUNT(*) FROM (
            SELECT t1.neighborA
            FROM (
                SELECT called_party AS neighborA FROM monthly_connections
                WHERE calling_party = mc.calling_party
                UNION
                SELECT calling_party AS neighborA FROM monthly_connections
                WHERE called_party = mc.calling_party
            ) t1
            INNER JOIN (
                SELECT called_party AS neighborB FROM monthly_connections
                WHERE calling_party = mc.called_party
                UNION
                SELECT calling_party AS neighborB FROM monthly_connections
                WHERE called_party = mc.called_party
            ) t2 ON t1.neighborA = t2.neighborB
        ) t3
    )
FROM monthly_connections mc
INNER JOIN t3 ON t3.calling_party = mc.calling_party AND t3.called_party = mc.called_party
Comments (1)
I think this query works (though might not be that performant).
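For reference, one way to fill both columns can be sketched as follows. This is not the commenter's query, only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module, uses made-up single-letter user IDs, and introduces a hypothetical temporary table named `neighbors`. The question's UPDATE ... FROM syntax suggests a different dialect, so the statements may need adapting. The denominator uses the identity |N(a) ∪ N(b)| = |N(a)| + |N(b)| - |N(a) ∩ N(b)|.

```python
import sqlite3

# Toy in-memory table; user IDs and column types are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_connections ("
    "calling_party TEXT, called_party TEXT, "
    "common_neighbors INTEGER, neighborhood_overlap REAL)"
)
conn.executemany(
    "INSERT INTO monthly_connections (calling_party, called_party) VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"), ("B", "E")],
)

conn.executescript("""
-- Symmetric, de-duplicated neighbor list: one row per (node, neighbor).
-- 'neighbors' is a hypothetical helper table, not part of the original schema.
CREATE TEMP TABLE neighbors AS
SELECT calling_party AS node, called_party AS neighbor FROM monthly_connections
UNION
SELECT called_party, calling_party FROM monthly_connections;

-- Common neighbors: self-join the neighbor list on the shared neighbor.
UPDATE monthly_connections
SET common_neighbors = (
    SELECT COUNT(*)
    FROM neighbors n1
    INNER JOIN neighbors n2 ON n1.neighbor = n2.neighbor
    WHERE n1.node = monthly_connections.calling_party
      AND n2.node = monthly_connections.called_party
);

-- Union size = degree(calling) + degree(called) - common neighbors.
UPDATE monthly_connections
SET neighborhood_overlap = CAST(common_neighbors AS REAL) / (
    (SELECT COUNT(*) FROM neighbors WHERE node = monthly_connections.calling_party)
  + (SELECT COUNT(*) FROM neighbors WHERE node = monthly_connections.called_party)
  - common_neighbors
);
""")

rows = conn.execute(
    "SELECT calling_party, called_party, common_neighbors, neighborhood_overlap "
    "FROM monthly_connections ORDER BY calling_party, called_party"
).fetchall()
for row in rows:
    print(row)
```

Following the definition as stated in the question, the two parties of a row count toward the denominator, since each is a neighbor of the other. Some definitions of neighborhood overlap exclude the two endpoints; in that case the denominator would be 2 smaller and would need a guard against division by zero.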