Building a network graph of Twitter users by topic

Posted 2024-12-18 14:08:08

I'm trying to construct a social network graph of Twitter users who have mentioned a particular topic. My strategy is roughly as follows:

  1. Query Twitter for a topic. Collect the first 100 tweets that come up and add those users to the graph.
  2. For each user:
     a. Retrieve friends and followers.
     b. Query each friend/follower for the topic. If they turn up a result (meaning they've discussed the topic), add them to the graph.
  3. For each user that was added to the graph, return to step 2 until the desired search depth is reached.
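
The strategy above amounts to a depth-limited breadth-first search. A minimal sketch follows, assuming two hypothetical interfaces, `Neighbors` (friends plus followers of a user) and `TopicCheck` (whether a user has tweeted about the topic), that would wrap the real API calls; neither is part of Twitter4J. Tracking which users have already been checked means each user costs at most one topic query, which matters when calls are scarce.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TopicCrawler {
    /** Hypothetical wrapper: friends and followers of a user. */
    interface Neighbors { List<String> of(String user); }

    /** Hypothetical wrapper: has this user tweeted about the topic? */
    interface TopicCheck { boolean mentions(String user); }

    /**
     * Depth-limited BFS from the seed users. Returns an adjacency map
     * (user -> connected on-topic users) containing only on-topic users.
     */
    static Map<String, Set<String>> crawl(List<String> seeds, int maxDepth,
                                          Neighbors neighbors, TopicCheck check) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        Set<String> checked = new HashSet<>(seeds);      // users already queried
        Deque<String> frontier = new ArrayDeque<>(seeds);
        for (String s : seeds) graph.put(s, new LinkedHashSet<>());

        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String user : frontier) {
                for (String other : neighbors.of(user)) {
                    if (graph.containsKey(other)) {      // known on-topic user
                        graph.get(user).add(other);
                        continue;
                    }
                    if (!checked.add(other)) continue;   // known off-topic user
                    if (check.mentions(other)) {         // one topic query per new user
                        graph.put(other, new LinkedHashSet<>());
                        graph.get(user).add(other);
                        next.add(other);
                    }
                }
            }
            frontier = next;                             // expand one level deeper
        }
        return graph;
    }
}
```

With in-memory stand-ins for the two interfaces, the traversal can be exercised without touching the network before it is wired to real API calls.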

My problem is twofold. First, this approach quickly exceeds my search API rate limit. Even at a search depth of 2, I'm quite likely to find people with 100+ friends/followers, and I can't query them all before hitting the rate limit.

Second, all of this takes quite a while; the Twitter API is not fast. If I were not rate limited, I could submit the requests asynchronously, but I can't help wondering whether there is a more efficient way.
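
For reference, asynchronous submission with a bounded number of in-flight requests could look roughly like the sketch below. It uses only `java.util.concurrent`; `AsyncTopicCheck`, the `check` predicate, and `maxInFlight` are hypothetical names standing in for whatever call actually queries a user's tweets. It only hides latency and does nothing about the rate limit itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

class AsyncTopicCheck {
    /**
     * Runs the (hypothetical) per-user topic check for every user,
     * with at most maxInFlight checks executing at the same time.
     */
    static Map<String, Boolean> checkAll(List<String> users,
                                         Predicate<String> check,
                                         int maxInFlight) {
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        try {
            // Submit everything up front; the pool size caps concurrency.
            List<CompletableFuture<Boolean>> futures = new ArrayList<>();
            for (String user : users) {
                futures.add(CompletableFuture.supplyAsync(() -> check.test(user), pool));
            }
            // Collect results in the original user order.
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < users.size(); i++) {
                results.put(users.get(i), futures.get(i).join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```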

I've tried aggregating the requests into one query per search depth:
topic AND from:name1 OR from:name2 .... OR from:namei
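
One variation on this idea is to split the user list across several smaller queries and group the `from:` terms in parentheses, so that the mix of AND and OR is not left to the parser's precedence rules. The sketch below assumes that query length is part of the problem, which is a guess rather than a confirmed cause. `QueryBatcher` is a hypothetical helper, not part of Twitter4J, and `maxLength` should be set from the current search documentation rather than taken from this example.

```java
import java.util.ArrayList;
import java.util.List;

class QueryBatcher {
    /**
     * Splits users into queries of the form
     *   topic (from:a OR from:b OR ...)
     * each at most maxLength characters long. A single user whose term
     * does not fit on its own is still emitted as a one-user query.
     */
    static List<String> build(String topic, List<String> users, int maxLength) {
        List<String> queries = new ArrayList<>();
        StringBuilder clause = new StringBuilder();
        for (String user : users) {
            String term = "from:" + user;
            // Length of: topic + " (" + clause + " OR " + term + ")"
            int projected = topic.length() + 3 + clause.length()
                    + (clause.length() == 0 ? 0 : 4) + term.length();
            if (clause.length() > 0 && projected > maxLength) {
                queries.add(topic + " (" + clause + ")");   // flush the full batch
                clause.setLength(0);
            }
            if (clause.length() > 0) clause.append(" OR ");
            clause.append(term);
        }
        if (clause.length() > 0) queries.add(topic + " (" + clause + ")");
        return queries;
    }
}
```

Each returned string would then be sent as a separate search, which trades one oversized request for several small ones that still count against the rate limit.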

This basically explodes: I get a connection reset error from the Twitter API. If I paste the query into the Twitter web page, it just sits for a while and then says "loading tweets seems to be taking awhile."

I also emailed [email protected] to ask for suggestions or an access increase, but have had no response so far.

If anyone has suggestions on how to gather this type of information through the Twitter API, I would very much appreciate it. I am currently using Twitter4J with Java.

Comments (1)

梦里寻她 2024-12-25 14:08:08

Have you tried just using a filtered stream for the topic and building the graph from mentions and retweets? This is quite indirect and will still be slow, but it won't hit any rate limits.

See http://truthy.indiana.edu/ and http://cnets.indiana.edu/groups/nan/truthy
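
A rough sketch of the graph-building half of this suggestion, assuming each streamed tweet has already been reduced to a hypothetical `Tweet` value holding its author, the retweeted author (if any), and the mentioned screen names. These types are illustrative and are not Twitter4J classes. Edges are directed and weighted by the number of interactions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MentionGraph {
    /** Hypothetical, simplified view of one streamed tweet. */
    static final class Tweet {
        final String author;
        final String retweetedAuthor;   // null if the tweet is not a retweet
        final List<String> mentions;

        Tweet(String author, String retweetedAuthor, List<String> mentions) {
            this.author = author;
            this.retweetedAuthor = retweetedAuthor;
            this.mentions = mentions;
        }
    }

    // source user -> (target user -> number of interactions)
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();

    /** Adds one edge per retweet and one per mention in the tweet. */
    void add(Tweet t) {
        if (t.retweetedAuthor != null) link(t.author, t.retweetedAuthor);
        for (String m : t.mentions) {
            // A retweet may also list the original author as a mention;
            // skip it here so the retweet is counted only once.
            if (!m.equals(t.retweetedAuthor)) link(t.author, m);
        }
    }

    private void link(String from, String to) {
        if (from.equals(to)) return;    // ignore self-mentions
        edges.computeIfAbsent(from, k -> new HashMap<>()).merge(to, 1, Integer::sum);
    }

    /** Number of interactions recorded from one user to another. */
    int weight(String from, String to) {
        return edges.getOrDefault(from, Map.of()).getOrDefault(to, 0);
    }
}
```

Because every edge comes from tweets that already matched the topic filter, no per-user search queries are needed; the cost is that the graph reflects interactions rather than follower relationships.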
