Twitter Live Search

Posted on 2024-09-18 00:56:00

I am trying to reverse engineer Twitter Live Search, and maybe we could discuss it here. I am talking about the feature where tweets show up in search results almost immediately, labelled as recently as "1 sec ago". I am trying to understand how the following might work:

  1. There must be some layer between the moment the user tweets and the moment the index is updated. Is this layer MySQL or some other caching layer (memcached, Cassandra)? Maybe...
  2. Indexing: How might the index updates happen? Surely they can't rebuild the index from scratch each time?
  3. Indexing: There must be a distributed index here. How do you update all the indexes without serving stale data from one index and the latest data from another?
  4. Indexing: Or does it even matter if that happens? Honestly, I don't think so :) Which user would notice...
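
To make question 2 concrete, here is a minimal sketch of an inverted index that is updated in place as each post arrives instead of being rebuilt. It is only an illustration of the general technique, not a description of how Twitter does it; the `LiveIndex` class and its method names are hypothetical, and everything lives in one process's memory.

```python
from collections import defaultdict


class LiveIndex:
    """Toy in-memory inverted index, updated per post and never rebuilt.

    Hypothetical illustration only; not Twitter's actual design.
    """

    def __init__(self):
        # term -> list of doc ids in arrival order (oldest first)
        self.postings = defaultdict(list)
        self.docs = {}
        self.next_id = 0

    def add(self, text):
        """Index one new post by appending its id to each term's posting list."""
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = text
        for term in set(text.lower().split()):
            self.postings[term].append(doc_id)
        return doc_id

    def search(self, query, limit=10):
        """Return the texts that contain every query term, newest first."""
        terms = query.lower().split()
        if not terms:
            return []
        # .get avoids creating empty posting lists for unknown terms.
        candidate_sets = [set(self.postings.get(t, ())) for t in terms]
        hits = set.intersection(*candidate_sets)
        return [self.docs[d] for d in sorted(hits, reverse=True)[:limit]]


index = LiveIndex()
index.add("hello world")
index.add("live search is fun")
index.add("hello live search")
print(index.search("live search"))
```

Each `add` touches only the posting lists of the new post's terms, so a post is searchable as soon as the call returns. A distributed version would presumably run one such index per shard and merge results, in which case a shard that lags slightly only delays a few very recent posts, which is the situation question 4 asks about.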

Does anybody have anything interesting to add or discuss? I am just trying to understand...

Comments (1)

雪若未夕 2024-09-25 00:56:00

Interesting indeed, but I guess it's more of an "architecture" question, and not really a programming question.

But FYI, there's a lot of information at High Scalability: see the posts tagged with twitter.

Do they keep all tweets? My guess is that they just throw them away after a while, and surely they don't need ACID properties?
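
To illustrate that guess (and it is only a guess, not a known fact about Twitter), here is a small sketch of a store that keeps posts for a fixed time window and drops older ones, with no transactions or durability at all. The `RecentPosts` class and the `max_age_seconds` parameter are hypothetical names, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class RecentPosts:
    """Keeps only posts newer than max_age_seconds (hypothetical policy)."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self.posts = deque()  # (timestamp, text) pairs, oldest on the left

    def add(self, timestamp, text):
        """Store a post, then evict anything that fell out of the window."""
        # Assumes timestamps arrive in non-decreasing order.
        self.posts.append((timestamp, text))
        self.evict(timestamp)

    def evict(self, now):
        """Drop posts older than the retention window."""
        while self.posts and now - self.posts[0][0] > self.max_age_seconds:
            self.posts.popleft()

    def search(self, term):
        """Linear scan for a whole-word match, newest first."""
        term = term.lower()
        return [
            text
            for _, text in reversed(self.posts)
            if term in text.lower().split()
        ]


store = RecentPosts(max_age_seconds=60)
store.add(0, "old news")
store.add(50, "fresh news")
store.add(100, "latest news")  # evicts the post from t=0
print(store.search("news"))
```

Because old entries are simply discarded and nothing is written durably, a crash would lose recent posts from this store, which is the kind of trade-off that giving up ACID guarantees implies.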

And I wouldn't trust those timestamps if I were you :)
