Twitter Live Search
I've been trying to reverse engineer Twitter Live Search, and maybe we could discuss it here. I'm talking about the feature where tweets show up in search results almost immediately, some marked as recent as "1 sec ago". I'm trying to understand how the following might work:
- There must be some layer between the moment a user tweets and the moment the index is updated. Is this layer MySQL, or some other caching layer (memcached, Cassandra)? Maybe...
- Indexing - How might the index updates happen? Surely they can't be rebuilding the index from scratch each time?
- Indexing - There must be a distributed index here. How do they update all the indexes without serving stale data from one index and the latest data from another?
- Indexing - Or does it even matter if that happens? Honestly, I don't think so :) Which user would notice...
Does anybody have anything interesting to add or discuss? I am just trying to understand...
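To make the indexing questions above concrete, here is a minimal sketch of one way an index could take updates incrementally instead of being rebuilt: each new tweet is appended to in-memory posting lists, tweets are spread over shards by ID, and a query merges the newest hits from every shard. This is only a guess at the general shape, not how Twitter actually does it; `LiveIndexShard`, `ShardedLiveIndex`, and the modulo sharding rule are all hypothetical names and choices made for illustration.

```python
import heapq
from collections import defaultdict
from itertools import islice


class LiveIndexShard:
    """One in-memory shard. Postings are appended to, never rebuilt."""

    def __init__(self):
        # term -> list of (timestamp, tweet_id) pairs
        self.postings = defaultdict(list)

    def add(self, tweet_id, timestamp, text):
        # Index each distinct lowercase term of the tweet once.
        for term in set(text.lower().split()):
            self.postings[term].append((timestamp, tweet_id))

    def search(self, term, limit=10):
        hits = self.postings.get(term.lower(), [])
        # Newest first; sorting here tolerates out-of-order arrival.
        return sorted(hits, reverse=True)[:limit]


class ShardedLiveIndex:
    """Routes writes to one shard and fans queries out to all shards."""

    def __init__(self, num_shards=4):
        self.shards = [LiveIndexShard() for _ in range(num_shards)]

    def add(self, tweet_id, timestamp, text):
        # Assumption: shard by tweet ID so each tweet lives in one shard.
        shard = self.shards[tweet_id % len(self.shards)]
        shard.add(tweet_id, timestamp, text)

    def search(self, term, limit=10):
        per_shard = [shard.search(term, limit) for shard in self.shards]
        # Each per-shard list is already newest-first, so merge lazily.
        merged = heapq.merge(*per_shard, reverse=True)
        return [tweet_id for _, tweet_id in islice(merged, limit)]


if __name__ == "__main__":
    index = ShardedLiveIndex(num_shards=2)
    index.add(1, 100, "hello world")
    index.add(2, 101, "Hello again")
    print(index.search("hello"))
```

Because each tweet lives in exactly one shard, a shard that lags behind only delays that shard's newest tweets. The merged result is never inconsistent, just slightly less fresh, which fits the hunch that most users would not notice.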
Comments (1)
Interesting indeed, but I guess it's more of an "architecture" question than a real programming question.
But FYI, there's a lot of information at High Scalability: posts tagged with twitter
Do they keep all tweets? My guess is they just throw them away after a while, and surely they don't need ACID properties?
And I wouldn't trust those timestamps if I were you :)
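If the guess about throwing tweets away after a while is right, one cheap way to do it would be to keep the live index in time buckets and drop whole buckets once they fall out of the retention window. The sketch below shows that idea only; `TimeBucketedIndex`, `bucket_seconds`, and `max_buckets` are made-up names and parameters, and nothing here is confirmed to match what Twitter does.

```python
from collections import defaultdict


class TimeBucketedIndex:
    """Keeps postings in fixed-width time buckets and drops old buckets."""

    def __init__(self, bucket_seconds=3600, max_buckets=24):
        self.bucket_seconds = bucket_seconds
        self.max_buckets = max_buckets
        # bucket start time -> {term: [tweet_id, ...]}
        self.buckets = {}

    def add(self, tweet_id, timestamp, text):
        start = timestamp - timestamp % self.bucket_seconds
        bucket = self.buckets.setdefault(start, defaultdict(list))
        for term in set(text.lower().split()):
            bucket[term].append(tweet_id)
        self._expire(newest=max(self.buckets))

    def _expire(self, newest):
        # Keep only the newest max_buckets windows; dropping a bucket
        # discards all of its postings at once, with no per-tweet deletes.
        cutoff = newest - self.bucket_seconds * (self.max_buckets - 1)
        for start in [s for s in self.buckets if s < cutoff]:
            del self.buckets[start]

    def search(self, term):
        term = term.lower()
        hits = []
        # Walk buckets newest first; within a bucket, latest insert first.
        for start in sorted(self.buckets, reverse=True):
            hits.extend(reversed(self.buckets[start].get(term, [])))
        return hits


if __name__ == "__main__":
    index = TimeBucketedIndex(bucket_seconds=10, max_buckets=2)
    index.add(1, 5, "old news")
    index.add(2, 12, "fresh news")
    index.add(3, 25, "latest news")
    print(index.search("news"))
```

Expiring by bucket also sidesteps the ACID point: nothing is updated in place, so there is no transaction to coordinate, only appends and bulk drops.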