Storing sparse numerical data (e.g. an inverted index): are there any conventions?
Is there an accepted way of storing and accessing sparse numerical data (such as a search engine's inverted index / term-by-document matrix)? An RDBMS seems inappropriate for this kind of data, but it would be good to have it stored in some kind of database (saved to disk, running as a server, etc.). Is there an accepted solution for this kind of problem (such as an existing database capable of supporting this kind of model)? Does anyone know how Google stores and accesses its indexes so fast?
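To make the data model concrete, here is a minimal in-memory sketch of the kind of structure being asked about: an inverted index that stores only the nonzero entries of a term-by-document matrix. The function names (`build_inverted_index`, `lookup`) and the sample documents are hypothetical and purely illustrative; this is not how any particular search engine or database implements it, and it does not address persistence to disk or running as a server.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a {doc_id: count} posting dict.

    Only nonzero (term, document) pairs are stored, which is what
    makes this a sparse representation of the term-by-document matrix.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization (lowercase + whitespace split), assumed for illustration.
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


def lookup(index, term):
    """Return the postings for a term, or an empty dict if the term is unseen."""
    return index.get(term.lower(), {})


# Hypothetical sample documents keyed by document id.
docs = {
    1: "sparse data storage",
    2: "sparse matrix of terms",
    3: "inverted index storage",
}
index = build_inverted_index(docs)
```

Calling `lookup(index, "sparse")` returns the documents containing that term together with their counts, so a query touches only the postings for the terms it mentions rather than a full dense matrix row.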
Comments (1)
Have a look here for more info on Google and links to more info.