How is "@geodist" sorting/searching implemented in Sphinx?
Does @geodist search use any sort of geospatial index (such as an R-tree) for performance?
I'm interested in the case where the anchor is constant and each document has its own latitude/longitude pair stored in radians.
I've tried to figure it out from the Sphinx source code, but failed to find any mention of a spatial index. If no indexes are used for geospatial search, then how is performance ensured?
Does Sphinx do a full scan if no keywords are provided?
Background: We have a dataset of more than 100 million short entries. Some of the newly added items will have a latitude/longitude stored. Millions of entries are added each day. I predict that about 5-10% of newly added entries will have location information.
Our goal is to implement spatial search for location-enabled entries, for queries like "get all entries within a 100-meter radius of the anchor point" and "get the 100 nearest entries around the anchor point", with and without keyword search.
Some googling returned this forum thread, which suggests using an artificial grid-based index to ensure performance. Is this still the case?
Comments (1)
No, Sphinx does not have any built-in geospatial indexing, hence the reason for the tiles (to make a rudimentary geospatial index :)
It really does just do a spherical distance calculation against every row: a full-table scan. It's reasonably quick, because attributes are all held in memory.
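To make the "distance calculation against every row" concrete, here is a rough Python sketch of what such a full scan amounts to. It is not Sphinx's code: the haversine formula, the `EARTH_RADIUS_M` constant, and the `(doc_id, lat, lon)` row layout are all assumptions chosen for illustration, with coordinates in radians as in the question.

```python
import heapq
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius, not taken from Sphinx


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters; all inputs are in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_radius(rows, anchor_lat, anchor_lon, radius_m):
    """Full scan: compute the distance for every row, keep those in range."""
    hits = []
    for doc_id, lat, lon in rows:
        d = haversine_m(anchor_lat, anchor_lon, lat, lon)
        if d <= radius_m:
            hits.append((d, doc_id))
    return sorted(hits)


def nearest(rows, anchor_lat, anchor_lon, k):
    """Full scan again, keeping only the k smallest distances."""
    scored = ((haversine_m(anchor_lat, anchor_lon, lat, lon), doc_id)
              for doc_id, lat, lon in rows)
    return heapq.nsmallest(k, scored)
```

Both query types from the question cost one distance computation per row, so the work grows linearly with the number of location-enabled rows that survive any keyword match.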
Check the source:
http://codesearch.google.com/#vqMBzkK4ih0/src/sphinxexpr.cpp&exact_package=git://github.com/squadette/sphinxsearch.git&q=cos%20sphinxsearch&type=cs&l=1186
The most recent thread discussing this on the Sphinx forum:
http://sphinxsearch.com/forum/view.html?id=8644
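The tile idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scheme from the forum thread: the tile size `TILE_DEG`, the helper names, and the in-memory dictionary are hypothetical, and the anchor is assumed to be away from the poles and the antimeridian. In Sphinx itself the tile identifier would presumably be stored with each document so that a query can restrict matches to a handful of tiles before the distance is computed.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius
TILE_DEG = 0.01             # hypothetical tile size in degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters; all inputs are in radians."""
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def tile_of(lat_rad, lon_rad):
    """Map a position in radians to integer (row, column) tile coordinates."""
    return (math.floor(math.degrees(lat_rad) / TILE_DEG),
            math.floor(math.degrees(lon_rad) / TILE_DEG))


def build_tile_index(rows):
    """Group (doc_id, lat, lon) rows by the tile they fall into."""
    index = defaultdict(list)
    for doc_id, lat, lon in rows:
        index[tile_of(lat, lon)].append((doc_id, lat, lon))
    return index


def candidate_tiles(lat_rad, lon_rad, radius_m):
    """Tiles overlapping an approximate bounding box of the search circle."""
    dlat = radius_m / EARTH_RADIUS_M
    # Longitude spans widen toward the poles; use the box edge farthest
    # from the equator and guard against division by a value near zero.
    widest = min(abs(lat_rad) + dlat, math.pi / 2)
    dlon = dlat / max(math.cos(widest), 1e-6)
    r0, c0 = tile_of(lat_rad - dlat, lon_rad - dlon)
    r1, c1 = tile_of(lat_rad + dlat, lon_rad + dlon)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def search_radius(index, lat_rad, lon_rad, radius_m):
    """Scan only the candidate tiles, then apply the exact distance test."""
    hits = []
    for tile in candidate_tiles(lat_rad, lon_rad, radius_m):
        for doc_id, lat, lon in index.get(tile, ()):
            d = haversine_m(lat_rad, lon_rad, lat, lon)
            if d <= radius_m:
                hits.append((d, doc_id))
    return sorted(hits)
```

The exact distance check still runs, but only on rows in the few tiles that overlap the search area rather than on every row. A "100 nearest" query would need to widen the set of tiles until enough candidates are found, which this sketch does not attempt.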