Thinking Sphinx - sorting by a string attribute gets out of sync when changes are made
I have a "restaurants" table with a "name" column. I've defined the following index:
indexes "REPLACE(UPPER(restaurants.name), 'THE ', '')", :as => :restaurant_name, :sortable => true
... because I want to sort the restaurant names without respect to the prefix "The ".
My problem is that whenever one of these records is updated (in any way) the new record jumps to the top of the sort order. If another record is updated, it also jumps ahead of the rest. I end up with two lists: a list of restaurants that have been updated since the last re-indexing and a list of those that haven't. Each respective list is in alphabetical order, but I don't understand why the overall list is getting segregated this way. I do have a delayed delta index set up, and I assume the issue is related to this.
Comments (2)
Unfortunately, this is a limitation of Sphinx.
To sort by strings, it builds an ordered list of values and converts each one to an integer representing its place in that list. This happens on a per-index basis, so the core index and the delta index each number their values starting from 1. When the first item gets added to the delta index, you end up with 2 records that sort with a 1. The next record means you'll have 2 records that sort with a 2, etc.
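The collision described above can be reproduced in plain Ruby without Sphinx. This is only a sketch of the behaviour as this answer describes it, not Sphinx's actual implementation; the `ordinals` helper and the restaurant names are made up for illustration.

```ruby
# Hypothetical helper: gives each distinct string its 1-based rank
# within a single index, as the answer above describes.
def ordinals(names)
  sorted = names.uniq.sort
  names.map { |name| [name, sorted.index(name) + 1] }.to_h
end

# The core index holds every record as of the last full re-index.
core = ordinals(["ALPHA CAFE", "MANGO HOUSE", "ZEBRA GRILL"])

# One record was updated afterwards, so it is also in the delta index,
# which ranks its values independently of the core index.
delta = ordinals(["MANGO HOUSE"])

puts "core:  #{core.inspect}"
puts "delta: #{delta.inspect}"

# "MANGO HOUSE" has rank 2 in the core index but rank 1 in the delta index,
# so its delta copy ties with "ALPHA CAFE" instead of sorting after it.
```

Because the two ranks come from separate lists, comparing them across indexes says nothing about the real alphabetical order.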
I handled this by creating my own sort method. You can create an integer value based on some part of the string. I took a cheap approach and just used the character code of the first character. You could get more sophisticated with an algorithm that converts more characters of the phrase. When I implemented search with another search engine, we used 6 characters as our entire ordering method, and results rarely came out of order. If you do this and make it a secondary order-by, it will help resolve your delta issues.
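A minimal sketch of that idea in Ruby, assuming the key is built from the first 6 bytes of the normalized name. The `prefix_sort_key` helper is hypothetical, and the normalization simply mirrors the `REPLACE(UPPER(...), 'THE ', '')` expression from the question. Six bytes need 48 bits, so the value would have to be stored in a 64-bit integer column or attribute.

```ruby
# Hypothetical helper: packs the first `length` bytes of the normalized
# name into one integer whose numeric order matches the byte order of
# that prefix. Shorter names are padded with zero bytes so they sort first.
def prefix_sort_key(name, length = 6)
  # Same normalization as the index expression in the question.
  normalized = name.upcase.gsub("THE ", "")
  bytes = normalized.bytes.first(length)
  bytes += [0] * (length - bytes.size)
  bytes.reduce(0) { |key, byte| key * 256 + byte }
end

names = ["The Zebra", "Mango", "apple"]
puts names.sort_by { |name| prefix_sort_key(name) }.inspect
```

Unlike the per-index string ordinals, this value depends only on the record itself, so it stays comparable between the core and delta indexes. Names that share their first 6 bytes get the same key, which is why it only approximates a full alphabetical sort.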