mysql sphinx生成唯一id

发布于 2024-10-15 08:07:23 字数 1140 浏览 6 评论 0原文

我将我的数据库分为 8 个部分,其中每个部分都包含带有表 user_data 的数据库,为了获得更好的搜索性能,我使用 sphinx 来索引所有这些数据,但我遇到了一个问题...因为表 user_data 没有任何唯一字段来表示每个部分行,但它是一个一对多的表,我无法正确运行我的 sphinx 索引,因为它需要唯一的 id,并且这种方式可以解决重复的 id ...知道我该如何解决这个问题吗?或者通过来自不同段的所有子索引生成唯一的ID?

示例:

SELECT user_id, item_id, info
FROM user_data

返回类似以下内容:

+----------+-----------------------+
| user_id  | item_id    | info     |
+----------+-----------------------+
| 10       | 151        | asdf     |
| 10       | 152        | test     |
| 11       | 151        | 545      |
| 12       | 151        | sdfsd    |
| 12       | 152        | eewwe    |
| 12       | 153        | dfsd     |

但我必须知道

+----------+-----------------------------+
| user_id  | item_id    | info     | id  |
+----------+-----------------------------+
| 10       | 151        | asdf     | 1   |
| 10       | 152        | test     | 2   |
| 11       | 151        | 545      | 3   |
| 12       | 151        | sdfsd    | 4   |
| 12       | 152        | eewwe    | 5   |
| 12       | 153        | dfsd     | 6   |

id 当然在所有段中必须是唯一的

I have my database segmented in 8 parts, where each part contains database with table user_data, for better search performance im using sphinx to index all those data but im come accross one problem ... since table user_data dont have any unique field to represent each row but its a 1 to many table i have trouble to run my sphinx index correctly since it requires unique id and this way resolve in duplicate ids ... any idea how can i workaround this? or generate unique id throught all sub indexes from different segments?

example:

SELECT user_id, item_id, info
FROM user_data

Which returns something like:

+----------+-----------------------+
| user_id  | item_id    | info     |
+----------+-----------------------+
| 10       | 151        | asdf     |
| 10       | 152        | test     |
| 11       | 151        | 545      |
| 12       | 151        | sdfsd    |
| 12       | 152        | eewwe    |
| 12       | 153        | dfsd     |

but i have to get

+----------+-----------------------------+
| user_id  | item_id    | info     | id  |
+----------+-----------------------------+
| 10       | 151        | asdf     | 1   |
| 10       | 152        | test     | 2   |
| 11       | 151        | 545      | 3   |
| 12       | 151        | sdfsd    | 4   |
| 12       | 152        | eewwe    | 5   |
| 12       | 153        | dfsd     | 6   |

of course id must be unique throght all segments

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

妄断弥空 2024-10-22 08:07:23

首先,您应该在操作查询之前设置来设置变量,

sql_query_pre = SET @a := 1;

然后使用此变量来获取虚构的自动增量

sql_query = SELECT @a := @a + 1 AS id, user_id, item_id, info FROM user_data

first of all you should set before action query to set your variable

sql_query_pre = SET @a := 1;

then use this variable to get fictional auto increment

sql_query = SELECT @a := @a + 1 AS id, user_id, item_id, info FROM user_data
雨后咖啡店 2024-10-22 08:07:23

我不熟悉 Sphinx,但如果您希望创建跨表唯一的 id,在您的情况下:

  1. 一种选择是使用 UUID 作为所有表的唯一索引——它们发生冲突的可能性很小。
  2. 另一种选择是,如果您知道表的最大大小,则仅使用该范围内的数字加上偏移量。例如,表 1 的 id:1 - 10000,表 2 的 id:10001 - 20000 等。您甚至可以将 id 字段设置为 AUTO_INCRMENT 并将其起始编号设置为特定范围的开始。

I'm unfamiliar with Sphinx, but if you're looking to create ids that are unique across tables, in your case:

  1. One option is to use a UUID as a unique index on all the tables -- the chances of them colliding are minute.
  2. Another option is, if you know the max size of a table, to only use numbers in that range plus an offset. E.g., Table 1's ids: 1 - 10000, Table 2's ids: 10001 - 20000, etc. You can even set the id fields to be AUTO_INCREMENT and set their start numbers at the beginning of the particular range.
记忆之渊 2024-10-22 08:07:23

您可以在索引时执行类似的操作:

SELECT user_id + 10 * 1 AS id, 1 ASegment_id, itme_id, info FROM user_data_1

...添加 segment_id。您将拥有其中 8 个,因此索引查询将类似于:

SELECT (user_id + 10) * 1 AS id, 1 AS segment_id, itme_id, info FROM user_data_1
UNION
SELECT (user_id + 10) * 2 AS id, 2 AS segment_id, itme_id, info FROM user_data_2
UNION
SELECT (user_id + 10) * 3 AS id, 3 AS segment_id, itme_id, info FROM user_data_3
UNION
SELECT (user_id + 10) * 4 AS id, 4 AS segment_id, itme_id, info FROM user_data_4
UNION
SELECT (user_id + 10) * 5 AS id, 5 AS segment_id, itme_id, info FROM user_data_5
UNION
SELECT (user_id + 10) * 6 AS id, 6 AS segment_id, itme_id, info FROM user_data_6
UNION
SELECT (user_id + 10) * 7 AS id, 7 AS segment_id, itme_id, info FROM user_data_7
UNION
SELECT (user_id + 10) * 8 AS id, 8 AS segment_id, itme_id, info FROM user_data_8

然后,当您查询 sphinx 并返回 ID 时,只需将 id 除以 segment_id 并减去 10。这样所有的 id 在 sphinx 中都是唯一的。只需确保属性类型可以处理您要索引的 id 的大小即可。

You could do something like this while indexing:

SELECT user_id + 10 * 1 AS id, 1 AS segment_id, itme_id, info FROM user_data_1

... adding a segment_id. You would have eight of these, so the indexing query would look something like:

SELECT (user_id + 10) * 1 AS id, 1 AS segment_id, itme_id, info FROM user_data_1
UNION
SELECT (user_id + 10) * 2 AS id, 2 AS segment_id, itme_id, info FROM user_data_2
UNION
SELECT (user_id + 10) * 3 AS id, 3 AS segment_id, itme_id, info FROM user_data_3
UNION
SELECT (user_id + 10) * 4 AS id, 4 AS segment_id, itme_id, info FROM user_data_4
UNION
SELECT (user_id + 10) * 5 AS id, 5 AS segment_id, itme_id, info FROM user_data_5
UNION
SELECT (user_id + 10) * 6 AS id, 6 AS segment_id, itme_id, info FROM user_data_6
UNION
SELECT (user_id + 10) * 7 AS id, 7 AS segment_id, itme_id, info FROM user_data_7
UNION
SELECT (user_id + 10) * 8 AS id, 8 AS segment_id, itme_id, info FROM user_data_8

Then when you query sphinx and get back the IDs, just undo the arithmetic by dividing the id by segment_id and subtracting 10. This way all the ids will be unique within sphinx. Just make sure the attribute type can handle the size of the ids you'll be indexing.

娇女薄笑 2024-10-22 08:07:23

正如另一个答案中建议的那样,使用 UUID。但sphinx不能使用UUID作为id。你需要一个 INT。因此使用UUID_SHORT,然后你就有一个唯一的整数(对于mysql)。如果这不能开箱即用(例如,如果您正在使用 Ubuntu-11.04),您将收到如下错误:

WARNING: DOCID_MAX document_id, skipping

您将需要使用 –enable-id64 编译 sphinx 源代码,或者只需访问 sphinx 网站并获取最新的软件包(使用 –enable-id64 编译)。 此博客条目

As proposed in another answer saying to use UUID. But sphinx can not use UUID as id. You will need an INT. Therefor use UUID_SHORT and then you have a unique integer (for mysql). If this does not work out of the box (e.g. if you aare using Ubuntu-11.04), you will get an error like this:

WARNING: DOCID_MAX document_id, skipping

You will need to compile sphinx source with –enable-id64, or just go to sphinx website and get an up to date package (which is compiled with –enable-id64). A more complete example of this indexing method is given in this blog entry

悲歌长辞 2024-10-22 08:07:23

我们使用 crc32(uuid_short()) 来实现 sphinx 的 32 位。这在大多数情况下都有效!办公室。不能依赖 32 位摘要

We are using crc32(uuid_short()) for 32 bit implementations of sphinx. This works, most of the time! Ofc. one cannot rely upon a 32 bit digest of a

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文