需要有关 Sphinx 查询和索引配置的建议

发布于 2024-09-07 04:07:33 字数 761 浏览 2 评论 0原文

下面是我需要在 Sphinx 中索引的表 alt text

新闻有许多项目通过 news_projects 表链接到它......

我的问题是建议将所有项目名称都放在一个文本字段中以便可以搜索吗?

这是我的查询:

//query to get the project names and make all project names appear in one field separated by comma.
SELECT
  news.id
  news.headline,
  news.body,
GROUP_CONCAT(DISTINCT projects.project_name) as related_projects
FROM  news 
LEFT JOIN news_projects on news.id = news_projects.news_id
LEFT JOIN projects on news_projects.project_id = projects.id
GROUP BY news.id

然后,这将输出带有格式为“name,name2,name3”的project_name列的行,

我有点犹豫要使用它还是只是将project_id作为属性。将project_name作为文本绝对有很大帮助,因为可以搜索...

请..我需要您对此的意见..非常感谢!

below are tables I need to index in Sphinx
alt text

News has many projects linked to it via the news_projects table...

My question is it advisable to have all the project names in one text field so that it can be searched on?

This is my query :

//query to get the project names and make all project names appear in one field separated by comma.
SELECT
  news.id
  news.headline,
  news.body,
GROUP_CONCAT(DISTINCT projects.project_name) as related_projects
FROM  news 
LEFT JOIN news_projects on news.id = news_projects.news_id
LEFT JOIN projects on news_projects.project_id = projects.id
GROUP BY news.id

This will then output rows with project_name column formatted as 'name,name2,name3'

Im a bit undecided weather to use this or just make the project_id an attribute.. Having the project_name as text is definitely of big help because it can be searched on...

Please.. I need you opinion on this.. thanks a lot!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

清眉祭 2024-09-14 04:07:42

您还可以使用 sql_joined_field 指定 related_projects 字段。它可能会更快,因为您可以从主查询中删除分组。

sql_query = SELECT id, headline, body FROM news

sql_joined_field = related_projects from query; \
    SELECT news_projects.news_id, projects.project_name \
    FROM news_projects \
    JOIN projects ON projects.id = news_projects.project_id \
    ORDER BY news_projects.news_id ASC

但是,如果您创建增量索引,则必须将增量条件复制到所有连接字段(与范围查询相同),这可能有点痛苦。

You can also specify related_projects field with sql_joined_field. It may be faster because you can remove grouping from the main query.

sql_query = SELECT id, headline, body FROM news

sql_joined_field = related_projects from query; \
    SELECT news_projects.news_id, projects.project_name \
    FROM news_projects \
    JOIN projects ON projects.id = news_projects.project_id \
    ORDER BY news_projects.news_id ASC

If you make delta indexes, though, you'll have to copy delta conditions to all joined fields (same with ranged queries), which can be somewhat of a pain.

三月梨花 2024-09-14 04:07:39

如果项目名称对于搜索有用,那么一定要保留它。不过,我建议使用空格而不是逗号连接(尽管当您使用默认设置时,Sphinx 的底层可能并不重要)。

将项目 ID 作为多值属性 (MVA) 可能也很有用 - 这意味着您可以将搜索结果限制为特定项目内的新闻项。真的没有理由不两者兼得。

If the project name is useful for searching on, then definitely keep it. I would recommend concatenating with a space instead of a comma, though (although it may not matter under the hood for Sphinx when you're using the default settings).

It's probably also useful to have the project ids as a Multi-Value Attribute (MVA) - which means you can limit search results to news items within a specific project. No reason to not have both, really.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文