Site-wide search across multiple object types - database design / code strategy?

Posted on 2024-10-20 02:25:14

I am lost on how best to approach the site search component. I have a user-content site similar to Yelp. People can search for local places, local events, local photos, members, etc. So if I enter "Tom" in the search box, I expect the search to return results from all user objects that match Tom. The word Tom can be anywhere: in a restaurant name, in the description of the restaurant, in a review, in someone's comment, etc.

So if I design this purely using normalized SQL, I will need to join about 15 object tables to scan all the different user objects, and scan multiple columns in each table to search all the fields. I don't know whether this is how it is normally done or whether there is a better way. I have seen tools like Solr/Apache/Elasticsearch, but I am not sure how they fit my use case, and even if I use them I assume I still need to scan all 15 tables and 30-40 columns, correct? My platform is PHP/MySQL. Is there any coding, component architecture, or DB design practice to follow for this? A friend said I should combine all objects into one table, but that won't work, as you can't combine photos, videos, comments, pages, profiles, etc. into one table, so I am lost on how to implement this.

Comments (1)

嘦怹 2024-10-27 02:25:14

Your friend probably meant combining all the searchable fields into one table.

The basic idea would be to create a table that acts as the index. One column is indexed and stores words, whereas the other column contains a list of references to the objects that contain that word in one of their searchable fields (for example, an object may be a picture, and its searchable fields might be its title and comments).
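
To make the word-to-references idea concrete, here is a minimal in-memory sketch in Python. The sample objects, the `tokenize` rule (lowercase alphanumeric words), and all function names are assumptions made for illustration, not part of the original answer; a real site would persist the index in a table rather than a dictionary.

```python
import re
from collections import defaultdict

# Hypothetical sample data: (object_type, object_id, searchable text fields).
OBJECTS = [
    ("restaurant", 1, ["Tom's Diner", "Classic burgers and shakes"]),
    ("review", 7, ["Great place", "Tom served us quickly"]),
    ("photo", 3, ["Sunset over the bay", "No people in this shot"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(objects):
    """Map each word to the set of (type, id) references that contain it."""
    index = defaultdict(set)
    for obj_type, obj_id, fields in objects:
        for field in fields:
            for word in tokenize(field):
                index[word].add((obj_type, obj_id))
    return index


index = build_index(OBJECTS)

# One lookup by word replaces scanning every column of every object table.
matches = sorted(index.get("tom", set()))
print(matches)
```

Searching for "tom" then costs a single lookup in the index, and the returned references say which object tables to read the full rows from.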

The list of references can be stored in many ways. For example, you could use a variable-length string column, say a BLOB, and store in it a JSON-encoded array of the ids and types of the objects, so that you can easily find them afterwards by looking up each id in the table corresponding to its object type.
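
A rough sketch of that storage layout follows, using Python's built-in `sqlite3` module so that it runs without a server. The table name `search_index`, the column names, and the helper functions are hypothetical; the same two-column layout should carry over to MySQL with a BLOB or TEXT column for the reference list.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per word; refs holds a JSON-encoded array of {"type", "id"} entries.
conn.execute(
    "CREATE TABLE search_index (word TEXT PRIMARY KEY, refs TEXT NOT NULL)"
)


def lookup(conn, word):
    """Return the decoded reference list for a word (empty if not indexed)."""
    row = conn.execute(
        "SELECT refs FROM search_index WHERE word = ?", (word,)
    ).fetchone()
    return json.loads(row[0]) if row else []


def add_reference(conn, word, obj_type, obj_id):
    """Add a reference to the word's JSON list, skipping duplicates."""
    refs = lookup(conn, word)
    ref = {"type": obj_type, "id": obj_id}
    if ref not in refs:
        refs.append(ref)
    # INSERT OR REPLACE is SQLite syntax for an upsert.
    conn.execute(
        "INSERT OR REPLACE INTO search_index (word, refs) VALUES (?, ?)",
        (word, json.dumps(refs)),
    )


def group_by_type(refs):
    """Group ids by object type, so each object table is queried only once."""
    grouped = {}
    for ref in refs:
        grouped.setdefault(ref["type"], []).append(ref["id"])
    return grouped


add_reference(conn, "tom", "restaurant", 1)
add_reference(conn, "tom", "review", 7)
add_reference(conn, "tom", "restaurant", 1)  # duplicate, ignored

grouped = group_by_type(lookup(conn, "tom"))
print(grouped)
```

Each key of `grouped` names an object table and each value lists the ids to fetch from it. On MySQL the upsert would more likely be written with `INSERT ... ON DUPLICATE KEY UPDATE`.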

Of course, on any addition, removal, or modification of indexable data, you should update your index accordingly. You can use lazy update techniques that eventually update the index in the background, because most people only expect a search index to be accurate to within maybe a few minutes of the current state of the data. One option for hosting such an index is Apache Cassandra, but I wouldn't use it for small-scale projects, where you don't need distributed databases and such.
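
The lazy update idea could look roughly like the sketch below: writes only queue a change, and a periodic job applies the queue to the index. Everything here (the in-memory structures, `enqueue_change`, `flush_pending`) is a hypothetical illustration; in production the queue would more plausibly be a database table or a job queue processed by a cron task or worker.

```python
import re
from collections import defaultdict, deque

index = defaultdict(set)   # word -> {(type, id), ...}
words_by_object = {}       # (type, id) -> words currently indexed for it
pending = deque()          # queued changes not yet applied to the index


def tokenize(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def enqueue_change(obj_type, obj_id, text):
    """Record a change cheaply; pass text=None when the object was deleted."""
    pending.append((obj_type, obj_id, text))


def flush_pending():
    """Apply all queued changes; a background job would call this periodically."""
    while pending:
        obj_type, obj_id, text = pending.popleft()
        ref = (obj_type, obj_id)
        # Drop the words indexed for the previous version of this object.
        for word in words_by_object.pop(ref, set()):
            index[word].discard(ref)
            if not index[word]:
                del index[word]
        # Index the new version, unless the object was deleted.
        if text is not None:
            words = tokenize(text)
            words_by_object[ref] = words
            for word in words:
                index[word].add(ref)


enqueue_change("review", 7, "Tom served us quickly")
stale = ("review", 7) in index.get("tom", set())       # not indexed yet
flush_pending()
fresh = ("review", 7) in index.get("tom", set())       # visible after the flush

enqueue_change("review", 7, "Service was quick")       # the review is edited
flush_pending()
after_edit = ("review", 7) in index.get("tom", set())  # old words are gone
print(stale, fresh, after_edit)
```

The trade-off is that searches can miss very recent changes until the next flush, which matches the answer's point that a delay of a few minutes is usually acceptable.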
