How to structure a table for fast searching across a large number of columns

Posted on 2024-12-05 17:23:11

I have a table with a large number of columns (~60), which will eventually have a large number of rows (~10 000), and I'm going to need to be able to search efficiently on several column values at once. I'm not sure whether the searches will be exact-match (LIKE 'value', and not LIKE '%value%'), although LIKE 'value%' might be an acceptable compromise.
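To make the requirement concrete, here is a minimal sketch of a search over several columns at once, with either exact or prefix matching. It uses Python's built-in `sqlite3` module as a stand-in for Oracle, and the `people` table, its three columns, the sample rows, and the `search` helper are all hypothetical. Note that SQLite's `LIKE` is case-insensitive for ASCII by default, which is not how Oracle behaves, so the sample data is kept lowercase.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical three-column slice of the ~60-column table.
conn.execute(
    "CREATE TABLE people (obj_id INTEGER PRIMARY KEY, name TEXT, job TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "joe", "engineer", "boston"),
        (2, "bill", "engineer", "bonn"),
        (3, "joan", "editor", "boston"),
    ],
)

# Column names cannot be bound as parameters, so they are checked
# against a whitelist before being placed into the SQL text.
ALLOWED_COLUMNS = {"name", "job", "city"}


def search(conn, criteria, prefix=False):
    """Return obj_ids of rows matching every (column, value) pair in criteria.

    With prefix=False each value must match exactly (column = value).
    With prefix=True each value is treated as a prefix (column LIKE 'value%').
    Values containing % or _ would need escaping; this sketch ignores that.
    """
    clauses, params = [], []
    for column, value in criteria.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        if prefix:
            clauses.append(f"{column} LIKE ?")
            params.append(value + "%")
        else:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = " AND ".join(clauses) or "1=1"  # no criteria: match every row
    sql = f"SELECT obj_id FROM people WHERE {where} ORDER BY obj_id"
    return [row[0] for row in conn.execute(sql, params)]


exact_hits = search(conn, {"job": "engineer", "city": "boston"})
prefix_hits = search(conn, {"name": "jo", "city": "bos"}, prefix=True)
```

An exact match and a prefix match (`'value%'`) can both be answered from an ordinary B-tree index on the column, whereas a leading wildcard (`'%value%'`) generally cannot, which is why the prefix form is the more realistic compromise.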

A few solutions have been proposed. I'm not very familiar with database design principles, so it's not obvious to me which is the best:

1. Index on every column individually. The users will be able to search on any combination of columns, so no more complicated index will work. There will be a lot more reads than writes on the database, so the write-speed slowdown shouldn't be a problem.
2. Make another table just for searching that looks like this:
```
obj_id  col_num  col_name  col_value
-------------------------------------
1       1        'name'    'joe'
1       2        'job'     'engineer'
2       1        'name'    'bill'
```
etc. I think the col_num and col_name columns are redundant, but presumably one is better than the other. I have no idea what this is called, although it sounds like the Entity-Attribute-Value model (see also this question). From what I can tell, the main difference from an EAV model is that this table would not be sparse; all entities will have most or all attributes.
3. Make another table for an inverted index on the first table. I know how to do this in theory, but it would be a huge amount of work. Also, we'd probably lose information about which column each datum is from, which is not great. Finally, this feels like it would be redundant with solution 1, but I don't actually know how table indexes are created.
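For reference, a multi-criteria lookup against the search table from solution 2 could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for Oracle, the table name `search_values`, the index, and the `find_objects` helper are hypothetical, and `col_num` is left out because `col_name` already identifies the attribute.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical search table: one row per (object, attribute) pair.
conn.execute(
    "CREATE TABLE search_values (obj_id INTEGER, col_name TEXT, col_value TEXT)"
)
# One composite index serves every attribute, instead of ~60 separate ones.
conn.execute(
    "CREATE INDEX search_values_idx ON search_values (col_name, col_value, obj_id)"
)
# The three sample rows from the question.
conn.executemany(
    "INSERT INTO search_values VALUES (?, ?, ?)",
    [
        (1, "name", "joe"),
        (1, "job", "engineer"),
        (2, "name", "bill"),
    ],
)


def find_objects(conn, criteria):
    """Return obj_ids that match every (col_name, col_value) pair in criteria."""
    if not criteria:
        return []
    pairs = list(criteria.items())
    # Select rows matching any one criterion, then keep only the objects
    # for which all requested attributes were matched.
    conditions = " OR ".join("(col_name = ? AND col_value = ?)" for _ in pairs)
    params = [item for pair in pairs for item in pair]
    params.append(len(pairs))
    sql = (
        "SELECT obj_id FROM search_values "
        f"WHERE {conditions} "
        "GROUP BY obj_id "
        "HAVING COUNT(DISTINCT col_name) = ? "
        "ORDER BY obj_id"
    )
    return [row[0] for row in conn.execute(sql, params)]


both = find_objects(conn, {"name": "joe", "job": "engineer"})
```

The cost of this layout is that every extra search criterion adds to the grouping work, and every value is stored as text, so typed comparisons such as date or number ranges become awkward.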

Those are the solutions that we have come up with so far. If it's relevant, we're using an Oracle db, which is not really optional, but I have the permissions to refactor the database in any way necessary. So, what is the best solution here? "None of the above" is a totally acceptable answer, of course. None of these tables actually exist yet, so there's nothing to wipe out and remake.

Thanks!
