哪个更好地控制状态和执行查询? 1 个 TINYINT 列、1 个 BIT(8) 列或 8 个 BIT(1) 列
我假装使用位图设置状态(就像这个人) 并对我的表进行按位查询。 我应该使用什么列类型?我将如何执行选择?
这篇文章让我有点担心关于这个想法的低谷。我希望能够对字段进行索引、进行联接以及我对普通字段执行的所有其他操作。
因此,如果我有一个包含以下行的表:
|1234 5678|
|Id|Name|State |
|01| xxx|0111 0001|
|02| yyy|1101 1001|
|03| zzz|0101 0011|
我想取回以下行:
StateColumn 234 = 101 and StateColumn 8 = 1
That would be => (0101 0001)
我应该取回 Id 02 和 03 的行。
进行此类搜索是个好主意还是我只是疯了?
I pretend to use bitmaps set state (like this guy) and make bitwise queries on my tables.
What column types should I use? And how would I perform the selects?
This article got me a little woried about going trough with this idea. I want to be able to index the fields, do joins and everything else I would do with a normal field.
So if I have a table with the lines:
|1234 5678|
|Id|Name|State |
|01| xxx|0111 0001|
|02| yyy|1101 1001|
|03| zzz|0101 0011|
I would want to get back the lines that:
StateColumn 234 = 101 and StateColumn 8 = 1
That would be => (0101 0001)
I should get back the lines with Id 02 and 03.
Is it a good idea to make this kind of searches or am I just crazy?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
虽然位掩码方法除了给朋友留下深刻印象之外确实还有一些用途(可能会降低存储要求),但我强烈建议不要在需要查询的数据上使用它。原因是您无法有效地对其进行索引。大多数(如果不是全部)查询都必须使用完整扫描来解决。很久以前我就对这个问题非常着迷,因为我在数据库中独自在一个太小的数据集上测试了它。添加几十万行、十几个用户,它就无法扩展。
因此,除非您有一些特殊要求,否则我建议您将每条数据放在自己的列(位或整数)中,并根据您的查询需求放置适当的索引(单列或复合列)。
(在我看来正确的)方法的“缺点”是增加存储(由于单独的索引),但除非您有数百万行,否则很难注意到。
如果由于某些原因不适合您,还有其他选项可以利用数据中的模式来建立有效的搜索结构。但它们都是有代价的(灵活性严重受限、多用户环境中的锁定问题等等)。
我的建议:将每条数据存储在它自己的列中。这就是数据库的用途,它将利用数据库的所有优势。除了最异常扭曲的情况之外,这也恰好是所有情况下表现最好的方法。
While a bitmasking approach does have some uses other than impressing friends, (may reduce storage requirements), I strongly advice against using it on data that need to be queried. The reason is that you can't index it efficiently. Most if not all queries have to be resolved using full scans. I was really burned on this one a long time ago, because I tested it on a too small data set while being alone in the database. Add a few hundred thousand rows, a dozen of users and it just doesn't scale up.
Therefore, unless you have some exceptional requirements, I advice you to put each piece of data in its own column (bit or int), along with appropriate indexes (single or compound columns) depending on your query needs.
The "downside" of the (in my opinion correct) approach is increased storage (due to separate indexes) but unless you have millions of rows it's hardly noticable.
If for some reasons that doesn't work for you, there are other options, that exploit patterns in the data to make an efficient search structure. But they all come with a price (severely limited flexibility, locking issues in multiuser environments etcetera).
My advice: Store each piece of data in it own column. This is how the database was intended to be used, and it will leverage all the benefits of a database. This also happens to be the best performing approach in all but the most exceptionally twisted circumstances.
“Do joins”意味着您希望能够选择一个表中 State 列的第 8 位与另一个表中 State 列的第 8 位相匹配的行。
不要那样做。
为每个不同的属性创建一个列。选择正确的数据类型。声明所有相关的完整性约束。为右列建立索引。
这样做,你就可以选择并加入,直到奶牛回家。
"Do joins" implies that you hope to be able to select rows where the 8th bit of the State column in one table matches the 8th bit of the state column in another table.
Don't do that.
Create a column for each distinct attribute. Pick the right data type. Declare all relevant integrity constraints. Index the right columns.
Do that, and you can select and join 'till the cows come home.