是否可以 O(1) 访问数据库行?

发布于 2024-10-07 10:04:00 字数 189 浏览 0 评论 0原文

我有一个使用自动增量字段(ID)作为主键的表。该表仅用于追加,不会删除任何行。表被设计为具有恒定的行大小。

因此,我预计使用任何值作为 ID 的访问时间为 O(1),因为很容易计算在文件中查找的确切位置 (ID*row_size),不幸的是事实并非如此。

我正在使用 SQL Server。
有可能吗?

谢谢

I have an table which use an auto-increment field (ID) as primary key. The table is append only and no row will be deleted. Table has been designed to have a constant row size.

Hence, I expected to have O(1) access time using any value as ID since it is easy to compute exact position to seek in file (ID*row_size), unfortunately that is not the case.

I'm using SQL Server.
Is it even possible ?

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

风苍溪 2024-10-14 10:04:00

因此,我期望拥有 O(1) 访问权限
使用任何值作为 ID 的时间,因为它是
易于计算要寻找的精确位置
在文件中(ID*row_size),

啊。不。即使没有删除,自动增量也不能保证没有漏洞。孔 = 通过索引查找。因此:你的假设是错误的。

Hence, I expected to have O(1) access
time using any value as ID since it is
easy to compute exact position to seek
in file (ID*row_size),

Ah. No. Autoincrement does not - even without deletions -guarantee no holes. Holes = seek via index. Ergo: your assumption is wrong.

半暖夏伤 2024-10-14 10:04:00

我想对你来说最重要的是表演。
数据库使用索引来访问写入磁盘上的记录。

通常这是通过 B+ 树索引来完成的,该索引为 logbn,其中内部节点的 b 通常在 100 到 200 之间(针对块大小进行了优化,请参阅 ref)

严格来说这仍然是对数性能,但考虑到相当多的记录,我们可以说一些百万,可以通过 3 到 4 个步骤到达叶节点,加上查询计划、会话启动、锁定等的所有开销(如果您需要多用户、符合 ACID 的数据管理系统,无论如何您都会有这些开销)肯定是出于与恒定时间相当的所有实际原因。

I guess the thing that matters to you is the performance.
Databases use indexes to access records which are written on the disk.

Usually this is done with B+ tree indexes, which are logbn where b for internal nodes is typically between 100 and 200 (optimized to block size, see ref)

This is still strictly speaking logarithmic performance, but given decent number of records, let's say a few million, the leaf nodes can be reached in 3 to 4 steps and that, together with all the overhead for query planning, session initiation, locking, etc (that you would have anyway if you need multiuser, ACID compliant data management system) is certainly for all practical reasons comparable to constant time.

一袭水袖舞倾城 2024-10-14 10:04:00

好消息是,索引读取的时间复杂度为 O(log(n)),对于较大的 n 值,它非常接近 O(1)。也就是说,在这种情况下,O 符号并不是很有用,实际的计时更有意义。

The good news is that an indexed read is O(log(n)) which for large values of n gets pretty close to O(1). That said in this context O notation is not very useful, and actual timings are far more meanigful.

临走之时 2024-10-14 10:04:00

即使可以直接寻址行,您的查询仍然必须通过客户端和服务器协议栈并执行各种查找和内存分配,然后才能给出您想要的结果。看起来你在期待一些不切实际的东西。这里真正的问题是什么? SQL Server 对您来说不够快吗?如果是这样,您可以使用许多选项来提高性能,但直接在文件中查找地址不是其中之一。

Even if it were possible to address rows directly, your query would still have to go through the client and server protocol stacks and carry out various lookups and memory allocations before it could give the result you want. It seems like you are expecting something that isn't even practical. What is the real problem here? Is SQL Server not fast enough for you? If so there are many options you can use to improve performance but directly seeking an address in a file is not one of them.

懒的傷心 2024-10-14 10:04:00

不可能。 SQL Server根据键和索引值将数据组织成树状结构; DB 意义上的“索引”更像是参考书的索引,而不是像数组或列表这样的索引数据结构。最好的情况是,在搜索索引值时可以获得对数性能(PK 通常被视为索引)。最坏的情况是对非索引列进行表扫描,这是线性的。在数据库变得非常大之前,针对精心设计的表进行精心设计的查询的查找时间与通过网络甚至命名管道发送查询所需的时间相比将显得苍白无力。

Not possible. SQL Server organizes data into a tree-like structure based on key and index values; an "index" in the DB sense is more like a reference book's index and not like an indexed data structure like an array or list. At best, you can get logarithmic performance when searching on an indexed value (PKs are generally treated as an index). Worst-case is a table scan for a non-indexed column, which is linear. Until the database gets very large, the seek time of a well-designed query against a well-designed table will pale in comparison to the time required to send it over the network or even a named pipe.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文