Is there a speed difference between sorting by int and sorting by float?

Posted on 2024-12-03 21:22:37

When retrieving entries from a database, does it make a difference whether values are stored as a float or decimal vs. an int when using ORDER BY in a SELECT statement?

Comments (3)

岁月苍老的讽刺 2024-12-10 21:22:37

(Edited) Assuming a 32-bit int and a 32-bit float, both occupy the same space on disk and in memory, so the only difference is in how they are processed.

int should be faster to sort than float because the comparison is simpler: a processor can compare ints in one machine cycle, whereas a float's bits have to be "interpreted" to get a value before comparing (I'm not sure how many cycles that takes, but probably more than one, although some CPUs may have special support for float comparison).
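
One way to check a claim like this is to measure it rather than reason about cycles. The following is only a rough sketch in Python: it times sorting the same values stored as ints and as floats. It measures the Python interpreter, not a database engine or raw CPU instructions, so the numbers say nothing definitive about any RDBMS. The helper name `time_sort` and the data sizes are made up for illustration.

```python
import random
import timeit


def time_sort(values, repeats=5):
    """Return the best wall-clock time (in seconds) to sort a copy of values."""
    return min(timeit.repeat(lambda: sorted(values), number=1, repeat=repeats))


random.seed(0)
# Same numeric values, once as ints and once as floats (sizes are arbitrary).
ints = [random.randrange(1_000_000) for _ in range(50_000)]
floats = [float(v) for v in ints]

int_seconds = time_sort(ints)
float_seconds = time_sort(floats)

print(f"int sort:   {int_seconds:.4f} s")
print(f"float sort: {float_seconds:.4f} s")
```

Whichever way the result comes out on a given machine, the same approach (run the actual ORDER BY query against both column types and compare timings) is more reliable than guessing.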

漫漫岁月 2024-12-10 21:22:37

It depends. You didn't specify the RDBMS, so I can only speak to SQL Server specifically, but data types have different storage costs associated with them. Integer types range from 1 to 8 bytes, decimals from 5 to 17 bytes, and floats from 4 to 8 bytes.

The RDBMS will need to read data pages off disk to find your data (worst case), and it can only fit so many rows on an 8 KB data page. So, if you have 17-byte decimals, each read off disk returns roughly 1/17th the number of rows you would get if you had sized your data correctly and used a tinyint, with a 1-byte cost, to store X.
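
To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes a page of exactly 8192 bytes holding nothing but the value itself, which is a simplification: real pages carry a header and per-row overhead, so actual row counts are lower and the ratio is less extreme. The function name `rows_per_page` is just for illustration.

```python
PAGE_BYTES = 8 * 1024  # assumed page size; ignores page header and row overhead


def rows_per_page(value_bytes, page_bytes=PAGE_BYTES):
    """Number of whole values of the given size that fit on one page."""
    return page_bytes // value_bytes


# Storage sizes taken from the byte ranges mentioned above.
for type_name, size in [
    ("tinyint (1 byte)", 1),
    ("int (4 bytes)", 4),
    ("float (8 bytes)", 8),
    ("decimal (17 bytes)", 17),
]:
    print(f"{type_name:20} {rows_per_page(size):5d} values per page")
```

Under these assumptions a page holds 8192 one-byte values but only 481 seventeen-byte values, which is where the "1/17th" figure comes from.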

That storage cost will have a cascading effect when you go to sort (ORDER BY) your data. The engine will attempt to sort in memory, but if you have a bazillion rows and are starved for memory, it may spill to temp storage for the sort, and you pay that cost over and over.
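
The general idea behind sorting with too little memory can be sketched as follows. This is not how SQL Server implements it; it is a generic illustration of a chunked sort followed by a merge. The function name `sort_with_limited_memory` and the `max_in_memory` parameter are hypothetical, and the sorted runs are kept in Python lists here, whereas a real engine would write each run to temp storage and read it back during the merge.

```python
import heapq


def sort_with_limited_memory(values, max_in_memory):
    """Sort values while only ever sorting max_in_memory items at once."""
    runs = []
    for start in range(0, len(values), max_in_memory):
        # Each chunk fits in "memory", so it can be sorted directly.
        # A real engine would spill this sorted run to temp storage.
        runs.append(sorted(values[start:start + max_in_memory]))
    # Merge the sorted runs; in a real engine this re-reads every run.
    return list(heapq.merge(*runs))


print(sort_with_limited_memory([5, 3, 1, 4, 2], max_in_memory=2))
```

Every value gets written out and read back at least once more than in a purely in-memory sort, so wider data types mean more bytes moved in each of those extra passes.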

Indexes may help, since the data can be stored in sorted order, but again, getting that data into memory may not be as efficient for obese data types.

[edit]

@Bohemian makes a fine point about the CPU efficiency of integer vs. floating-point comparisons, but it is amazingly rare for the CPU to be spiked on a database server. You are far more likely to be constrained by the disk I/O subsystem and memory, which is why my answer focuses on the cost of getting the data into the engine for the sort rather than on the CPU cost of the comparisons.

涫野音 2024-12-10 21:22:37

In general, the choice of datatypes should be driven by whether the datatype is appropriate for storing the values that are required to be stored. If a given datatype is inadequate, it doesn't matter how efficient it is.

In terms of disk I/O, the speed difference is a second-order effect. Don't worry about second-order effects until your design is good with regard to first-order effects.

Correct index design will result in a huge decrease in delays when the rows can be retrieved in sorted order to begin with. However, speeding up that query comes at the cost of slowing down other processes, such as those that modify the indexed data. The trade-off has to be weighed to see whether it's worth it.
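
As a small illustration of an index removing the sort step, the sketch below uses SQLite through Python's `sqlite3` module as a stand-in, since no specific RDBMS was named in the question; other engines expose query plans differently. The table `entries`, the column `score`, and the index name are made up for the example.

```python
import sqlite3


def order_by_plan(conn):
    """Return SQLite's query plan for the ORDER BY query as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT score FROM entries ORDER BY score"
    ).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, score REAL)")
conn.executemany(
    "INSERT INTO entries (score) VALUES (?)",
    [(i * 0.5,) for i in range(1000)],
)

plan_without_index = order_by_plan(conn)

# The index keeps scores in sorted order, at the cost of extra work on writes.
conn.execute("CREATE INDEX idx_entries_score ON entries (score)")
plan_with_index = order_by_plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index, the plan includes a temporary B-tree built just for the ORDER BY; with it, SQLite reads the rows from the index in order and skips that step. Every INSERT or UPDATE on `score` now has to maintain the index as well, which is the trade-off mentioned above.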

In short, worry about the stuff that's going to double your disk I/O or worse before you worry about the stuff that's going to add 10% to your disk I/O.
