执行语句是否总是占用内存来存储结果集?

发布于 2024-10-01 15:13:56 字数 334 浏览 11 评论 0原文

一位同事告诉我,执行 SQL 语句总是由数据库服务器将数据放入 RAM/交换区。因此选择大型结果集是不切实际的。

我认为这样的代码

my $sth = $dbh->prepare('SELECT million_rows FROM table');
while (my @data = $sth->fetchrow) {
    # process the row
}

会逐行检索结果集,而不会将其加载到 RAM 中。 但我在 DBI 或 MySQL 文档中找不到任何对此的引用。结果集是如何真正创建和检索的?对于简单的选择和连接来说,它的工作方式是否相同?

I was told by a colleague that executing an SQL statement always puts the data into RAM/swap by the database server. Thus it is not practical to select large result sets.

I thought that such code

my $sth = $dbh->prepare('SELECT million_rows FROM table');
while (my @data = $sth->fetchrow) {
    # process the row
}

retrieves the result set row by row, without it being loaded to RAM.
But I can't find any reference to this in DBI or MySQL docs. How is the result set really created and retrieved? Does it work the same for simple selects and joins?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

被翻牌 2024-10-08 15:13:56

你的同事是对的。

默认情况下,perl 模块 DBD::mysql 使用 mysql_store_result,它确实读取所有 SELECT 数据并将其缓存在 RAM 中。除非您更改默认值,否则当您在 DBI 中逐行获取时,它只是从内存缓冲区中读取它们。

这通常是您想要的,除非您有非常非常大的结果集。否则,在您从 mysqld 获取最后一个数据之前,它必须保持该数据准备就绪,我的理解是,它会导致对相同行的写入阻塞(块?表?)。

请记住,现代机器拥有大量 RAM。百万行结果集通常没什么大不了的。即使每行非常大(1 KB),也只需要 1 GB RAM 加上开销。

如果您要处理数百万行 BLOB,也许您确实需要 mysql_use_result,或者您希望通过逐步使用 LIMIT x,y 来分块选择这些行。

有关详细信息,请参阅 perldoc DBD::mysql 中的 mysql_use_result 和 mysql_store_result。

Your colleague is right.

By default, the perl module DBD::mysql uses mysql_store_result which does indeed read in all SELECT data and cache it in RAM. Unless you change that default, when you fetch row-by-row in DBI, it's just reading them out of that memory buffer.

This is usually what you want unless you have very very large result sets. Otherwise, until you get the last data back from mysqld, it has to hold that data ready and my understanding is that it causes blocks on writes to the same rows (blocks? tables?).

Keep in mind, modern machines have a lot of RAM. A million-row result set is usually not a big deal. Even if each row is quite large at 1 KB, that's only 1 GB RAM plus overhead.

If you're going to process millions of rows of BLOBs, maybe you do want mysql_use_result -- or you want to SELECT those rows in chunks with progressive uses of LIMIT x,y.

See mysql_use_result and mysql_store_result in perldoc DBD::mysql for details.

时光病人 2024-10-08 15:13:56

这是不正确的(如果我们谈论的是数据库服务器本身,而不是客户端层)。

MySQL 可以缓冲整个结果集,但这并不是一定要做的,而且如果做了,也不一定在 RAM 中。

如果您使用内联视图 (SELECT FROM (SELECT …))、查询需要排序(显示为​​ using filesort)或计划,结果集将被缓冲需要创建临时表(在查询计划中显示为 using tempor)。

即使使用临时MySQL也只会在其大小不超过tmp_table中设置的限制时将表保留在内存中。当表增长超过此限制时,它将从内存转换为MyISAM并存储在磁盘上。

不过,您可以通过将 SQL_BUFFER_RESULT 指令附加到最外层的 SELECT 来显式指示 MySQL 缓冲结果集。

请参阅文档更多细节。

This is not true (if we are talking about the database server itself, not client layers).

MySQL can buffer the whole resultset, but this is not necessarily done, and if done, not necessarily in RAM.

The resultset is buffered if you are using inline views (SELECT FROM (SELECT …)), the query needs to sort (which is shown as using filesort), or the plan requires creating a temporary table (which is shown as using temporary in the query plan).

Even if using temporary, MySQL only keeps the table in memory when its size does not exceed the limit set in tmp_table. When the table grows over this limit, it is converted from memory into MyISAM and stored on disk.

You, though, may explicitly instruct MySQL to buffer the resultset by appending SQL_BUFFER_RESULT instruction to the outermost SELECT.

See the docs for more detail.

╄→承喏 2024-10-08 15:13:56

不,事情不是这样的。

数据库不会在 RAM/交换区中保存行。

然而,它会尝试,并且 mysql 在这里努力尝试,尽可能多地缓存(索引、结果等......)。您的 mysql 配置为不同类型的缓存(针对不同类型的存储引擎)提供了可用内存缓冲区的值 - 您不应该允许此缓存交换。

测试一下
底线-仅使用客户端测试它应该很容易(我不知道perl的dbi,它可能,但我怀疑它,正在做一些强制mysql在准备时加载所有内容的事情)。无论如何...测试一下:

如果您实际上在 SELECT SQL_NO_CACHE Million_rows FROM table 上发出准备,然后仅从数百万行中获取几行。
然后,您应该将性能与SELECT SQL_NO_CACHE only_fetched_rows FROM table进行比较,看看效果如何。
如果性能相当(并且速度很快),那么我相信您可以揭穿您同事的虚张声势。

另外,如果您启用实际发送到 mysql 的语句的日志,并给我们一份记录,那么我们(非 perl 人员)可以就 mysql 的作用给出更明确的答案。

No, that is not how it works.

Database will not hold rows in RAM/swap.

However, it will try, and mysql tries hard here, to cache as much as possible (indexes, results, etc...). Your mysql configuration gives values for the available memory buffers for different kinds of caches (for different kinds of storage engines) - you should not allow this cache to swap.

Test it
Bottom line - it should be very easy to test this using client only (I don't know perl's dbi, it might, but I doubt it, be doing something that forces mysql to load everything on prepare). Anyway... test it:

If you actually issue a prepare on SELECT SQL_NO_CACHE million_rows FROM table and then fetch only few rows out of millions.
You should then compare performance with SELECT SQL_NO_CACHE only_fetched_rows FROM table and see how that fares.
If the performance is comparable (and fast) then I believe that you can call your colleague's bluff.

Also if you enable log of the statements actually issued to mysql and give us a transcript of that then we (non perl folks) can give more definitive answer on what would mysql do.

残月升风 2024-10-08 15:13:56

我对此不是很熟悉,但在我看来,DBD::mysql 可以根据 mysql_use_result 属性预先获取所有内容,也可以仅根据需要获取所有内容。请参阅 DBD::mysql 和 MySQL 文档。

I am not super familiar with this, but it looks to me like DBD::mysql can either fetch everything up front or only as needed, based on the mysql_use_result attribute. Consult the DBD::mysql and MySQL documentation.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文