Saving memory when fetching large result sets with PDO
I have written a tool for database replication in PHP. It's working fine but there's one issue:
I'm using PDO to connect to the different databases to keep it independent of any specific RDBMS, which is crucial to this application.
The tool does some analysis on the tables to decide how to convert certain types and some other stuff. Then it pretty much does a "SELECT * FROM <tablename>" to get the rows that need to be copied. The result sets are fairly large (about 50k rows in some tables).
After that it iterates over the result set in a while loop with PDOStatement::fetch(), does some type conversion and escaping, builds an INSERT statement and feeds that to the target database.
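For reference, the loop described above might look roughly like the following minimal sketch. The table name `people`, its columns, and the `strtoupper` conversion step are hypothetical, and two in-memory SQLite databases stand in for the real source and target so the example runs on its own. It also uses a prepared INSERT instead of building and escaping the statement by hand, which is a substitution and not necessarily what the tool does.

```php
<?php
// Hypothetical stand-ins for the real source and target connections.
$source = new PDO('sqlite::memory:');
$target = new PDO('sqlite::memory:');
foreach ([$source, $target] as $db) {
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)');
}
$source->exec("INSERT INTO people (name) VALUES ('Ada'), ('Linus'), ('Grace')");

// Select everything, then copy one row at a time.
$select = $source->query('SELECT * FROM people');
$insert = $target->prepare('INSERT INTO people (id, name) VALUES (:id, :name)');

$copied = 0;
while (($row = $select->fetch(PDO::FETCH_ASSOC)) !== false) {
    // Placeholder for the type conversion step (hypothetical).
    $row['name'] = strtoupper($row['name']);

    $insert->execute([':id' => $row['id'], ':name' => $row['name']]);
    $copied++;
}

// Only one row is held in $row at any time; peak memory shows what the
// process as a whole needed, including anything buffered by the driver.
echo "copied $copied rows, peak memory: ", memory_get_peak_usage(true), " bytes\n";
```

Printing `memory_get_peak_usage()` after each table is one way to check whether memory really grows with the size of the result set and not with the PHP-side row handling.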
All this is working nicely with one exception. While fetching the rows, one at a time, from the result set, the PHP process keeps eating up more and more memory. My assumption is that PDO keeps the already processed rows in memory until the whole result set is processed.
I also observed that, when my tool is finished with one table and proceeds to the next, memory consumption drops instantly, which supports my theory.
I'm NOT keeping the data in PHP variables! I hold just one single row at any given moment for processing, so that's not the problem.
Now to the question: Is there a way to force PDO not to keep all the data in memory? I only process one row at a time, so there's absolutely no need to keep all that garbage. I'd really like to use less memory on this thing.
Comments (1)
I believe the problem comes from PHP's garbage collector, which does not collect soon enough.
I would try to fetch the results in chunks of row_count rows, using "SELECT ... LIMIT offset, row_count" in MySQL, or "SELECT * FROM (SELECT ...) WHERE ROW_NUM BETWEEN offset AND (offset + row_count)" in Oracle. Zend_Db_Select can generate such queries in a DB-independent way.
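As a rough illustration of the chunking idea using plain PDO rather than Zend_Db_Select, the sketch below reads a table in blocks of a fixed size. The function name `fetchInChunks`, the `items` table, and the chunk size are assumptions made for the example, and an in-memory SQLite database is used so it runs standalone. The `LIMIT ... OFFSET` form works in MySQL, PostgreSQL, and SQLite; other systems such as Oracle need a different pagination query.

```php
<?php
// Yields the rows of $table one by one, but reads them from the database
// in chunks of $chunkSize, so no single statement holds the whole table.
// $table and $orderBy are assumed to be trusted identifiers (they cannot
// be bound as parameters).
function fetchInChunks(PDO $pdo, string $table, string $orderBy, int $chunkSize): Generator
{
    $stmt = $pdo->prepare(
        "SELECT * FROM $table ORDER BY $orderBy LIMIT :limit OFFSET :offset"
    );
    $offset = 0;
    do {
        $stmt->bindValue(':limit', $chunkSize, PDO::PARAM_INT);
        $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        $stmt->closeCursor(); // release the result set before the next chunk

        foreach ($rows as $row) {
            yield $row;
        }
        $offset += $chunkSize;
    } while (count($rows) === $chunkSize); // a short chunk means we are done
}

// Demo data: 25 rows in a hypothetical "items" table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
$insert = $pdo->prepare('INSERT INTO items (name) VALUES (?)');
for ($i = 1; $i <= 25; $i++) {
    $insert->execute(["item $i"]);
}

$read = 0;
foreach (fetchInChunks($pdo, 'items', 'id', 10) as $row) {
    $read++; // the per-row conversion and INSERT would go here
}
echo "rows read: $read\n";
```

A stable ORDER BY (here the primary key) matters: without it the database is free to return rows in a different order for each chunk, so rows could be skipped or copied twice.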