Saving memory when fetching large result sets with PDO

Posted on 2024-09-08 10:24:34

I have written a tool for database replication in PHP. It's working fine but there's one issue:

I'm using PDO to connect to the different databases to keep it independent of any specific RDBMS, which is crucial to this application.

The tool does some analysis on the tables to decide how to convert certain types and some other stuff. Then it pretty much does a "SELECT * FROM <tablename>" to get the rows that need to be copied. The result sets are fairly large (about 50k rows in some tables).

After that it iterates over the result set in a while loop with PDOStatement::fetch(), does some type conversion and escaping, builds an INSERT statement and feeds that to the target database.
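
For reference, a minimal sketch of the loop described above. It is not the actual tool: the two in-memory SQLite connections, the `items` table and the `strtoupper()` "conversion" are hypothetical stand-ins, and a prepared INSERT is used in place of the hand-built, escaped INSERT string.

```php
<?php
// Hypothetical stand-ins: two in-memory SQLite databases play the roles
// of source and target; the real tool connects to different RDBMSs.
$source = new PDO('sqlite::memory:');
$target = new PDO('sqlite::memory:');
$source->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$target->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table with three sample rows.
$source->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
$target->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
$source->exec("INSERT INTO items (name) VALUES ('a'), ('b'), ('c')");

// Read every row of the source table.
$select = $source->query('SELECT id, name FROM items');

// Prepared INSERT, swapped in for the escaped INSERT string.
$insert = $target->prepare('INSERT INTO items (id, name) VALUES (?, ?)');

$copied = 0;
// Only one row is held in a PHP variable at any given moment.
while (($row = $select->fetch(PDO::FETCH_ASSOC)) !== false) {
    // strtoupper() stands in for the real type conversion step.
    $insert->execute([$row['id'], strtoupper($row['name'])]);
    $copied++;
}
$select->closeCursor();

echo "copied $copied rows\n";
```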

All this is working nicely with one exception. While fetching the rows, one at a time, from the result set, the PHP process keeps eating up more and more memory. My assumption is that PDO keeps the already processed rows in memory until the whole result set is processed.

I also observed that, when my tool is finished with one table and proceeds to the next, memory consumption drops instantly, which supports my theory.

I'm NOT keeping the data in PHP variables! I hold just one single row at any given moment for processing, so that's not the problem.

Now to the question: Is there a way to force PDO not to keep all the data in memory? I only process one row at a time, so there's absolutely no need to keep all that garbage. I'd really like to use less memory on this thing.

Comments (1)

风渺 2024-09-15 10:24:34

I believe the problem comes from PHP's garbage collector, which does not collect soon enough.
I would try fetching the results in chunks of row_count rows, e.g. "SELECT ... LIMIT offset, row_count" in MySQL, or "SELECT * FROM (SELECT ...) WHERE ROW_NUM BETWEEN offset AND (offset + row_count)" in Oracle.
With Zend_Db_Select you can generate DB-independent queries:

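// $db is a Zend_Db adapter; $row_count and $offset define the current chunk.
$select = $db->select()
    ->from(array('t' => 'table_name'),
        array('column_1', 'column_2'))
    ->limit($row_count, $offset);
$sql = $select->__toString(); // keep the rendered SQL instead of discarding it
# on MySQL this renders roughly: SELECT column_1, column_2 FROM table_name AS t LIMIT <row_count> OFFSET <offset>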
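
The same chunking idea can be sketched with plain PDO. This is only an illustration under stated assumptions: `fetchInChunks()` is a hypothetical helper, the `items` table and its `id` column are made up, and the `LIMIT ... OFFSET ...` syntax works on MySQL, PostgreSQL and SQLite but would have to be adapted for other systems such as Oracle. An ORDER BY on a unique column is assumed so that consecutive chunks neither overlap nor skip rows.

```php
<?php
/**
 * Yields the rows of $table one at a time while loading them from the
 * database in chunks of $chunkSize rows (hypothetical helper).
 * $table and $orderBy must be trusted identifiers, because identifiers
 * cannot be bound as statement parameters.
 */
function fetchInChunks(PDO $pdo, string $table, string $orderBy, int $chunkSize): Generator
{
    $stmt = $pdo->prepare(
        "SELECT * FROM $table ORDER BY $orderBy LIMIT :count OFFSET :offset"
    );
    $offset = 0;
    do {
        $stmt->bindValue(':count', $chunkSize, PDO::PARAM_INT);
        $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
        $stmt->execute();
        // At most $chunkSize rows are held in memory at once.
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += $chunkSize;
    } while (count($rows) === $chunkSize); // a short chunk means the end was reached
}

// Usage with a hypothetical in-memory SQLite table of 25 rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');
$insert = $pdo->prepare('INSERT INTO items (name) VALUES (?)');
for ($i = 1; $i <= 25; $i++) {
    $insert->execute(["row $i"]);
}

$ids = [];
foreach (fetchInChunks($pdo, 'items', 'id', 10) as $row) {
    $ids[] = (int) $row['id'];
}
echo count($ids), " rows read in chunks of 10\n";
```

Note that OFFSET-based paging makes the database skip the preceding rows again for every chunk, so it trades some query time for a bounded memory footprint.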