PHP MySQL批量运行查询

发布于 2024-12-17 18:11:17 字数 237 浏览 1 评论 0原文

我是处理大量数据的新手。我想知道批量查询数据库时是否有任何最佳实践,或者是否有人可以提供任何建议。

我有一个查询将提取所有数据,并使用 PHP 将数据写入 XML 文件。数据可能在 10 到 500,000 行之间,因此我编写了脚本以 50 行为一组提取数据,写入文件,然后获取接下来的 50 行,将其附加到文件等。这样可以吗?或者我应该做点别的事情?我可以增加批处理大小还是应该减少它以使脚本运行得更快?

任何建议将不胜感激。

I am new to working with large amounts of data. I am wondering if there are any best practices when querying a database in batches or if anyone can give any advice.

I have a query that will pull out all data and PHP is used to write the data to an XML file. There can be anywhere between 10 and 500,000 rows of data and I have therefore witten the script to pull the data out in batches of 50, write to the file, then get the next 50 rows, append this to the file etc. Is this OK or should I be doing something else? Could I increase the batch size or should I decrease it to make the script run faster?

Any advice would be much appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

亚希 2024-12-24 18:11:17

是的,为了获得巨大的结果,建议使用批处理(性能和内存原因)。

这是批量运行查询的基准和示例代码

Yes, for huge results it is recommended to use batches (performance and memory reasons).

Here is benchmark and example code of running query in batches

静待花开 2024-12-24 18:11:17

做到这一点的最佳方法取决于几个不同的因素。最重要的是创建此 XML 文件的时间和原因。

如果您按需创建 XML 文件,并且用户正在等待该文件,那么您将需要进行一些微调和性能测试。

如果它是定期创建的东西,可能是每晚或每小时的任务,然后在构建后请求 XML 文件(类似于 RSS 提要构建器),那么如果您拥有的东西有效,我建议不要弄乱它。

就性能而言,有很多不同的东西可以提供帮助。在脚本中放入一些简单的计时器,并计算每批的记录数,看看是否存在任何性能差异。

$start = microtime(true);
//process batch
$end = microtime(true);
$runTimeMilliseconds = $end - $start;

如果问题是用户反馈,您可以考虑使用 AJAX 来启动每个批次并向用户报告进度。如果您向用户提供反馈,他们通常会乐意等待比仅仅等待页面整体刷新更长的时间。

另外,检查您的 SQL 查询以确保其中没有隐藏的性能损失。 http://dev.mysql.com/doc/refman/5.0/en /explain.html EXPLAIN 可以向您展示 MySQL 如何处理您的查询。

在极端情况下,我认为可以通过并行处理来实现最佳性能。我还没有在 PHP 中使用过它,但这是主要参考 http: //www.php.net/manual/en/refs.fileprocess.process.php

根据您的托管环境,您可以找到记录总数并将其拆分到子进程中。每个构建自己的 XML 片段。然后你可以组合这些片段。因此进程 1 可以处理记录 0 到 99,进程 2 可以处理 100 到 199 等。

The best way to do this depends on a couple of different things. Most importantly is when and why you are creating this XML file.

If you are creating the XML file on demand, and a user is waiting for the file then you'll need to do some fine tuning and testing for performance.

If it's something that's created on a regular basis, maybe a nightly or hourly task, and then the XML file is requested after it's built (something like an RSS feed builder) then if what you have works I would recommend not messing with it.

As far as performance, there are different things that can help. Put in some simple timers into your scripts and play with the number of records per batch and see if there is any performance differences.

$start = microtime(true);
//process batch
$end = microtime(true);
$runTimeMilliseconds = $end - $start;

If the issue is user feedback, you may consider using AJAX to kick off each batch and report progress to the user. If you give the user feedback, they'll usually be happy to wait longer than if they're just waiting on the page to refresh in whole.

Also, check your SQL query to make sure there's no hidden performance penalties there. http://dev.mysql.com/doc/refman/5.0/en/explain.html EXPLAIN can show you how MySQL goes about processing your queries.

At an extreme, I'd imagine the best performance could be accomplished through parallel processing. I haven't worked with it in PHP, but here's the primary reference http://www.php.net/manual/en/refs.fileprocess.process.php

Depending on your hosting environment you could find the total number of records and split it among sub processes. Each building their own XML fragments. Then you could combine the fragments. So process 1 may handle records 0 to 99, process 2 100 to 199, etc.

旧城烟雨 2024-12-24 18:11:17

解决这个问题的最佳方法是将其安排为 CRON 作业,我认为这是 PHP 中批处理的最佳解决方案。检查此链接以获取更多信息! PHP 中的批处理。希望这有帮助。

The best way to go about this is to schedule it as a CRON job, which i think is the best solution for batch processing in PHP. check this link for more info! Batch Processing in PHP. Hope this helps.

┾廆蒐ゝ 2024-12-24 18:11:17

您会惊讶地发现,一个简单的无限制全选是最快的,
因为它只查询数据库一次,
其他一切都在本地处理

$sql = select all_columns from table;

<?php
// set a very high memory

// query without limit, if can avoid sorting is the best

// iterate mysql result, and set it to an array
// $results[] = $row
// free mysql_result

// write xml for every one thousand
// because building xml is consuming MOST memory
for ($i=0; $i<$len; ++$i)
{
  $arr = $results[$i];
  // do any xml preparation

  // dun forget file-write is expensive too
  if ($i%1000 == 0 && $i > 0)
  {
    // write to file
  }
}
?>

You would be surprised ONE simple select all without limit is the fastest,
because it only query database once,
everything else is processed locally

$sql = select all_columns from table;

<?php
// set a very high memory

// query without limit, if can avoid sorting is the best

// iterate mysql result, and set it to an array
// $results[] = $row
// free mysql_result

// write xml for every one thousand
// because building xml is consuming MOST memory
for ($i=0; $i<$len; ++$i)
{
  $arr = $results[$i];
  // do any xml preparation

  // dun forget file-write is expensive too
  if ($i%1000 == 0 && $i > 0)
  {
    // write to file
  }
}
?>
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文