需要加速这个 php 脚本
目前我的 mysql 表中有超过 400k 条记录。结构如下:
我正在使用的功能:
function cron_hour_counts()
{
$subids = get_subids();
array_push($subids, '');
$from = '2011-10-20';//last_updated_date('tb_hour_counts');
$to = '2011-10-20';//last_date();
$days = days_interval($from, $to);
$result_array = array();
foreach ($subids as $subid)
{
for ($i = 0; $i < $days; $i++)
{
$hour = '00:00';
for ($t = 0; $t <= 23; $t++)
{
if ($t == 0)
{
$chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
$phour = date('H:i', strtotime('23:59'));
$date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
$date_prev = date('Y-m-d', strtotime($date . '- 1 day'));
}
else
{
$chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
$phour = date('H:i', strtotime($chour . '-1 hour'));
$date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
$date_prev = $date;
}
$unique_id_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date_prev) . "') OR (`date` = '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ids`");
$unique_id_result = mysql_fetch_assoc($unique_id_query);
$total_id_query = mysql_query("SELECT COUNT(DISTINCT `id`,`subid`) AS `total_ids` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
$total_id_result = mysql_fetch_assoc($total_id_query);
$unique_ip_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE `date` <= '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ips`");
$unique_ip_result = mysql_fetch_assoc($unique_ip_query);
$total_ip_query = mysql_query("SELECT COUNT(DISTINCT `ip`,`subid`) AS `total_ips` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
$total_ip_result = mysql_fetch_assoc($total_ip_query);
$global_query = mysql_query("SELECT COUNT(`id`) AS `global` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
$global_result = mysql_fetch_assoc($global_query);
$result = array();
$result['date'] = $date;
$result['hour'] = $chour;
$result['subid'] = $subid;
$result['unique_ids'] = $unique_id_result['unique_ids'];
$result['total_ids'] = $total_id_result['total_ids'];
$result['unique_ips'] = $unique_ip_result['unique_ips'];
$result['total_ips'] = $total_ip_result['total_ips'];
$result['global'] = $global_result['global'];
$result_array[] = $result;
}
}
}
//db insert
print_r($result_array);
}
有20个subids,一天需要40分钟执行。有什么加快速度的建议吗?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这是我的解决方案。它的工作速度快了 20 倍。
This is my solution. It works 20 times faster.
优化您的查询。
下面是一个绝对可以优化的查询示例:
您在上面的评论中发布了以下查询作为 unique_ids_query 的示例:
本质上,该查询正在获取 2011 年 10 月 20 日和 2011 年之间不同 id、subid 组合的变化 - 10-19,其中补贴为“1”。为此,您首先计算 2011-10-20 日期以下的所有记录,然后计算 2011-10-19 日期以下的所有记录。该查询中还有三个 SELECT 语句。
除非我弄错了,否则这与计算 2011-10-19 和 2011-10-20 之间的所有记录相同,您可以执行以下操作:
如果可能的话,您还应该开始在 PHP 中使用 MySQLi 或 PDO 进行存储过程,这也可能是性能的助推器。
此外,您应该在单个连接上运行尽可能多的查询,以减少连接延迟(这会增加!)
最后一个潜在的好处是编写 MySQL 函数。可以在不将 COUNT 或 DISTINCT 与 MySQL 函数一起使用的情况下运行上面的查询,这将是超出将其作为函数运行所带来的性能提升的性能提升器。
Optimize your queries.
Here's an example of a query that can DEFINITELY be optimized:
You posted the following query as an example of a unique_ids_query in your comment above:
Essentially the query is getting the change in distinct id, subid combinations between 2011-10-20 and 2011-10-19 where the subid is '1'. You're doing this by first counting ALL of the records below the 2011-10-20 date and then counting ALL of the records below the 2011-10-19 date. You also have THREE SELECT statements in that query.
Unless I'm mistaken thats the same as counting all of the records between 2011-10-19 and 2011-10-20, which you could do with the following:
You should also start using MySQLi or PDO in PHP if possible for store procedures, which could also be a performance booster.
In addition you should run as many queries as possible over a single connection to reduce connection latency (it adds up!)
One final potential boon would be to write MySQL functions. It would be possible to run the query above without using COUNT or DISTINCT with a MySQL function which would be a performance booster beyond the boost given by running it as a function.