需要加速这个 php 脚本

发布于 2024-12-17 19:25:44 字数 4903 浏览 1 评论 0 原文

目前我的 mysql 表中有超过 400k 条记录。结构如下:

我正在使用的功能:

function cron_hour_counts()
{
    $subids = get_subids();
    array_push($subids, '');
    $from = '2011-10-20';//last_updated_date('tb_hour_counts');
    $to = '2011-10-20';//last_date();
    $days = days_interval($from, $to);
    $result_array = array();

    foreach ($subids as $subid)
    {
        for ($i = 0; $i < $days; $i++)
        {
            $hour = '00:00';
            for ($t = 0; $t <= 23; $t++)
            {
                if ($t == 0)
                {
                    $chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
                    $phour = date('H:i', strtotime('23:59'));

                    $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
                    $date_prev = date('Y-m-d', strtotime($date . '- 1 day'));
                }
                else
                {
                    $chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
                    $phour = date('H:i', strtotime($chour . '-1 hour'));

                    $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
                    $date_prev = $date;
                }

                $unique_id_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date_prev) . "') OR (`date` = '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ids`");
                $unique_id_result = mysql_fetch_assoc($unique_id_query);

                $total_id_query = mysql_query("SELECT COUNT(DISTINCT `id`,`subid`) AS `total_ids` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $total_id_result = mysql_fetch_assoc($total_id_query);

                $unique_ip_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE `date` <= '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ips`");
                $unique_ip_result = mysql_fetch_assoc($unique_ip_query);

                $total_ip_query = mysql_query("SELECT COUNT(DISTINCT `ip`,`subid`) AS `total_ips` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $total_ip_result = mysql_fetch_assoc($total_ip_query);

                $global_query = mysql_query("SELECT COUNT(`id`) AS `global` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $global_result = mysql_fetch_assoc($global_query);

                $result = array();
                $result['date'] = $date;
                $result['hour'] = $chour;
                $result['subid'] = $subid;
                $result['unique_ids'] = $unique_id_result['unique_ids'];
                $result['total_ids'] = $total_id_result['total_ids'];
                $result['unique_ips'] = $unique_ip_result['unique_ips'];
                $result['total_ips'] = $total_ip_result['total_ips'];
                $result['global'] = $global_result['global'];

                $result_array[] = $result;
            }
        }
    }
    //db insert
    print_r($result_array);
}

有20个subids,一天需要40分钟执行。有什么加快速度的建议吗?

I'm currently having more than 400k records in my mysql table. The structure is the following:

table structure

The function i am using:

function cron_hour_counts()
{
    $subids = get_subids();
    array_push($subids, '');
    $from = '2011-10-20';//last_updated_date('tb_hour_counts');
    $to = '2011-10-20';//last_date();
    $days = days_interval($from, $to);
    $result_array = array();

    foreach ($subids as $subid)
    {
        for ($i = 0; $i < $days; $i++)
        {
            $hour = '00:00';
            for ($t = 0; $t <= 23; $t++)
            {
                if ($t == 0)
                {
                    $chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
                    $phour = date('H:i', strtotime('23:59'));

                    $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
                    $date_prev = date('Y-m-d', strtotime($date . '- 1 day'));
                }
                else
                {
                    $chour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
                    $phour = date('H:i', strtotime($chour . '-1 hour'));

                    $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
                    $date_prev = $date;
                }

                $unique_id_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `id`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date_prev) . "') OR (`date` = '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ids`");
                $unique_id_result = mysql_fetch_assoc($unique_id_query);

                $total_id_query = mysql_query("SELECT COUNT(DISTINCT `id`,`subid`) AS `total_ids` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $total_id_result = mysql_fetch_assoc($total_id_query);

                $unique_ip_query = mysql_query("SELECT (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE (`date` < '" . mysql_real_escape_string($date) . "') OR (`date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "')" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") - (SELECT COUNT(DISTINCT `ip`,`subid`) FROM `tb_stats` WHERE `date` <= '" . mysql_real_escape_string($date_prev) . "' AND `time` <= '" . mysql_real_escape_string($phour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : "") . ") AS `unique_ips`");
                $unique_ip_result = mysql_fetch_assoc($unique_ip_query);

                $total_ip_query = mysql_query("SELECT COUNT(DISTINCT `ip`,`subid`) AS `total_ips` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $total_ip_result = mysql_fetch_assoc($total_ip_query);

                $global_query = mysql_query("SELECT COUNT(`id`) AS `global` FROM `tb_stats` WHERE `date` = '" . mysql_real_escape_string($date) . "' AND `time` <= '" . mysql_real_escape_string($chour) . "'" . (!empty($subid) && is_numeric($subid) ? " AND `subid` = '" . mysql_real_escape_string($subid) . "'" : ""));
                $global_result = mysql_fetch_assoc($global_query);

                $result = array();
                $result['date'] = $date;
                $result['hour'] = $chour;
                $result['subid'] = $subid;
                $result['unique_ids'] = $unique_id_result['unique_ids'];
                $result['total_ids'] = $total_id_result['total_ids'];
                $result['unique_ips'] = $unique_ip_result['unique_ips'];
                $result['total_ips'] = $total_ip_result['total_ips'];
                $result['global'] = $global_result['global'];

                $result_array[] = $result;
            }
        }
    }
    //db insert
    print_r($result_array);
}

Having 20 subids, and one day period it takes 40 minutes to execute. Any tips on speeding this up?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

电影里的梦 2024-12-24 19:25:44

这是我的解决方案。它的工作速度快了 20 倍。

function cron_hour_counts()
{
    $subids = get_subids();
    //array_push($subids, '');
    $from = '2011-10-20';//last_updated_date('tb_hour_counts');
    $to = '2011-10-20';//last_date();
    $days = days_interval($from, $to);
    $result_array = array();

    for ($i = 0; $i < $days; $i++)
    {
        $hour = '00:00';
        for ($t = 0; $t <= 23; $t++)
        {
            $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
            $currentHour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
            $nextHour = date('H:i', strtotime($currentHour . '+59 minutes'));

            $unique_ids_query = mysql_query("
                SELECT COUNT(id) AS unique_ids,subid
                FROM 
                (
                    SELECT id,subid,date,time
                    FROM tb_stats
                    WHERE date <= '" . mysql_real_escape_string($date) . "'
                    GROUP BY id,subid
                ) AS id_inner
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time BETWEEN '" . mysql_real_escape_string($currentHour) . "' AND '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('unique_ids', $date, $currentHour, $unique_ids_query, $subids, $result_array);            

            $unique_ips_query = mysql_query("
                SELECT COUNT(ip) AS unique_ips,subid
                FROM 
                (
                    SELECT ip,subid,date,time
                    FROM tb_stats
                    WHERE date <= '" . mysql_real_escape_string($date) . "'
                    GROUP BY ip,subid
                ) AS ip_inner
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time BETWEEN '" . mysql_real_escape_string($currentHour) . "' AND '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('unique_ips', $date, $currentHour, $unique_ips_query, $subids, $result_array);

            $total_ids_query = mysql_query("
                SELECT COUNT(DISTINCT id,subid) AS total_ids,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND `time` <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid
                ");
            pull_data('total_ids', $date, $currentHour, $total_ids_query, $subids, $result_array);

            $total_ips_query = mysql_query("
                SELECT COUNT(DISTINCT ip,subid) AS total_ips,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND `time` <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('total_ips', $date, $currentHour, $total_ips_query, $subids, $result_array);

            $global_query = mysql_query("
                SELECT COUNT(id) AS global,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('global', $date, $currentHour, $global_query, $subids, $result_array);
        }
    }

    print_r($result_array);
}

This is my solution. It works 20 times faster.

function cron_hour_counts()
{
    $subids = get_subids();
    //array_push($subids, '');
    $from = '2011-10-20';//last_updated_date('tb_hour_counts');
    $to = '2011-10-20';//last_date();
    $days = days_interval($from, $to);
    $result_array = array();

    for ($i = 0; $i < $days; $i++)
    {
        $hour = '00:00';
        for ($t = 0; $t <= 23; $t++)
        {
            $date = date('Y-m-d', strtotime($from . '+' . $i . 'day'));
            $currentHour = date('H:i', strtotime($hour . '+' . $t . 'hour'));
            $nextHour = date('H:i', strtotime($currentHour . '+59 minutes'));

            $unique_ids_query = mysql_query("
                SELECT COUNT(id) AS unique_ids,subid
                FROM 
                (
                    SELECT id,subid,date,time
                    FROM tb_stats
                    WHERE date <= '" . mysql_real_escape_string($date) . "'
                    GROUP BY id,subid
                ) AS id_inner
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time BETWEEN '" . mysql_real_escape_string($currentHour) . "' AND '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('unique_ids', $date, $currentHour, $unique_ids_query, $subids, $result_array);            

            $unique_ips_query = mysql_query("
                SELECT COUNT(ip) AS unique_ips,subid
                FROM 
                (
                    SELECT ip,subid,date,time
                    FROM tb_stats
                    WHERE date <= '" . mysql_real_escape_string($date) . "'
                    GROUP BY ip,subid
                ) AS ip_inner
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time BETWEEN '" . mysql_real_escape_string($currentHour) . "' AND '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('unique_ips', $date, $currentHour, $unique_ips_query, $subids, $result_array);

            $total_ids_query = mysql_query("
                SELECT COUNT(DISTINCT id,subid) AS total_ids,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND `time` <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid
                ");
            pull_data('total_ids', $date, $currentHour, $total_ids_query, $subids, $result_array);

            $total_ips_query = mysql_query("
                SELECT COUNT(DISTINCT ip,subid) AS total_ips,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND `time` <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('total_ips', $date, $currentHour, $total_ips_query, $subids, $result_array);

            $global_query = mysql_query("
                SELECT COUNT(id) AS global,subid
                FROM tb_stats
                WHERE date = '" . mysql_real_escape_string($date) . "'
                AND time <= '" . mysql_real_escape_string($nextHour) . "'
                GROUP BY subid;
            ");
            pull_data('global', $date, $currentHour, $global_query, $subids, $result_array);
        }
    }

    print_r($result_array);
}
丶情人眼里出诗心の 2024-12-24 19:25:44

优化您的查询。

下面是一个绝对可以优化的查询示例:

您在上面的评论中发布了以下查询作为 unique_ids_query 的示例:

SELECT (SELECT COUNT(DISTINCT id,subid) FROM tb_stats WHERE subid = '1' AND date <= '2011-10-20') - (SELECT COUNT(DISTINCT id,subid) FROM tb_stats WHERE subid = '1' AND date <= '2011-10-19') AS unique_ids;

本质上,该查询正在获取 2011 年 10 月 20 日和 2011 年之间不同 id、subid 组合的变化 - 10-19,其中补贴为“1”。为此,您首先计算 2011-10-20 日期以下的所有记录,然后计算 2011-10-19 日期以下的所有记录。该查询中还有三个 SELECT 语句。

除非我弄错了,否则这与计算 2011-10-19 和 2011-10-20 之间的所有记录相同,您可以执行以下操作:

SELECT COUNT(DISTINCT id,subid) AS unique_ids FROM tb_stats WHERE subid = '1' AND date <= '2011-10-20' AND date >= '2011-10-19';

如果可能的话,您还应该开始在 PHP 中使用 MySQLi 或 PDO 进行存储过程,这也可能是性能的助推器。

此外,您应该在单个连接上运行尽可能多的查询,以减少连接延迟(这会增加!)

最后一个潜在的好处是编写 MySQL 函数。可以在不将 COUNT 或 DISTINCT 与 MySQL 函数一起使用的情况下运行上面的查询,这将是超出将其作为函数运行所带来的性能提升的性能提升器。

Optimize your queries.

Here's an example of a query that can DEFINITELY be optimized:

You posted the following query as an example of a unique_ids_query in your comment above:

SELECT (SELECT COUNT(DISTINCT id,subid) FROM tb_stats WHERE subid = '1' AND date <= '2011-10-20') - (SELECT COUNT(DISTINCT id,subid) FROM tb_stats WHERE subid = '1' AND date <= '2011-10-19') AS unique_ids;

Essentially the query is getting the change in distinct id, subid combinations between 2011-10-20 and 2011-10-19 where the subid is '1'. You're doing this by first counting ALL of the records below the 2011-10-20 date and then counting ALL of the records below the 2011-10-19 date. You also have THREE SELECT statements in that query.

Unless I'm mistaken thats the same as counting all of the records between 2011-10-19 and 2011-10-20, which you could do with the following:

SELECT COUNT(DISTINCT id,subid) AS unique_ids FROM tb_stats WHERE subid = '1' AND date <= '2011-10-20' AND date >= '2011-10-19';

You should also start using MySQLi or PDO in PHP if possible for store procedures, which could also be a performance booster.

In addition you should run as many queries as possible over a single connection to reduce connection latency (it adds up!)

One final potential boon would be to write MySQL functions. It would be possible to run the query above without using COUNT or DISTINCT with a MySQL function which would be a performance booster beyond the boost given by running it as a function.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文