PHP, MySQL, my memory leak
I didn't expect this throw-away script to be leaking, and I haven't figured out what the culprit is. Can you spot anything? Although this is throw-away code, I'm concerned that I'll repeat the mistake in the future. I've never had to manage memory in PHP, but with the number of rows in the db, it's blowing up my PHP instance (I've already upped the memory limit to 1 GB).
The California table is considerably larger than the others (currently 2.2 million rows, and shrinking as I delete duplicate rows). I get a memory error on line 31 ($row = mysql_fetch_assoc($res)):
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 24 bytes) in C:\Documents and Settings\R\My Documents\My Webpages\cdiac\cdiac_dup.php on line 31
PHP 5.3.0, MySQL 5.1.36, part of a WAMP install.
Here's the entire code. The purpose of this script is to delete duplicate entries (the data was acquired into segmented tables, which was far faster at the time, but now I have to merge those tables).
What's causing it? Something I'm overlooking? Or do I just need to watch the memory size and call garbage collection manually when it gets big?
<?php
define('DBSERVER', 'localhost');
define('DBNAME', '---');
define('DBUSERNAME', '---');
define('DBPASSWORD', '---');

$dblink = mysql_connect(DBSERVER, DBUSERNAME, DBPASSWORD);
mysql_select_db(DBNAME, $dblink);

$state = "AL";
//if (isset($_GET['state'])) $state=mysql_real_escape_string($_GET['state']);
if (isset($argv[1])) $state = $argv[1];
echo "Scanning $state\n\n";

// iterate through listing of a state to check for duplicate entries (same station_id, year, month, day)
$DBTABLE = "cdiac_data_" . $state;
$query = "select * from $DBTABLE ";
$query .= " order by station_id, year, month, day ";
$res = mysql_query($query) or die("could not run query '$query': " . mysql_errno() . " " . mysql_error());

$last = "";
$prev_row;
$i = 1;
$counter = 0;
echo ".\n";

while ($row = mysql_fetch_assoc($res)) {
    $current = $row["station_id"] . "_" . $row["year"] . "_" . sprintf("%02d", $row["month"]) . "_" . sprintf("%02d", $row["day"]);
    echo str_repeat(chr(8), 80) . "$i $current ";
    if ($last == $current) {
        //echo implode(', ', $row) . "\n";
        // merge $row and $prev_row
        // data_id station_id, state_abbrev, year, month, day, TMIN, TMIN_flags, TMAX, TMAX_flags, PRCP, PRCP_flags, SNOW, SNOW_flags, SNWD, SNWD_flags
        printf("%-13s %8s %8s\n", "data_id:", $prev_row["data_id"], $row["data_id"]);
        if ($prev_row["data_id"] == $row["data_id"]) echo " + ";

        $set = "";
        if (!$prev_row["TMIN"] && $row["TMIN"]) $set .= "TMIN = " . $row["TMIN"] . ", ";
        if (!$prev_row["TMIN_flags"] && $row["TMIN_flags"]) $set .= "TMIN_flags = '" . $row["TMIN_flags"] . "', ";
        if (!$prev_row["TMAX"] && $row["TMAX"]) $set .= "TMAX = " . $row["TMAX"] . ", ";
        if (!$prev_row["TMAX_flags"] && $row["TMAX_flags"]) $set .= "TMAX_flags = '" . $row["TMAX_flags"] . "', ";
        if (!$prev_row["PRCP"] && $row["PRCP"]) $set .= "PRCP = " . $row["PRCP"] . ", ";
        if (!$prev_row["PRCP_flags"] && $row["PRCP_flags"]) $set .= "PRCP_flags = '" . $row["PRCP_flags"] . "', ";
        if (!$prev_row["SNOW"] && $row["SNOW"]) $set .= "SNOW = " . $row["SNOW"] . ", ";
        if (!$prev_row["SNOW_flags"] && $row["SNOW_flags"]) $set .= "SNOW_flags = '" . $row["SNOW_flags"] . "', ";
        if (!$prev_row["SNWD"] && $row["SNWD"]) $set .= "SNWD = " . $row["SNWD"] . ", ";
        if (!$prev_row["SNWD_flags"] && $row["SNWD_flags"]) $set .= "SNWD_flags = '" . $row["SNWD_flags"] . "', ";

        $delete = "";
        $update = "";
        if ($set = substr_replace($set, "", -2)) $update = "UPDATE $DBTABLE SET $set WHERE data_id=" . $prev_row["data_id"] . " and year=" . $row["year"] . " and month=" . $row["month"] . " and day=" . $row["day"] . ";\n";
        if ($row["data_id"] != $prev_row["data_id"]) $delete = "delete from $DBTABLE where data_id=" . $row["data_id"] . " and year=" . $row["year"] . " and month=" . $row["month"] . " and day=" . $row["day"] . ";\n\n";

        if ($update) {
            $r = mysql_query($update) or die("could not run query '$update' \n" . mysql_error());
        }
        if ($delete) {
            $r = mysql_query($delete) or die("could not run query '$delete' \n" . mysql_error());
        }
        //if ($counter++ > 5) exit(0);
    }
    else {
        $last = $current;
        unset($prev_row);
        //copy $row to $prev_row
        foreach ($row as $key => $val) $prev_row[$key] = $val;
    }
    $i++;
}

echo "\n\nDONE\n";
?>
Comments (3)
I would try two things:
1) Instead of running the UPDATE and DELETE queries inside the loop with mysql_query, save them in a text file to execute later. For example:
file_put_contents('queries.sql', $update, FILE_APPEND);
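Later you can replay the saved statements in a single pass. A minimal sketch, assuming one statement per line in queries.sql and trimming the trailing semicolons your script appends:

// Replay queries.sql, one statement per line (sketch; file name and layout as above).
foreach (file('queries.sql', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $sql) {
    $sql = rtrim($sql, "; ");   // mysql_query() expects a single bare statement
    mysql_query($sql) or die("could not run query '$sql': " . mysql_error());
}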
2) Instead of doing everything inside the while ($row = mysql_fetch_assoc($res)) loop, first grab all the SELECT query results, then close the database connection, freeing all database resources including the query result. Only after this, run the processing loop. If you run out of memory while storing the database results in one array, you can try saving the results to a temporary file instead (one record per line / FILE_APPEND), and then use that file in the loop (reading one line per record with fgets).
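A minimal sketch of that two-phase approach, reusing the $query from the script and serializing each row to one line of a temporary file (rows.tmp and json_encode/json_decode are my choices here, not spelled out above):

// Phase 1: dump the result set to disk, one JSON-encoded row per line.
$res = mysql_query($query) or die(mysql_error());
while ($row = mysql_fetch_assoc($res)) {
    file_put_contents('rows.tmp', json_encode($row) . "\n", FILE_APPEND);
}
mysql_free_result($res);   // release the buffered result before processing

// Phase 2: stream the rows back one line at a time with fgets().
$fh = fopen('rows.tmp', 'r');
while (($line = fgets($fh)) !== false) {
    $row = json_decode($line, true);
    // ... the same duplicate-merging logic as in the original loop ...
}
fclose($fh);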
Work smarter, not harder:
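For example, a grouping query along these lines (a sketch; I group on station_id, year, month, and day, the columns the question uses to define a duplicate):

$dupes = mysql_query(
    "SELECT station_id, year, month, day, COUNT(*) AS cnt
       FROM $DBTABLE
      GROUP BY station_id, year, month, day
     HAVING COUNT(*) > 1"
) or die(mysql_error());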
That'll get you all the station_id/year/month tuples that appear in the table more than once. Assuming that most of your data is not duplicates, that'll save you a lot of memory, since now you just have to go through these tuples and fix up the rows matching them.
I found this when trying to trace down a memory use problem on a script of mine. Having solved the issue for mine I thought it worth adding a reply here for the next person who comes along with the same issue.
I was using mysqli, but much the same applies for mysql.
The problem I found was the queries not freeing their results. The solution was to use mysqli_free_result() after executing the update and delete queries. But more importantly, on the mysqli_query() call for the loop I used the extra parameter MYSQLI_USE_RESULT. There are side effects of this, so use a separate connection for the update and delete queries.
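A minimal sketch of that setup (the connection details, $DBTABLE, and the merging logic are placeholders, not from the original post):

// Two connections: one streams the big SELECT unbuffered, the other runs the writes.
$read  = mysqli_connect('localhost', 'user', 'pass', 'dbname');
$write = mysqli_connect('localhost', 'user', 'pass', 'dbname');

$res = mysqli_query($read, "SELECT * FROM $DBTABLE ORDER BY station_id, year, month, day",
                    MYSQLI_USE_RESULT);   // unbuffered: rows are fetched from the server as you go

while ($row = mysqli_fetch_assoc($res)) {
    // ... duplicate-merging logic; issue UPDATE/DELETE on the second connection ...
    // mysqli_query($write, $update);
}

mysqli_free_result($res);   // free the streamed result when done
mysqli_close($read);
mysqli_close($write);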