Replacing a string in a large file with PHP

Posted on 08-19 05:12

I am trying to do a string replacement across an entire file in PHP. The file is over 100MB, so I have to process it line by line and cannot use file_get_contents(). Is there a good way to do this?

Comments (6)

我不在是我2024-08-26 05:12:07

If you aren't required to use PHP, I would highly recommend doing this kind of thing from the command line. It's by far the best tool for the job, and much easier to use.

In any case, the sed (Stream Editor) command is what you are looking for:

sed 's/search/replace/g' oldfilename > newfilename

If you need case-insensitivity (the i flag is a GNU sed extension):

sed 's/search/replace/gi' oldfilename > newfilename

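If you need to do this dynamically from within PHP, you can use passthru(). It sends sed's output straight through rather than returning it, so capture the exit status via the second argument and quote every argument with escapeshellarg():

$expr = escapeshellarg("s/$search/$replace/g");
passthru("sed $expr " . escapeshellarg($oldfilename) . " > " . escapeshellarg($newfilename), $status);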
碍人泪离人颜2024-08-26 05:12:07

Here you go:

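function replace_file($path, $string, $replace)
{
    set_time_limit(0);

    if (is_file($path) !== true)
    {
        return false;
    }

    // Create the temporary file next to the original so rename() stays on one filesystem
    $temp = tempnam(dirname($path), 'tmp');
    $file = fopen($path, 'r');
    $out  = ($temp !== false) ? fopen($temp, 'w') : false;

    if ($file === false || $out === false)
    {
        return false;
    }

    // fgets() returns false at end of file, which ends the loop
    while (($line = fgets($file)) !== false)
    {
        // Write through one open handle instead of reopening the file for every line
        fwrite($out, str_replace($string, $replace, $line));
    }

    fclose($file);
    fclose($out);

    // rename() overwrites the original, so no unlink() is needed first
    return rename($temp, $path);
}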

Call it like this:

replace_file('/path/to/fruits.txt', 'apples', 'oranges');
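
The same line-by-line idea can also be written against streams that are already open, which keeps the replacement logic separate from the temp-file handling and makes it easy to try on in-memory data. This is only a minimal sketch: the name `replace_in_stream` is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: copy $in to $out line by line, replacing $search with $replace.
// Returns the total number of replacements made.
function replace_in_stream($in, $out, string $search, string $replace): int
{
    $total = 0;
    // fgets() returns false at end of file, which ends the loop.
    while (($line = fgets($in)) !== false) {
        // The fourth argument receives the number of replacements in this line.
        $line = str_replace($search, $replace, $line, $count);
        $total += $count;
        fwrite($out, $line);
    }
    return $total;
}

// Try it on in-memory streams instead of a real 100MB file.
$in = fopen('php://memory', 'w+');
fwrite($in, "apples and pears\nmore apples\n");
rewind($in);

$out = fopen('php://memory', 'w+');
$replaced = replace_in_stream($in, $out, 'apples', 'oranges');

rewind($out);
$result = stream_get_contents($out);
fclose($in);
fclose($out);
```

Only one line is held in memory at a time, so a search string that contains a newline will never match with this approach.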
意犹2024-08-26 05:12:07

If you can't run sed directly from the command line because the task is dynamic and you need to call it from PHP, getting the syntax right is difficult: the following characters must be escaped, in different ways, in the search and replacement strings:

' / $ . * [ ] \ ^ &

The following function searches for and replaces a string in a file without interpreting the search string as a regular expression. So, if you wanted, you could search for the string ".*" and replace it with "$".

/**
 * str_replace_with_sed($search, $replace, $file_in, $file_out=null)
 * 
 * Search for the fixed string `$search` inside the file `$file_in`
 * and replace it with `$replace`. The replace occurs in-place unless
 * `$file_out` is defined: in that case the resulting file is written
 * into `$file_out`
 *
 * Return: sed return status (0 means success, any other integer failure)
 */
function str_replace_with_sed($search, $replace, $file_in, $file_out=null)
{
    $cmd_opts = '';
    if (! $file_out) 
    {
        // replace inline in $file_in
        $cmd_opts .= ' -i';
    }

    // We will use Basic Regular Expressions (BRE). This means that in the 
    // search pattern we must escape
    // $.*[\]^
    //
    // The replacement string must have these characters escaped
    // \ & 
    //
    // In both cases we must escape the separator character too ( usually / )
    // 
    // Since we run the command through the shell, we must escape the string
    // too (yay!). We're delimiting the string with single quotes (') and we'll
    // escape them with '\'' (close string, write a single quote, reopen string)

    // Replace all the backslashes as first thing. If we do it in the following
    // batch replace we would end up with bogus results
    $search_pattern = str_replace('\\', '\\\\', $search);

    $search_pattern = str_replace(array('$', '.', '*', '[', ']', '^'),
                                  array('\\$', '\\.', '\\*', '\\[', '\\]', '\\^'),
                                  $search_pattern);

    $replace_string = str_replace(array('\\', '&'),
                                  array('\\\\', '\\&'),
                                  $replace);

    // escapeshellarg() quotes the file names so spaces and quotes in paths are safe
    $output_suffix = $file_out ? ' > ' . escapeshellarg($file_out) : '';
    $cmd = sprintf("sed%s -e 's/%s/%s/g' %s",
                    $cmd_opts,
                    str_replace('/','\\/', // escape the regexp separator
                      str_replace("'", "'\''", $search_pattern) // sh string escape
                    ),
                    str_replace('/','\\/', // escape the regexp separator
                      str_replace("'", "'\''", $replace_string) // sh string escape
                    ),
                    escapeshellarg($file_in)
                  ) . $output_suffix;

    passthru($cmd, $status);

    return $status;
}
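
The two escaping rules above can also be isolated into small helpers, which makes them easy to check without running sed at all. This is a minimal sketch: the names `backslash_escape`, `sed_escape_pattern` and `sed_escape_replacement` are hypothetical, and `/` is assumed to be the separator. It uses strtr() with an array, which makes a single pass over the input, so the backslashes it adds are never escaped a second time.

```php
<?php
// Hypothetical helper: put a backslash in front of every character listed in $chars.
function backslash_escape(string $s, array $chars): string
{
    $map = [];
    foreach ($chars as $c) {
        $map[$c] = '\\' . $c;
    }
    // strtr() with an array never rescans text it has already replaced.
    return strtr($s, $map);
}

// Escape a literal string for the search side of s/// (BRE, "/" as separator).
function sed_escape_pattern(string $s): string
{
    return backslash_escape($s, ['\\', '$', '.', '*', '[', ']', '^', '/']);
}

// Escape a literal string for the replacement side of s///.
function sed_escape_replacement(string $s): string
{
    return backslash_escape($s, ['\\', '&', '/']);
}

// Search for the literal ".*" and replace it with a literal "$".
$expr = 's/' . sed_escape_pattern('.*') . '/' . sed_escape_replacement('$') . '/g';
// The whole expression still needs shell quoting before it is passed to sed.
$quoted = escapeshellarg($expr);
```

Newlines inside the search or replacement string are not handled by this sketch.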
口干舌燥2024-08-26 05:12:07

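I would have used sed more explicitly, with -e and a quoted expression, so you are less dependent on your system. passthru() does not return the command's output, so the exit status is captured in $status instead.

passthru("sed -e " . escapeshellarg("s/$search/$replace/g") . " " . escapeshellarg($oldfilename) . " > " . escapeshellarg($newfilename), $status);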
野の2024-08-26 05:12:07

Read the file a small chunk at a time: process the chunk, discard it, then read the next one.

$fh = fopen("bigfile.txt", "r");
$length = 300; // bytes per read, not lines

while (!feof($fh))
{
     $contents = fread($fh, $length);
     // .. do stuff ...
     // fread() already advances the file pointer, so no fseek() is needed
}

fclose($fh);

You will want to verify this (I haven't tested it); see the filesystem functions in the PHP documentation. Because fread() reads a fixed number of bytes, a search string can be split across two chunks.

The tricky part is writing back to the file. The first idea that comes to mind is to do the string replacement, write the new content to another file, and then, at the end, delete the old file and replace it with the new one.
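
One way to make fixed-size chunks safe is to hold back the last few bytes of each chunk until the next one arrives, so a search string that straddles a chunk boundary is still found. The following is only a sketch under that assumption: the function name `replace_in_chunks` and its parameters are hypothetical, and it works on two open streams rather than on file names.

```php
<?php
// Hypothetical helper: copy $in to $out in fixed-size chunks, replacing $search
// with $replace, including matches that straddle a chunk boundary.
function replace_in_chunks($in, $out, string $search, string $replace, int $length = 8192): void
{
    if ($search === '') {
        throw new InvalidArgumentException('The search string must not be empty.');
    }

    $size   = strlen($search);
    $keep   = $size - 1; // longest partial match that can dangle at the end of a chunk
    $buffer = '';

    while (!feof($in)) {
        $chunk = fread($in, $length);
        if ($chunk === false) {
            break;
        }
        $buffer .= $chunk;

        // Write everything up to and including each complete match.
        $pos = 0;
        while (($hit = strpos($buffer, $search, $pos)) !== false) {
            fwrite($out, substr($buffer, $pos, $hit - $pos) . $replace);
            $pos = $hit + $size;
        }

        // The rest holds no complete match. Hold back its last $keep bytes,
        // since a match could start there and finish in the next chunk.
        $tail = substr($buffer, $pos);
        $safe = max(0, strlen($tail) - $keep);
        fwrite($out, substr($tail, 0, $safe));
        $buffer = substr($tail, $safe);
    }

    // Whatever was held back at the end of the file cannot contain a match.
    fwrite($out, $buffer);
}

// Demo with a deliberately tiny chunk size so that matches cross chunk boundaries.
$in = fopen('php://memory', 'w+');
fwrite($in, 'xxapplesxx apples');
rewind($in);

$out = fopen('php://memory', 'w+');
replace_in_chunks($in, $out, 'apples', 'oranges', 3);

rewind($out);
$result = stream_get_contents($out);
fclose($in);
fclose($out);
```

The held-back bytes are always unreplaced original text, so replacement output is never scanned again and the result matches what str_replace() would give on the whole file.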

哀由2024-08-26 05:12:07

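Something like this?

$infile = "file";
$outfile = "temp";
$f = fopen($infile, "r");
$o = fopen($outfile, "w"); // "w" truncates, so a stale temp file is never appended to
$pattern = "pattern";
$replace = "replace";
if ($f && $o) {
     // Read whole lines; a length limit could split a match across two reads
     while (($line = fgets($f)) !== false) {
        // strpos() takes the haystack first; lines without a match are copied as-is
        if (strpos($line, $pattern) !== false) {
            $line = str_replace($pattern, $replace, $line);
        }
        fwrite($o, $line);
     }
     fclose($f);
     fclose($o);
     rename($outfile, $infile);
}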