PHP simplest-case regex replacement, but backreferences not working

Posted 2024-10-21 12:25:27 · 2025 characters · 2 views · 0 comments

I'm hacking together what I thought was the second-simplest kind of regex task in PHP (extract a matching substring from some strings and use it), but regex grouping seems to be tripping me up.

Objective

  1. Take an ls listing of files and output the commands to format/copy the files so they have the correct naming format.
  2. Resize copies of the files to create thumbnails. (Not even dealing with that step yet.)

Failure

My code fails at the regex step. Although I just want to filter out everything except a single regex group, the result always contains the group that I want and the group before it, even though I in no way requested the first backreference group.

Here is a fully functioning, runnable version of the code on the online ide:
http://ideone.com/2RiqN

And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):

<?php

// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;

if($file_data){
    $files = preg_split("/[\s,]+/", $file_data);
    // Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
    $string = $file;
    $pattern = '#(\w)(\d+)_A\.jpg$#i';
    // Use the second regex group for the results.
    $replacement = '$2';
    // This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
    $new_file_part = preg_replace($pattern, $replacement, $string);
    // Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
    // Save the rename results for further processing later.
    $rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
    // Rename the images into a standard format.
    echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
    // Echo out some commands for later.
    echo "<br>"; 
    $i++;
    if($i>10){break;} // Just deal with the first 10 for now.
}
?>

Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;

What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.

Comments (2)

ぶ宁プ宁ぶ 2024-10-28 12:25:27

Just a guess:

 $pattern = '#^.*?(\w)(\d+)_A\.jpg$#i';

This includes the whole filename in the match. Otherwise preg_replace() only substitutes the end of each string: it applies the $replacement expression only to the part that was actually matched, and leaves everything before it untouched.
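
To illustrate the point above, here is a minimal sketch (not the original poster's code) that compares the question's pattern with the anchored one on two names from the question's list. The third pattern, `$digitsOnly`, and all variable names are assumptions introduced for illustration; it uses preg_match() to read the digits directly instead of replacing around them.

```php
<?php
// Sample names taken from the question's list.
$files = array('Air-Adams-Cap-Toe-Oxford-C09167_A.jpg', '07184_A.jpg');

$unanchored = '#(\w)(\d+)_A\.jpg$#i';      // pattern from the question
$anchored   = '#^.*?(\w)(\d+)_A\.jpg$#i';  // pattern from the answer
$digitsOnly = '#(\d+)_A\.jpg$#i';          // assumption: hypothetical simpler pattern

$results = array();
foreach ($files as $file) {
    $row = array();
    // Only the matched tail is replaced, so the unmatched prefix survives.
    $row['unanchored'] = preg_replace($unanchored, '$2', $file);
    // The match now spans the whole name, so only group 2 remains.
    $row['anchored'] = preg_replace($anchored, '$2', $file);
    // preg_match() returns 1 on a match and fills $m with the captured groups.
    $row['preg_match'] = preg_match($digitsOnly, $file, $m) ? $m[1] : null;
    $results[$file] = $row;
    echo $file, ' => ', implode(' | ', $row), "\n";
}
```

For the first name, the unanchored replace leaves `Air-Adams-Cap-Toe-Oxford-09167`, while the anchored one yields `09167`. One caveat worth checking against the real data: for `07184_A.jpg`, both `(\w)(\d+)` patterns give `7184`, because `(\w)` consumes the leading `0`; the preg_match() variant keeps `07184`.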

┾廆蒐ゝ 2024-10-28 12:25:27

Scan Dir and Explode

You know what? A simpler way to do it in PHP is to combine scandir() and explode().

$dir = scandir('/path/to/directory');
foreach ($dir as $file) {
    // Skip anything that is not a .jpg file.
    $ext = pathinfo($file, PATHINFO_EXTENSION);
    if ($ext != 'jpg') continue;

    $a = explode('-', $file); // grab the end of the string after the last -
    $newfilename = end($a);   // if there is no dash, just take the whole string

    $newlocation = './ch/ch-' . str_replace(array('C', '_A'), '', basename($newfilename, '.jpg')) . 'fs.jpg';
    echo "@copy($file, $newlocation)\n";
}
// and you are done :)

explode: basically, a filename like blah-2.jpg is turned into array('blah', '2.jpg'), and calling end() on that returns the last element. It's almost the same as array_pop(), except that end() leaves the array intact while array_pop() removes the element.
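
As a small sketch of that difference (variable names are illustrative assumptions, and the sample names come from the question's list), the following shows that end() only reads the last element, array_pop() also removes it, and a name without a dash comes back whole:

```php
<?php
// Sample name from the question's list.
$parts = explode('-', 'Adrian-Chelsea-C08752_A.jpg');
// $parts is now array('Adrian', 'Chelsea', 'C08752_A.jpg')

$last = end($parts);            // reads the last element; the array keeps 3 items
$countAfterEnd = count($parts);

$popped = array_pop($parts);    // removes and returns the last element
$countAfterPop = count($parts);

// A name with no dash yields a one-element array, so end() returns the whole name.
$noDash = explode('-', 'C05820_A.jpg');
$wholeName = end($noDash);

echo "$last $countAfterEnd $popped $countAfterPop $wholeName\n";
```

Because the answer's loop never reuses `$a` after taking the last piece, either function would do there; end() is simply the non-destructive choice.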

Working Example

Here's my ideone code: http://ideone.com/gLSxA
