I'm hacking up what I thought was the second-simplest type of regex (extract a matching string from some strings and use it) in PHP, but regex grouping seems to be tripping me up.
Objective
- Take an ls listing of files and output the commands that copy the files into the correct naming format.
- Resize copies of the files to create thumbnails. (Not even dealing with that step yet.)
Failure
My code fails at the regex step. I just want to filter out everything except a single regex group, but the result always contains the group that I want -and- the group before it, even though I in no way requested the first capture group.
Here is a fully functioning, runnable version of the code on an online IDE:
http://ideone.com/2RiqN
And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):
<?php
// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;
if($file_data){
    $files = preg_split("/[\s,]+/", $file_data);
    // Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
    $string = $file;
    $pattern = '#(\w)(\d+)_A\.jpg$#i';
    // Use the second regex group for the results.
    $replacement = '$2';
    // This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
    $new_file_part = preg_replace($pattern, $replacement, $string);
    // Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
    // Save the rename results for further processing later.
    $rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
    // Rename the images into a standard format.
    echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
    // Echo out some commands for later.
    echo "<br>";
    $i++;
    if($i>10){break;} // Just deal with the first 10 for now.
}
?>
Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;
What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.
Comments (2)
Just a guess: extend the pattern so that the match includes the whole filename. Otherwise preg_replace() will only substitute the end of each string, because it applies the $replacement expression only to the part that was actually matched. Everything in front of the match is left in the result untouched.
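The code that originally accompanied this guess is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's exact pattern. The helper names extract_item_number() and extract_with_replace() are hypothetical. The first uses preg_match() to read the captured digits directly; the second keeps preg_replace() but adds a leading `^.*?` so the whole filename is part of the match and gets overwritten.

```php
<?php
// Hypothetical helper (not from the original post): returns the digits
// that directly precede "_A.jpg", or null when the name does not fit.
function extract_item_number(string $file): ?string
{
    // Only the digits are captured; a leading letter such as "C" is
    // simply not part of the capture group.
    if (preg_match('#(\d+)_A\.jpg$#i', $file, $m)) {
        return $m[1];
    }
    return null;
}

// Variant that keeps preg_replace(): the leading "^.*?" pulls the whole
// filename into the match, so the replacement overwrites all of it.
function extract_with_replace(string $file): string
{
    return preg_replace('#^.*?(\d+)_A\.jpg$#i', '$1', $file);
}
```

With preg_match() there is nothing left over to worry about, since the captured text is read from the match array instead of being spliced back into the original string.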
Scan Dir and Explode
You know what? A simpler way to do it in PHP is to combine scandir() and explode().
explode(): basically, a filename like blah-2.jpg is turned into array('blah', '2.jpg'), and then taking the end() of that gets the last element. It's almost the same as array_pop(), except that end() leaves the array intact.
Working Example
Here's my ideone code: http://ideone.com/gLSxA
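The linked code is not reproduced here, so this is a rough sketch of the explode()/end() idea applied to the filenames from the question, under the assumption that the item number is always the last dash-separated piece and ends in "_A.jpg". The helper name item_number_from_name() is made up for illustration. In a real run, scandir() would supply the filenames.

```php
<?php
// Hypothetical helper: take the last dash-separated piece of a filename,
// then strip the "_A.jpg" suffix and any leading letters such as "C".
function item_number_from_name(string $file): string
{
    $parts = explode('-', $file);          // [..., 'C08752_A.jpg']
    $last  = end($parts);                  // end() needs a variable, not the explode() call itself
    $code  = basename($last, '_A.jpg');    // 'C08752' (suffix match is case-sensitive)
    return ltrim($code, 'A..Za..z');       // '08752'
}
```

Storing the explode() result in $parts first matters because end() takes its argument by reference; passing the function call directly triggers a notice.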