I think I need some sort of Schwartzian Transform to get this working, but I'm having trouble figuring it out, as Perl isn't my strongest language.
I have a directory with contents as such:
album1.htm
album2.htm
album3.htm
....
album99.htm
album100.htm
I'm trying to get the album with the highest number from this directory (in this case, album100.htm). Note that timestamps on the files are not a reliable means of determining things, as people are adding old "missing" albums after the fact.
The previous developer simply used the code snippet below, but this clearly breaks down once there are more than 9 albums in a directory.
opendir(DIR, PATH) || print $!;
@files = readdir(DIR);
foreach $file ( sort(@files) ) {
    if ( $file =~ /album/ ) {
        $last_file = $file;
    }
}
Comments (6)
If you just need to find the album with the highest number, you don't really need to sort the list; just run through it and keep track of the maximum.
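A minimal sketch of that single pass, assuming the names follow the album<number>.htm pattern from the question. The @files list here is a hypothetical stand-in for what readdir would return.

```perl
use strict;
use warnings;

# Hypothetical sample listing; in practice this would come from readdir.
my @files = ('.', '..', 'album1.htm', 'album100.htm', 'album99.htm', 'notes.txt');

my ($last_file, $max) = (undef, -1);
for my $file (@files) {
    # Only consider names of the form album<digits>.htm
    next unless $file =~ /^album(\d+)\.htm$/;
    if ($1 > $max) {
        ($max, $last_file) = ($1, $file);
    }
}

print "$last_file\n";   # album100.htm
```

This does one comparison per file, so it avoids the cost of sorting when only the maximum is needed.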
To find the highest number, try a custom sort...
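One way such a custom sort might look is sketched below. The helper name num and the sample @files list are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical directory listing, as readdir might return it.
my @files = ('.', '..', 'album2.htm', 'album100.htm', 'album1.htm', 'album99.htm');

# Pull the number out of a name such as "album42.htm".
sub num { my ($n) = $_[0] =~ /(\d+)/; return $n }

# Keep only album files, sort highest number first, take the first one.
my ($last_file) = sort { num($b) <=> num($a) }
                  grep { /^album\d+\.htm$/ } @files;

print "$last_file\n";   # album100.htm
```

Sorting in descending order and assigning in list context picks off the winner without a separate indexing step.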
Also, take a look at the File::Next module. It will let you pick out just the files that begin with the word "album". I find it a little easier than readdir.
The reason you're encountering difficulties is the comparison operator: <=> is the numeric comparison, while cmp is the default and does string comparison. With a slight modification we get something much closer to correct:
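A small sketch of the difference between the two operators, using plain numbers rather than file names (the sample values are arbitrary):

```perl
use strict;
use warnings;

my @numbers = (1, 2, 10, 9, 100);

my @as_strings = sort @numbers;                 # default: same as { $a cmp $b }
my @as_numbers = sort { $a <=> $b } @numbers;   # numeric comparison

print "@as_strings\n";   # 1 10 100 2 9
print "@as_numbers\n";   # 1 2 9 10 100
```

The string sort places 10 and 100 before 2 because it compares character by character, which is the same effect that puts album100.htm before album9.htm.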
However, in your case you need to remove the non-digits.
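A sketch of stripping the non-digits inside the comparator, assuming the only digits in each name are the album number (an extension such as .mp3 would break this assumption):

```perl
use strict;
use warnings;

my @files = ('album1.htm', 'album10.htm', 'album2.htm', 'album100.htm');

my @sorted = sort {
    # Copy each name and delete everything that is not a digit.
    (my $x = $a) =~ s/\D+//g;
    (my $y = $b) =~ s/\D+//g;
    $x <=> $y;
} @files;

print "$sorted[-1]\n";   # album100.htm
```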
Here it is, a bit prettier:
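A tidier arrangement might move the extraction into a helper and give the comparator a name. Both album_number and by_album_number are hypothetical names chosen for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: the number embedded in an album file name,
# or 0 when the name does not look like album<digits>.htm.
sub album_number {
    my ($name) = @_;
    return $name =~ /^album(\d+)\.htm$/ ? $1 : 0;
}

# Named comparator; sort supplies the two items in $a and $b.
sub by_album_number { album_number($a) <=> album_number($b) }

my @files  = ('album10.htm', 'album9.htm', 'album100.htm', 'album1.htm');
my @sorted = sort by_album_number @files;

print "$sorted[-1]\n";   # album100.htm
```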
This isn't flawless, but it should give you a good idea of your issue with sort.
Oh, and as a follow-up, the Schwartzian Transform solves a different problem: it stops you from having to run a complex task (unlike the one you need here, a regex) multiple times in the sort algorithm. It comes at the memory cost of caching the results (not unexpected). Essentially, you map each input of the problem to its output, typically in an array, [$input, $output]; then you sort on the outputs, $a->[1] <=> $b->[1]. With your stuff now sorted, you map back over it to get your original inputs, $_->[0]. It is cool because it is so compact while being so efficient.
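To make the caching concrete, the sketch below counts how often the key function runs with and without the transform. The counter $calls and the helper album_number are assumptions for illustration, and the regex stands in for a genuinely expensive computation.

```perl
use strict;
use warnings;

my @files = map { "album$_.htm" } (3, 100, 1, 42, 7, 99, 15, 8);

my $calls = 0;
sub album_number {          # stand-in for an expensive computation
    my ($name) = @_;
    $calls++;
    my ($n) = $name =~ /(\d+)/;
    return $n;
}

# Naive: the key is recomputed on every comparison.
my @naive = sort { album_number($a) <=> album_number($b) } @files;
my $naive_calls = $calls;

# Schwartzian Transform: compute each key once, sort on it, unwrap.
$calls = 0;
my @sorted =
    map  { $_->[0] }                     # 3. recover the original input
    sort { $a->[1] <=> $b->[1] }         # 2. sort on the cached output
    map  { [ $_, album_number($_) ] }    # 1. pair input with output
    @files;
my $st_calls = $calls;

print "naive: $naive_calls calls, transform: $st_calls calls\n";
```

The transform calls the key function exactly once per element, while the naive sort calls it twice per comparison; the exact naive count depends on the input order.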
Here you go, using Schwartzian Transform:
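A sketch of how the transform could be applied to the album listing, assuming the album<number>.htm naming from the question. Here the first map both filters and extracts the key, returning an empty list for names that do not match; @files is a hypothetical stand-in for readdir output.

```perl
use strict;
use warnings;

# Hypothetical directory listing, as readdir might return it.
my @files = ('.', '..', 'album1.htm', 'album100.htm', 'index.htm', 'album99.htm');

my ($last_file) =
    map  { $_->[0] }                                # unwrap the file name
    sort { $b->[1] <=> $a->[1] }                    # highest number first
    map  { /^album(\d+)\.htm$/ ? [ $_, $1 ] : () }  # keep albums, cache number
    @files;

print "$last_file\n";   # album100.htm
```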
Here's an alternative solution using reduce:
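One possible shape for this, using reduce from the core List::Util module; the helper album_number and the sample list are assumptions for the sketch.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Hypothetical directory listing, as readdir might return it.
my @files = ('.', '..', 'album7.htm', 'album100.htm', 'album42.htm', 'readme.txt');

# Number embedded in a name such as "album42.htm".
sub album_number { my ($n) = $_[0] =~ /^album(\d+)\.htm$/; return $n }

# Fold the list pairwise, keeping whichever name has the larger number.
my $last_file = reduce { album_number($a) > album_number($b) ? $a : $b }
                grep   { /^album\d+\.htm$/ } @files;

print "$last_file\n";   # album100.htm
```

reduce returns undef for an empty list, so a caller would want to check for that when the directory may contain no albums.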
Here's a generic solution:
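A generic version might take the directory and a pattern as parameters. The function name highest_numbered and its interface are hypothetical, chosen for this sketch; the pattern is assumed to capture the number in its first group.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name in $dir with the largest number
# captured by $pattern (first capture group), or undef if none match.
sub highest_numbered {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my ($best, $max);
    while (defined(my $file = readdir $dh)) {
        next unless $file =~ $pattern;
        ($best, $max) = ($file, $1) if !defined $max || $1 > $max;
    }
    closedir $dh;
    return $best;
}

# Usage: look for album<number>.htm in the current directory.
my $last = highest_numbered('.', qr/^album(\d+)\.htm$/);
print defined $last ? "$last\n" : "no albums found\n";
```

Using a lexical directory handle and checking the result of opendir avoids the silent failure mode of the snippet in the question.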