How do I match files by file size and rename them accordingly?

Posted on 2024-11-29 18:55:03

I have two directories of images with mismatching names, but mostly matching images.

Dir 1       Size   | Dir 2                  Size
---------------------------------------------------
img1.jpg    508960 | a_image_name.jpg       1038644
img2.jpg    811430 | another_image_name.jpg 396240
...         ...    | ...                    ...
img1000.jpg 602583 | image_name.jpg         811430
...         ...    | 
img2000.jpg 396240 | 

The first directory has more images, but they are misnamed. The second directory has the correct names, but its files are not in the same order as those in the first directory.

I'd like to rename files in Dir 1 by comparing file size (or some other way) to Dir 2. In the above example img2.jpg would be renamed to image_name.jpg because both have the same file size.

Can you point me in the right direction?

Preferably by way of app (Mac), shell, or php.

Comments (2)

雪落纷纷 2024-12-06 18:55:03

Maybe it would be wiser to use hashes of the files instead of using the filesize?

In short: use glob() to get a list of the files in dir1, iterate over it, create an MD5 hash of each file (md5() + file_get_contents()), and store the results in an array with the hash as key and the filename as value.
Do the same for dir2.

Then iterate over the first array and, if an entry with the same hash exists in the second array, rename the file.

Code will be something like this: (untested, unoptimized)

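// Maps of content hash => file path, one per directory.
$dir1 = array();
$dir2 = array();

// get hashes for dir1 (the misnamed files)
foreach( glob( '/path/to/dir1/*.jpg' ) as $file ) {
 $hash = md5( file_get_contents( $file ) );
 $dir1[ $hash ] = $file;
}

// get hashes for dir2 (the correctly named files)
foreach( glob( '/path/to/dir2/*.jpg' ) as $file ) {
 $hash = md5( file_get_contents( $file ) );
 $dir2[ $hash ] = $file;
}

// Give each dir1 file the name of its dir2 match, keeping it in dir1.
foreach( $dir1 as $hash => $file1 ) {
 if( array_key_exists( $hash, $dir2 ) ) {
  $target = dirname( $file1 ) . '/' . basename( $dir2[ $hash ] );
  // Never overwrite an existing file.
  if( ! file_exists( $target ) ) {
   rename( $file1, $target );
  }
 }
}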
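
Since the question also allows a shell solution, the same hash-matching idea can be sketched in POSIX sh. This is only an illustration under some assumptions: rename_by_checksum is a made-up helper name, and cksum (CRC plus byte count) stands in for MD5 because it is available on both macOS and Linux, although it is a weaker check than MD5. File names containing tabs or newlines are not handled.

```shell
#!/bin/sh
# rename_by_checksum SRC REF [apply]
# Hypothetical helper: for each file in SRC, look for a file in REF with the
# same cksum output (CRC plus byte count) and give the SRC file that name.
# Without "apply" it only prints what it would do.
rename_by_checksum() (
    src=$1
    ref=$2
    mode=${3:-plan}
    index=$(mktemp) || exit 1

    # Index REF once: "<crc> <size><TAB><file name>" per line.
    for g in "$ref"/*; do
        [ -f "$g" ] || continue
        printf '%s\t%s\n' "$(cksum < "$g")" "$(basename "$g")" >> "$index"
    done

    for f in "$src"/*; do
        [ -f "$f" ] || continue
        key=$(cksum < "$f")
        # First REF name recorded for this checksum, if any.
        target=$(awk -F '\t' -v k="$key" '$1 == k { print $2; exit }' "$index")
        [ -n "$target" ] || continue
        # Skip files that are already named correctly; never overwrite.
        [ -e "$src/$target" ] && continue
        if [ "$mode" = apply ]; then
            mv -- "$f" "$src/$target"
        else
            printf '%s -> %s\n' "$f" "$src/$target"
        fi
    done

    rm -f "$index"
)
```

Running rename_by_checksum /path/to/dir1 /path/to/dir2 lists the planned renames; adding apply as a third argument performs them. Files in dir1 without a match in dir2 are left untouched.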

不回头走下去 2024-12-06 18:55:03

Here is my solution, which renames files in dir1 based on file size.

Contents of dir1:

-rw-r--r--  1 haiv  staff   10 Aug 16 13:18 file1.txt
-rw-r--r--  1 haiv  staff   20 Aug 16 13:18 file2.txt
-rw-r--r--  1 haiv  staff   30 Aug 16 13:18 file3.txt
-rw-r--r--  1 haiv  staff  205 Aug 16 13:18 file4.txt

(Note the fifth column stores the file sizes.) And the contents of dir2:

-rw-r--r--  1 haiv  staff   30 Aug 16 13:18 doc.txt
-rw-r--r--  1 haiv  staff  205 Aug 16 13:18 dopey.txt
-rw-r--r--  1 haiv  staff   20 Aug 16 13:18 grumpy.txt
-rw-r--r--  1 haiv  staff   10 Aug 16 13:18 happy.txt

Create a file called ~/rename.awk (yes, in the home directory, to avoid polluting either dir1 or dir2):

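/^total/ {next} # Skip the "total" line that ls -l prints first

# First input (dir2): remember the correct name for each file size.
NR == FNR {
    name[$5] = $NF
    print "# File of size", $5, "should be named", $NF
    next
}

# Second input (dir1): print a rename command when the size is known.
# $NF is the last field, so names containing whitespace are not handled.
($5 in name) && $NF != name[$5] {
    printf "mv '%s' '%s'\n", $NF, name[$5]
}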

Now, cd into dir1 (if you want to rename files in dir1), and issue the following command:

$ awk -f ~/rename.awk <(ls -l ../dir2) <(ls -l)

Output:

# File of size 30 should be named doc.txt
# File of size 205 should be named dopey.txt
# File of size 20 should be named grumpy.txt
# File of size 10 should be named happy.txt
mv 'file1.txt' 'happy.txt'
mv 'file2.txt' 'grumpy.txt'
mv 'file3.txt' 'doc.txt'
mv 'file4.txt' 'dopey.txt'

Once you are happy with the result, pipe the output of the above command to sh to execute the changes:

$ awk -f ~/rename.awk <(ls -l ../dir2) <(ls -l) | sh

Notes:

  1. There is no safeguard against several files having the same size. For that, the MD5 solution that wonk0 offered works better.
  2. Please examine the output before you commit. Changes are permanent.
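
To see whether the first note applies before renaming anything, a small check like the following could list the file sizes that occur more than once in a directory. It is a sketch only: duplicate_sizes is a hypothetical helper name, sizes are read with wc -c so the same code runs on macOS and Linux, and file names containing tabs or newlines are not handled.

```shell
#!/bin/sh
# duplicate_sizes DIR
# Hypothetical helper: print every byte size shared by two or more regular
# files in DIR, followed by the names of those files.
duplicate_sizes() (
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # $(( )) strips the padding that some wc implementations add.
        printf '%s\t%s\n' "$(( $(wc -c < "$f") ))" "$(basename "$f")"
    done | awk -F '\t' '
        { count[$1]++; names[$1] = names[$1] " " $2 }
        END { for (s in count) if (count[s] > 1) print s ":" names[s] }
    '
)
```

If duplicate_sizes ../dir2 and duplicate_sizes . both print nothing, every size is unique and matching by size alone is unambiguous for those directories.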