How can I programmatically check whether an image (PNG, JPEG, or GIF) is corrupted?

Posted on 2024-08-04 11:38:00

Okay. So I have about 250,000 high-resolution images. What I want to do is go through all of them and find the ones that are corrupted. If you know what 4scrape is, then you know the nature of the images.

To me, a corrupted image is one that, when loaded into Firefox, produces this message:

The image “such and such image” cannot be displayed, because it contains errors.

Now, I could select all of my 250,000 images (~150 GB) and drag and drop them into Firefox. That would be bad, though, because I don't think Mozilla designed Firefox to open 250,000 tabs. No, I need a way to programmatically check whether an image is corrupted.

Does anyone know a PHP or Python library which can do something along these lines? Or an existing piece of software for Windows?

I have already removed obviously corrupted images (such as ones that are 0 bytes) but I'm about 99.9% sure that there are more diseased images floating around in my throng of a collection.

Comments (5)

走过海棠暮 2024-08-11 11:38:00

An easy way would be to try loading and verifying the files with PIL (Python Imaging Library).

from PIL import Image

v_image = Image.open(file)
v_image.verify()

Then catch any exception that verify() raises; an exception means the file is probably broken.

From the documentation:

im.verify()

Attempts to determine if the file is broken, without actually decoding the image data. If this method finds any problems, it raises suitable exceptions. This method only works on a newly opened image; if the image has already been loaded, the result is undefined. Also, if you need to load the image after using this method, you must reopen the image file.
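To apply this to a whole collection, the verify() call can be wrapped in a directory walk. The following is only a minimal sketch, assuming Pillow (the maintained fork of PIL) is installed; the names pil_error and find_broken are made up for illustration, and the checker is passed in as a parameter so a different check can be swapped in.

```python
import os
from typing import Callable, Iterator, Optional, Tuple


def pil_error(path: str) -> Optional[str]:
    """Return an error message if Pillow cannot verify the file, else None."""
    # Assumption: Pillow is installed. Imported lazily so that find_broken()
    # can also be used with another checker.
    from PIL import Image

    try:
        # verify() only works on a freshly opened image, so the file is
        # opened anew on every call.
        with Image.open(path) as img:
            img.verify()
    except Exception as exc:  # Pillow raises several different exception types
        return f"{type(exc).__name__}: {exc}"
    return None


def find_broken(
    root: str, check: Callable[[str], Optional[str]] = pil_error
) -> Iterator[Tuple[str, str]]:
    """Walk `root` and yield (path, error) for every file that fails `check`."""
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            error = check(path)
            if error is not None:
                yield path, error
```

Something like `for path, error in find_broken("images"): print(path, error)` would then list the suspect files, where "images" stands in for the real folder. Because verify() does not decode the pixel data, some damaged files may still pass this check.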

薄荷梦 2024-08-11 11:38:00

I suggest you check out ImageMagick for this: http://www.imagemagick.org/

It includes a tool called identify, which you can either call from a script and read its standard output, or replace with one of the programming interfaces ImageMagick provides.

夕色琉璃 2024-08-11 11:38:00

In PHP, with exif_imagetype():

if (exif_imagetype($filename) === false)
{
    unlink($filename); // image is corrupted
}

EDIT: Or you can try to fully load the image with ImageCreateFromString():

if (ImageCreateFromString(file_get_contents($filename)) === false)
{
    unlink($filename); // image is corrupted
}

From the PHP documentation for imagecreatefromstring():

An image resource will be returned on success. FALSE is returned if the image type is unsupported, the data is not in a recognized format, or the image is corrupt and cannot be loaded.

莫言歌 2024-08-11 11:38:00

If your exact requirement is that the image displays correctly in Firefox, you may have a difficult time: the only way to be sure would be to link against the exact same image-loading source code that Firefox uses.

Basic image corruption (the file is incomplete) can be detected simply by trying to open the file with any number of image libraries.

However, many images can fail to display simply because they stretch a part of the file format that the particular viewer you are using can't handle. GIF in particular has a lot of these edge cases, but you can also find JPEGs and the rare PNG file that can only be displayed in specific viewers. There are also some ugly JPEG edge cases where the file appears to be uncorrupted in viewer X, but in reality the file has been cut short and only displays correctly because very little information has been lost. Firefox can show some cut-off JPEGs correctly (you get a grey bottom), but with others Firefox seems to load them halfway and then displays the error message instead of the partial image.
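The "file has been cut short" case can often be spotted without a decoder by looking for each format's end marker. This is a rough heuristic sketch using only the standard library, not a full validity check: it assumes a complete PNG ends with its IEND chunk, a JPEG with the EOI marker FF D9, and a GIF with the trailer byte 3B. Files that carry extra bytes after the end marker will be flagged even though many viewers display them, and a file that passes can still be damaged in the middle. The function name looks_truncated is made up for illustration.

```python
import os
from typing import Optional

# (accepted leading signatures, expected trailing bytes) per format
FORMATS = [
    ((b"\x89PNG\r\n\x1a\n",), b"\x00\x00\x00\x00IEND\xae\x42\x60\x82"),  # PNG
    ((b"\xff\xd8",), b"\xff\xd9"),                                      # JPEG
    ((b"GIF87a", b"GIF89a"), b"\x3b"),                                  # GIF
]


def looks_truncated(path: str) -> Optional[bool]:
    """Return True if the end marker is missing, False if it is present,
    or None when the file is not recognized as PNG, JPEG, or GIF."""
    with open(path, "rb") as f:
        head = f.read(8)                 # longest signature is 8 bytes
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - 12))        # longest end marker is 12 bytes
        tail = f.read()
    for signatures, end_marker in FORMATS:
        if head.startswith(signatures):
            return not tail.endswith(end_marker)
    return None
```

Only the first and last few bytes of each file are read, so a pass like this over a large collection is cheap and can be used to pre-filter files before a slower full decode.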

白云悠悠 2024-08-11 11:38:00

You could use ImageMagick if it is available.

If you want to check a whole folder:

identify "./myfolder/*" >log.txt 2>&1

If you just want to check a single file:

identify myfile.jpg
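The same check can be driven from a script. This is a hedged sketch, assuming an `identify` executable is on the PATH and that it reports unreadable files with a non-zero exit status and a message on standard error; files that only produce warnings may still exit with 0. The function name identify_error and its `command` parameter are made up for illustration. On installations that only ship the `magick` launcher, passing `command=("magick", "identify")` would be the equivalent.

```python
import subprocess
from typing import Optional, Sequence


def identify_error(path: str, command: Sequence[str] = ("identify",)) -> Optional[str]:
    """Run `command path` and return its error output if it fails, else None.

    Raises FileNotFoundError if the executable itself cannot be found.
    """
    result = subprocess.run([*command, path], capture_output=True, text=True)
    if result.returncode != 0:
        # Prefer the tool's own message; fall back to the exit status.
        return result.stderr.strip() or f"exit status {result.returncode}"
    return None
```

Starting one process per file is slow for 250,000 images, so for a collection that size the single folder-wide call with its output redirected to a log is probably the more practical route.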