MediaWiki API: size of embedded images / removing irrelevant icons
I use the MediaWiki API to find the images in Wikipedia articles. However, I also get all the useless icons, like the broom shown when an article needs to be cleaned up, or the Creative Commons logo that marks content placed under a Creative Commons license.
Is there a way to detect which images are such icons so I can drop them? E.g. is there a way to query the size at which an image was embedded (rather than the size of the original image, which might be huge even for icons), so that I can drop all the small ones? I'm not really interested in very small images anyway.
Comments (1)
As far as I know, no. That information is simply not stored in the database, and is therefore also not available via the API.
Some things you could perhaps do include:
- Load the HTML markup of the article (via the API with action=parse, or simply via index.php with action=render) and extract the image sizes from it.
- Simply build a list of images that should be excluded. You could do this programmatically (e.g. find all images used on all templates included in Category:Wikipedia maintenance templates and all its subcategories) or just add any unwanted images to the exclusion list as you come across them.
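The two suggestions above could be combined roughly as follows. This is only a sketch under some assumptions: the helper names (ImgCollector, embedded_images, fetch_rendered_html), the 50-pixel threshold, and the User-Agent string are made up for illustration, and it assumes the rendered HTML carries width and height attributes on its img tags, which is worth checking against real output.

```python
import json
import urllib.parse
import urllib.request
from html.parser import HTMLParser

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint (English Wikipedia)


class ImgCollector(HTMLParser):
    """Collect (src, width, height) for every <img> tag that declares a size."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        try:
            width, height = int(a["width"]), int(a["height"])
        except (KeyError, TypeError, ValueError):
            return  # skip images without usable width/height attributes
        self.images.append((a.get("src", ""), width, height))


def embedded_images(html, min_side=50, excluded=()):
    """Return images embedded at min_side pixels or more in both dimensions.

    `excluded` is a hand-maintained list of file-name fragments to drop
    regardless of size (the exclusion-list idea from the answer).
    """
    parser = ImgCollector()
    parser.feed(html)
    return [
        (src, w, h)
        for src, w, h in parser.images
        if w >= min_side and h >= min_side
        and not any(name in src for name in excluded)
    ]


def fetch_rendered_html(title):
    """Fetch an article's rendered HTML through action=parse."""
    query = urllib.parse.urlencode({
        "action": "parse", "page": title, "prop": "text",
        "format": "json", "formatversion": "2",
    })
    request = urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"User-Agent": "icon-filter-sketch/0.1"},  # hypothetical UA string
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["parse"]["text"]
```

Usage would look something like embedded_images(fetch_rendered_html("Some article"), min_side=50, excluded=("Broom_icon",)). The threshold is arbitrary, so it probably needs tuning against a few real articles.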