How can I programmatically download images from Wikimedia Commons without registering a bot account?

Posted on 2024-08-05 12:18:50

It seems like the only way to get approval for a bot account is if it adds to or edits information already on Wikimedia. If you try to download any images without a bot account, using some of the API libraries out there, you get error messages instead of the images. It seems like they block anyone not coming in from a browser? Anyone else have any experience with this? Am I missing something here?

Comments (5)

夏末 2024-08-12 12:18:50

Having just done this myself, I feel I should share:

http://www.mediawiki.org/wiki/API:Allimages

This API document does state that you can query the images:

http://en.wikipedia.org/w/api.php?action=query&list=allimages&aiprop=url&format=xml&ailimit=10&aifrom=Albert

With aiprop=url, you are given the URL of the image you are looking for.
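
A minimal way to script that same query, assuming Python's requests library (the User-Agent string is my own placeholder; only the endpoint and parameters come from the URL above):

```python
# Minimal sketch of the allimages query above, assuming the third-party
# `requests` library. The User-Agent string is an illustrative placeholder.
import requests

API = "https://en.wikipedia.org/w/api.php"

params = {
    "action": "query",
    "list": "allimages",
    "aiprop": "url",
    "ailimit": 10,
    "aifrom": "Albert",
    "format": "json",  # same query as above, JSON instead of XML
}
headers = {"User-Agent": "example-image-fetcher/0.1 (contact: you@example.com)"}

resp = requests.get(API, params=params, headers=headers, timeout=30)
resp.raise_for_status()
for image in resp.json()["query"]["allimages"]:
    print(image["name"], image["url"])
```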

生活了然无味 2024-08-12 12:18:50

Try explaining exactly what you want to do. What have you tried? What error messages did you get?
You're not very clear...

What libraries have you tried? If you're not aggressive, there are no restrictions on downloading WM content.
Some User-Agents are banned from editing to avoid stupid spamming, but really, I've never heard of any downloading restrictions.

If you are trying to scrape a massive amount of images by downloading them through Commons, you're doing it wrong (tm). If you are trying to get a few images, anywhere from 10 to 200, you should be able to write a decent tool in a few lines of code, provided that you respect the throttling requirement: when the API tells you to slow down and you don't, sysadmins are likely to kick you out.
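
A rough sketch of what "a decent tool in a few lines of code" might look like, assuming Python's requests library; the User-Agent string, retry policy, and pause length are illustrative choices, not rules:

```python
# Sketch of a polite downloader: identify yourself with a User-Agent and
# back off when the server tells you to slow down.
import time
import requests

HEADERS = {"User-Agent": "example-image-fetcher/0.1 (contact: you@example.com)"}

def fetch(url, max_retries=5):
    """GET a URL, backing off when the server asks us to slow down."""
    for attempt in range(max_retries):
        resp = requests.get(url, headers=HEADERS, timeout=60)
        if resp.status_code == 429:  # throttled: honour Retry-After if present
            time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return resp.content
    raise RuntimeError(f"gave up on {url} after {max_retries} attempts")

# Hypothetical usage: `urls` would come from an API query such as the
# allimages example in the answer above.
urls = []  # e.g. the `url` fields returned with aiprop=url
for url in urls:
    with open(url.rsplit("/", 1)[-1], "wb") as f:
        f.write(fetch(url))
    time.sleep(1)  # roughly one request per second as a self-throttle
```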

If you need a complete image dump (we're talking about a few TB), try asking on wikitech-l. We had torrents available when there were fewer images; now it's more complicated, but still doable.

About bot accounts: how deeply have you looked into the system? You need a bot account for fast, unsupervised edits. Bot privileges also unlock a few conveniences, such as increased query sizes. But remember: a bot account is simply an augmented user account. Have you tried running anything with a classic account?

横笛休吹塞上声 2024-08-12 12:18:50

Note that there used to be an issue with using LWP: it's not ideological, it's practical; agents can create massive load on already stretched servers. There are sensible strategies that agent users can follow to reduce the load; ask on www.mediawiki.org or en:Village pump (technical).
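
One such documented strategy is MediaWiki's maxlag parameter, which lets the API refuse requests while the servers are lagging so the client can wait and retry. A sketch, assuming Python's requests library (the five-second defaults are illustrative):

```python
# Sketch: pass maxlag so the API itself can tell the client to back off
# when the servers are lagging, then retry after the suggested delay.
import time
import requests

API = "https://commons.wikimedia.org/w/api.php"
HEADERS = {"User-Agent": "example-image-fetcher/0.1 (contact: you@example.com)"}

def polite_query(params, maxlag=5):
    """Run an API query, sleeping and retrying whenever maxlag trips."""
    params = {**params, "maxlag": maxlag, "format": "json"}
    while True:
        resp = requests.get(API, params=params, headers=HEADERS, timeout=30)
        data = resp.json()
        if data.get("error", {}).get("code") == "maxlag":
            time.sleep(int(resp.headers.get("Retry-After", 5)))
            continue
        return data
```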

溺ぐ爱和你が 2024-08-12 12:18:50

If you need between ten and one million files, using Magnus Manske's tools to recurse categories is a good choice. http://tools.wmflabs.org/magnustools/can_i_haz_files.html produces a list of UNIX commands which you can then just run locally.

An alternative, whose interface is in German only but easy enough, is https://tools.wmflabs.org/wikilovesdownloads/
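
For illustration, the category-recursion approach such tools rely on can be sketched against the standard API with list=categorymembers (this is not Manske's tool itself; the category name, requests usage, and User-Agent below are just examples):

```python
# Sketch of the category-recursion idea: walk a Commons category and all of
# its subcategories, yielding every file title found along the way.
import requests

API = "https://commons.wikimedia.org/w/api.php"
HEADERS = {"User-Agent": "example-image-fetcher/0.1 (contact: you@example.com)"}

def files_in_category(category, seen=None):
    """Yield file titles in `category` and, recursively, its subcategories."""
    seen = set() if seen is None else seen
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": "file|subcat",
        "cmlimit": "max",
        "format": "json",
    }
    while True:
        data = requests.get(API, params=params, headers=HEADERS, timeout=30).json()
        for member in data["query"]["categorymembers"]:
            title = member["title"]
            if title.startswith("Category:"):
                if title not in seen:  # avoid loops in the category graph
                    seen.add(title)
                    yield from files_in_category(title, seen)
            else:
                yield title
        if "continue" not in data:
            break
        params.update(data["continue"])  # follow API pagination

for title in files_in_category("Category:Albert Einstein"):
    print(title)
```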

软的没边 2024-08-12 12:18:50

Didn't really find the answer I'm looking for, but this page is interesting: http://www.makeuseof.com/tag/4-free-tools-for-taking-wikipedia-offline/

Especially #4, but it seems the page is down. Project dead?
