Inexplicable urllib2 problem between virtualenvs
I have some test code (as a part of a webapp) that uses urllib2 to perform an operation I would usually perform via a browser:
- Log in to a remote website
- Move to another page
- Perform a POST by filling in a form
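For reference, a minimal sketch of what such a three-step flow might look like. This is not the actual test code: the function names, URLs, and form fields are all hypothetical placeholders, and the `try`/`except` import is only there so the same sketch also runs on Python 3.

```python
try:  # Python 2, as used in this question
    import urllib2
    import cookielib
    from urllib import urlencode
except ImportError:  # Python 3 equivalents of the same modules
    import urllib.request as urllib2
    import http.cookiejar as cookielib
    from urllib.parse import urlencode


def build_session_opener():
    """Return an opener that keeps cookies across requests, like a browser."""
    jar = cookielib.CookieJar()
    return urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))


def encode_form(fields):
    """URL-encode form fields into a byte string usable as a POST body."""
    return urlencode(fields).encode("ascii")


def run_flow(opener, login_url, credentials, page_url, form_url, form_fields):
    """Hypothetical helper: log in, load the second page, submit the form."""
    opener.open(login_url, encode_form(credentials)).read()  # 1. log in
    opener.open(page_url).read()                             # 2. move to another page
    return opener.open(form_url, encode_form(form_fields))   # 3. POST the form
```

Passing a data argument to `opener.open()` is what turns the request into a POST; the shared cookie jar is what carries the login session through to the later steps.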
I've created 4 separate, clean virtualenvs (with --no-site-packages) on 3 different machines, all with different versions of Python but the exact same packages (via a pip requirements file). The code only works on the two virtualenvs on my local development machine (2.6.1 and 2.7.2) - it won't work on either of my production VPSs.
In the failing cases, I can log in successfully and move to the correct page, but when I submit the form, the remote server replies telling me that there has been an error - it's an application server error page ('we couldn't complete your request') and not a webserver error.
- because I can successfully log in and maneuver to a second page, this doesn't seem to be a session or a cookie problem - it's particular to the final POST
- because I can perform the operation on a particular machine with the EXACT same headers and data, this doesn't seem to be a problem with what I am requesting/posting
- because I am trying the code on two separate VPS rented from different companies, this doesn't seem to be a problem with the VPS physical environment
- because the code works on 2 different Python versions, I can't imagine it being an incompatibility problem
I'm completely lost at this stage as to why this wouldn't work. I've even 'turned it off and on again' because I just can't see what the problem could be.
I think it has to be something to do with the final POST coming from a VPS that the remote server doesn't like, but I can't figure out what that could be. I feel like there is something going on under the hood of urllib2 that is causing the remote server to dislike the request.
EDIT
I've installed the exact same Python version (2.6.1) on the VPS as is on my working local copy and it still doesn't work remotely, so it must be something to do with the request originating from a VPS. How could this affect the HTTP request? Is it something lower level?
Comments (4)
You might try setting debuglevel=1 for urllib2 and see what it comes up with.
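A minimal sketch of turning that on, assuming you build your own opener. The `build_debug_opener` name is hypothetical, and the import fallback is only there so the snippet also runs on Python 3.

```python
try:  # Python 2, as used in the question
    import urllib2
except ImportError:  # Python 3 keeps the same classes in urllib.request
    import urllib.request as urllib2


def build_debug_opener():
    """Return an opener whose HTTP and HTTPS handlers echo traffic to stdout."""
    return urllib2.build_opener(
        urllib2.HTTPHandler(debuglevel=1),   # plain http:// requests
        urllib2.HTTPSHandler(debuglevel=1),  # https:// requests
    )
```

Requests made through this opener print the outgoing request and the reply headers, so you can run the same flow on the working machine and on a VPS and compare the two outputs. If you already use a cookie-handling opener, pass the cookie processor to `build_opener` alongside these handlers.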
This is a total shot in the dark, but are your VPSs 64-bit and your home computer 32-bit, or vice versa? Maybe a difference in default sizes or accuracies of something could be freaking out the server.
Barring that, can you try to find out any information on the software stack the web server is using?
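If you want to rule the 32/64-bit question in or out quickly, here is a small sketch you could run inside each virtualenv. The `interpreter_bits` helper is a hypothetical name; it reports the pointer size of the Python build, which is not necessarily the same as the OS or CPU word size.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # Print one line per environment so the outputs are easy to compare.
    print("%d-bit Python %s on %s" % (
        interpreter_bits(), sys.version.split()[0], platform.machine()))
```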
I had similar issues with urllib2 (working with Zimbra's REST API) and in the end switched to pycurl with success.
PS
For login/navigate/post operations, I usually find Mechanize useful and easier to use. Maybe you can give it a shot.
Well, it looks like I know how to make the problem go away, but I'm not 100% sure of the reason for it.
I simply had to make my server wait (time.sleep()) after it sent the 2nd request (Move to another page) before doing the 3rd request (Perform a POST by filling in a form).
I don't know whether it's because of a condition on the 3rd party server or some sort of odd issue with urllib2. The reason it seemed to work on my development machine is presumably that it was slower than the server at running the code.
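Roughly, the workaround looks like the sketch below. The helper and its two callables are hypothetical stand-ins for the real urllib2 calls, and the 2-second default is an arbitrary assumption, not a value tuned against the remote site.

```python
import time


def submit_after_pause(fetch_form_page, post_form, pause_seconds=2.0):
    """Load the form page, wait, then submit the form.

    fetch_form_page: callable performing the 2nd request, returns the page.
    post_form: callable performing the 3rd request, receives that page.
    """
    page = fetch_form_page()
    time.sleep(pause_seconds)  # give the remote application time to settle
    return post_form(page)
```

Taking the two requests as callables keeps the pause in one place, so it is easy to shorten or remove it later if the delay turns out not to be the real cause.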