Python Mechanize: session has expired
I'm trying out mechanize to scrape some content from an HTTPS ASP site. The login form submission appears to work, since I get a 200 back. But when I then open another URL, presumably with the cookie captured after login, I get redirected back to the login page with an error saying my session has expired. The final print is only there so I can see that I was redirected.
import mechanize

USER_AGENT = "Mozilla/5.0 (X11; U; Linux i686; tr-TR; rv:1.8.1.9) Gecko/20071102 Pardus/2007 Firefox/2.0.0.9"

mech = mechanize.Browser()
mech.addheaders = [("User-agent", USER_AGENT)]

# Load the login page and fill in the form named "loginform"
mech.open("https://www.example.com/login.asp")
mech.select_form("loginform")
mech['id'] = "blah"
mech['pin'] = "blah"
response = mech.submit()

# Request a protected page with the same browser object (and its cookies)
trueContent = mech.open("https://www.example.com/content")
# Shows the final URL, which reveals the redirect back to the login page
print(trueContent.geturl())
Comments (1)
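Your code looks good to me; however, I don't see any check that the login was successful. A 200 status alone doesn't prove it, because a failed login usually returns the login page again with a 200.
Look at the content of response to make sure your login was successful.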
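One way to do that check, as a minimal sketch. The helper name is hypothetical, and the "login.asp" path and the "session has expired" marker are assumptions taken from the question, not anything the site is known to return. It works on plain strings, so it runs without mechanize:

```python
def login_looks_successful(final_url, body,
                           login_path="login.asp",
                           failure_markers=("session has expired",)):
    """Heuristic: True if the response does not look like the login page."""
    # Ending up on the login page again is the clearest sign of failure.
    if login_path.lower() in final_url.lower():
        return False
    # Otherwise look for known error text in the page body (assumed markers).
    text = body.lower()
    return not any(marker.lower() in text for marker in failure_markers)


# Made-up values, only to show the behaviour:
print(login_looks_successful("https://www.example.com/login.asp?err=1", ""))      # False
print(login_looks_successful("https://www.example.com/home.asp", "<p>Hello</p>"))  # True
```

With the code from the question, the arguments would be response.geturl() and response.read() (decoded to text if it comes back as bytes), taken right after mech.submit().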
Are you sure this site works without JavaScript? There could be hidden fields that are set by JavaScript.
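To see whether the login form carries hidden fields, you could dump them from the page source. This is only a sketch, assuming Python 3 and using the standard library's html.parser rather than mechanize; the class and function names and the sample form are made up for illustration:

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        # The type attribute may be missing entirely, hence the "or".
        if (attributes.get("type") or "").lower() == "hidden":
            self.hidden[attributes.get("name")] = attributes.get("value")


def hidden_fields(html):
    """Return a dict of hidden input names to their values in the HTML."""
    collector = HiddenInputCollector()
    collector.feed(html)
    return collector.hidden


# Hypothetical login form; an empty hidden value is a hint that a script fills it in.
sample = ('<form name="loginform">'
          '<input type="hidden" name="token" value="">'
          '<input name="id"><input name="pin">'
          '</form>')
print(hidden_fields(sample))
```

Feed it the body of the login page (the result of mech.open(...).read(), decoded to text) and compare the values with what a real browser submits. If a field turns out to be filled in by a script, mechanize will not run that script, so the value would have to be set by hand before submitting; hidden controls are read-only in mechanize by default, and as far as I know mech.form.set_all_readonly(False) lifts that.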