Mechanize form submission raises AssertionError when trying to .read() the response

Posted on 2024-11-18 07:24:30

I am writing a web crawler in Python and am unable to log in using mechanize. The form on the site looks like this:

   <form method="post" action="PATLogon">
   <h2 align="center"><img src="/myaladin/images/aladin_logo_rd.gif"></h2>
   <!-- ALADIN Request parameters -->
  <input type=hidden name=req value="db">
  <input type=hidden name=key value="PROXYAUTH">
  <input type=hidden name=url value="http://eebo.chadwyck.com/search">
  <input type=hidden name=lib value="8">    
<table>
<tr><td><b>Last Name:</b></td>
    <td><input name=LN size=20 maxlength=26></td>
<tr><td><b>University ID or Library Barcode:</b></td>
    <td><input type=password name=BC size=20 maxlength=21></td>
<tr><td><b>Institution:</b></td>
    <td><select name="INST">
        <option value="??">Select University ----</option>
        <option value="AU">American</option>
        <option value="CU">Catholic</option>
        <option value="DC">District of Columbia</option>
        <option value="GA">Gallaudet</option>
        <option value="GM">George Mason</option>
        <option value="GW">George Washington</option>
        <option value="GT">Georgetown</option>
        <option value="MU">Marymount</option>
        <option value="TR">Trinity</option>
        </select>
              
        <input type="submit" value="GO">
    </td></tr></table></form>

I am able to select the form and set every field, but when I submit it and try to print the response I get an error. My code is as follows:

 import mechanize
 import time

 br = mechanize.Browser()
 br.set_handle_robots(False)  # do not fetch or obey robots.txt

 def connect():
     # connection information
     url = "https://www.aladin.wrlc.org/Z-WEB/Aladin?req=db&key=PROXYAUTH&lib=8&url=http://eebo.chadwyck.com/search"
     br.open(url)
     time.sleep(0.5)
     br.select_form(nr=0)   # the login form is the first form on the page
     br["LN"] = "Reese"
     br["BC"] = "myPassword"
     br["INST"] = ["AU"]    # select controls take a list of option values
     response = br.submit()
     print response.read()  # this is the call that fails in the traceback below

The error I get here is:

 >>> eebolib.connect()
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "eebolib.py", line 28, in connect
     print response.read()
   File "build/bdist.macosx-10.5-fat3/egg/mechanize/_response.py", line 190, in read
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 349, in read
     data = self._sock.recv(rbufsize)
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 553, in read
     if self.length is not None:
   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1282, in read
     if amt is None or amt > self._line_left:
 AssertionError

If anyone can provide some assistance on this I would be most appreciative.

長街聽風 2024-11-25 07:24:31

This is the solution that I found:

import urllib

import mechanize

# Keep cookies across requests so the session set by the login is retained.
cookies = mechanize.CookieJar()
opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
# Browser-like request headers.
headers = [("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
           ("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7"),
           ("Accept-Encoding", "gzip, deflate"),
           ("Accept-Language", "en-us,en;q=0.5"),
           ("Connection", "keep-alive"),
           ("Host", "www.aladin.wrlc.org"),
           ("Referer", "https://www.aladin.wrlc.org/Z-WEB/Aladin?req=db&key=PROXYAUTH&lib=8&url=http://eebo.chadwyck.com/search"),
           ("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20100101 Firefox/5.0")]
opener.addheaders = headers
mechanize.install_opener(opener)
# Visible fields plus the form's hidden fields; replace the my* placeholders.
params = urllib.urlencode({'LN': 'myLN', 'BC': 'myBC', 'INST': 'myINST',
                           'req': 'db', 'key': 'PROXYAUTH', 'lib': '8',
                           'url': 'http://eebo.chadwyck.com/search'})
# Passing data makes this a POST to the form's action URL.
mechanize.urlopen("https://www.aladin.wrlc.org/Z-WEB/PATLogon", params)

Hope this helps someone someday :)
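
For anyone wondering where the POST URL and the extra parameters in the snippet above come from: they can be derived from the form in the question. The following is only a rough sketch, written for Python 3 with the standard library alone (not mechanize, and not the Python 2 used above), and it makes no network request. `FormFieldParser`, `build_login_request`, `FORM_HTML` and `PAGE_URL` are names made up for this illustration, and the form markup is a trimmed copy of the one in the question.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Trimmed copy of the login form from the question (illustration only).
FORM_HTML = """
<form method="post" action="PATLogon">
<input type=hidden name=req value="db">
<input type=hidden name=key value="PROXYAUTH">
<input type=hidden name=url value="http://eebo.chadwyck.com/search">
<input type=hidden name=lib value="8">
<input name=LN size=20 maxlength=26>
<input type=password name=BC size=20 maxlength=21>
</form>
"""

# URL of the page that serves the form, as opened in the question.
PAGE_URL = ("https://www.aladin.wrlc.org/Z-WEB/Aladin"
            "?req=db&key=PROXYAUTH&lib=8&url=http://eebo.chadwyck.com/search")


class FormFieldParser(HTMLParser):
    """Hypothetical helper: records the form action and hidden inputs."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("type") == "hidden":
            self.hidden[attrs["name"]] = attrs.get("value", "")


def build_login_request(page_url, form_html, last_name, barcode, institution):
    """Return (post_url, urlencoded_body) for the login form."""
    parser = FormFieldParser()
    parser.feed(form_html)
    # Start from the hidden fields, then add the values a user would type.
    fields = dict(parser.hidden)
    fields.update({"LN": last_name, "BC": barcode, "INST": institution})
    # A relative action is resolved against the page URL, as a browser does.
    post_url = urljoin(page_url, parser.action)
    return post_url, urlencode(fields)


post_url, body = build_login_request(PAGE_URL, FORM_HTML, "myLN", "myBC", "AU")
print(post_url)
print(body)
```

The relative action `PATLogon` resolves against the `/Z-WEB/Aladin` page to the `/Z-WEB/PATLogon` address used above, and the hidden `req`, `key`, `url` and `lib` fields are the extra parameters sent alongside `LN`, `BC` and `INST`.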
