Mechanize form submission causes AssertionError when .read() is called on the response
I am writing a web crawler in Python and am unable to log in using mechanize. The form on the site looks like:
<form method="post" action="PATLogon">
<h2 align="center"><img src="/myaladin/images/aladin_logo_rd.gif"></h2>
<!-- ALADIN Request parameters -->
<input type=hidden name=req value="db">
<input type=hidden name=key value="PROXYAUTH">
<input type=hidden name=url value="http://eebo.chadwyck.com/search">
<input type=hidden name=lib value="8">
<table>
<tr><td><b>Last Name:</b></td>
<td><input name=LN size=20 maxlength=26></td>
<tr><td><b>University ID or Library Barcode:</b></td>
<td><input type=password name=BC size=20 maxlength=21></td>
<tr><td><b>Institution:</b></td>
<td><select name="INST">
<option value="??">Select University ----</option>
<option value="AU">American</option>
<option value="CU">Catholic</option>
<option value="DC">District of Columbia</option>
<option value="GA">Gallaudet</option>
<option value="GM">George Mason</option>
<option value="GW">George Washington</option>
<option value="GT">Georgetown</option>
<option value="MU">Marymount</option>
<option value="TR">Trinity</option>
</select>
<input type="submit" value="GO">
</td></tr></table></form>
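For reference, the fields this form would post can be listed without mechanize. The following is a minimal standard-library sketch in Python 3 (the question's code is Python 2). `FormFieldCollector`, `build_payload`, and the abbreviated `FORM_HTML` are hypothetical names introduced for illustration, and the sketch only shows the name/value pairs a submit would carry; it does not contact the site.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Abbreviated copy of the login form shown above (only three options kept).
FORM_HTML = """
<form method="post" action="PATLogon">
<input type=hidden name=req value="db">
<input type=hidden name=key value="PROXYAUTH">
<input type=hidden name=url value="http://eebo.chadwyck.com/search">
<input type=hidden name=lib value="8">
<input name=LN size=20 maxlength=26>
<input type=password name=BC size=20 maxlength=21>
<select name="INST">
<option value="??">Select University ----</option>
<option value="AU">American</option>
<option value="GW">George Washington</option>
</select>
<input type="submit" value="GO">
</form>
"""


class FormFieldCollector(HTMLParser):
    """Collect named inputs and the allowed values of each select."""

    def __init__(self):
        super().__init__()
        self.fields = {}     # input name -> default value
        self.options = {}    # select name -> list of option values
        self._select = None  # name of the select currently being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "name" in attrs:
            # Unnamed inputs (the submit button here) are not posted.
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "select":
            self._select = attrs.get("name")
            self.options[self._select] = []
        elif tag == "option" and self._select is not None:
            self.options[self._select].append(attrs.get("value") or "")

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def build_payload(html, **user_values):
    """Merge the form's default values with user-supplied ones."""
    collector = FormFieldCollector()
    collector.feed(html)
    payload = dict(collector.fields)
    for name, value in user_values.items():
        allowed = collector.options.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError("%r is not an option of %s" % (value, name))
        payload[name] = value
    return payload


if __name__ == "__main__":
    payload = build_payload(FORM_HTML, LN="Reese", BC="myPassword", INST="AU")
    print(urlencode(payload))
```

The hidden `req`, `key`, `url`, and `lib` inputs travel with the login fields, which is why only `LN`, `BC`, and `INST` need to be set by hand.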
I can set every field correctly, but submitting the form and then trying to print the response raises an error. My code is as follows:
import mechanize
import time

br = mechanize.Browser()
br.set_handle_robots(False)  # do not fetch or obey robots.txt

def connect():
    # connection information
    url = "https://www.aladin.wrlc.org/Z-WEB/Aladin?req=db&key=PROXYAUTH&lib=8&url=http://eebo.chadwyck.com/search"
    br.open(url)
    time.sleep(0.5)
    br.select_form(nr=0)  # the login form is the first form on the page
    br["LN"] = "Reese"
    br["BC"] = "myPassword"
    br["INST"] = ["AU"]  # select controls take a list of values
    response = br.submit()
    print response.read()  # this is the call shown failing in the traceback
The error I get here is:
>>> eebolib.connect()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "eebolib.py", line 28, in connect
    print response.read()
  File "build/bdist.macosx-10.5-fat3/egg/mechanize/_response.py", line 190, in read
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 349, in read
    data = self._sock.recv(rbufsize)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 553, in read
    if self.length is not None:
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1282, in read
    if amt is None or amt > self._line_left:
AssertionError
If anyone can provide some assistance on this I would be most appreciative.
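One way to narrow a failure like this down is to wrap the read so that the exception is captured together with any bytes already received. The sketch below is a hedged Python 3 illustration, not mechanize-specific and not a fix: `read_body_safely` is a hypothetical helper that works with any object exposing `.read()`. In Python 2 the corresponding exception class is `httplib.IncompleteRead`.

```python
import http.client


def read_body_safely(response):
    """Return (body, error); error is None when the read succeeded."""
    try:
        return response.read(), None
    except http.client.IncompleteRead as exc:
        # The connection ended early; keep whatever bytes did arrive.
        return exc.partial, exc
    except AssertionError as exc:
        # The HTTP layer hit an internal assertion, as in the traceback above.
        return b"", exc
```

Calling `body, error = read_body_safely(response)` in place of `response.read()` lets the script keep running and log `repr(error)` instead of aborting, which can help show whether the login POST itself succeeded.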
Comments (1)
This is the solution that I found:
Hope this helps someone someday :)