Error when using Python/mechanize select_form()?

Posted on 2024-08-18 06:41:22

I am trying to scrape some data from a website.
The script I am trying to write should fetch the content of this page:

http://www.atpworldtour.com/Rankings/Singles.aspx

It should then simulate a user going through every option for Additional Standings and for the dates, clicking Go each time, and use the back function after fetching the data.

For now, I am just trying to select this option under Additional Standings:

            <option value="101" >101-200</option>

Here is my (poor) attempt at selecting that option:

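from mechanize import Browser
from BeautifulSoup import BeautifulSoup
import re
import urllib2

# Open the rankings page and select the first form on it
br = Browser()
br.open("http://www.atpworldtour.com/Rankings/Singles.aspx")
br.select_form(nr=0)  # this is the call that raises ParseError
# Pick the 101-200 option; mechanize select controls expect a list of values
br["r"] = ["101"]

response = br.submit()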

However, it fails at select_form(nr=0), which should select the first form.

This is the output from the Python interpreter:

>>> from mechanize import Browser
>>>
>>> from BeautifulSoup import BeautifulSoup
>>> import re
>>> import urllib2
>>>
>>>
>>>
>>> br = Browser();
>>> br.open("http://www.atpworldtour.com/Rankings/Singles.aspx");
<response_seek_wrapper at 0x311bb48L whose wrapped object = <closeable_response at 0x311be88L whose fp = <socket._fileobject object at 0x0000000002C94408>>>
>>> br.select_form(nr=0);
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win-amd64\egg\mechanize\_mechanize.py", line 505, in select_form
  File "build\bdist.win-amd64\egg\mechanize\_html.py", line 546, in __getattr__
  File "build\bdist.win-amd64\egg\mechanize\_html.py", line 559, in forms
  File "build\bdist.win-amd64\egg\mechanize\_html.py", line 228, in forms
mechanize._html.ParseError

I could not find a proper explanation of all the functions on the mechanize home page. Can anyone point me to a good tutorial on using forms with mechanize, or help me with this particular issue?
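To make the "go through every option" step concrete, here is a minimal offline sketch of collecting the option values to loop over. It is only an illustration under stated assumptions: it uses Python 3's standard html.parser rather than mechanize, the HTML fragment is hypothetical (only the 101-200 option is quoted from the page, and the select name "r" is taken from the script above), and OptionCollector and option_values are names made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical fragment: only the 101-200 option is quoted from the real page.
# The select name "r" comes from the script above; the other options are made up.
SAMPLE_HTML = """
<form>
  <select name="r">
    <option value="1" selected>1-100</option>
    <option value="101" >101-200</option>
    <option value="201" >201-300</option>
  </select>
</form>
"""


class OptionCollector(HTMLParser):
    """Collect the value attribute of every <option> inside one named <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.inside = False  # True while we are inside the target <select>
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self.inside = True
        elif tag == "option" and self.inside and "value" in attrs:
            self.values.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def option_values(html, select_name):
    """Return the option values of the named select, in document order."""
    parser = OptionCollector(select_name)
    parser.feed(html)
    parser.close()
    return parser.values


if __name__ == "__main__":
    for value in option_values(SAMPLE_HTML, "r"):
        print(value)
```

Each collected value would then be submitted in turn once select_form() works. As far as I understand mechanize, a select control is set with a list, for example br["r"] = [value], rather than with a bare string.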

Anthony
