Python mechanize: response.read() after submit does not return the correct page?

Posted 2025-02-02 04:51:29

Hi all! I am trying to do some data processing on a publicly accessible database (property tax records, I know, not very glamorous!). The code runs perfectly, but it doesn't return the right page. When I enter the information into the forms manually and click "submit", I get a lovely page full of relevant data. But when I submit the form via Python mechanize, I do not get the same page. For example:

import mechanize

brwsr = mechanize.Browser()
brwsr.open("https://sdat.dat.maryland.gov/RealProperty/Pages/default.aspx")
brwsr.select_form(nr = 0)
# <option selected="selected" value="02">ANNE ARUNDEL COUNTY</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlCounty"] = ['02'] 
# STREET ADDRESS
# <option selected="selected" value="01">STREET ADDRESS</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlSearchType"] = ['01']
brwsr.submit()
# this works wonderfully!

# this is now the second page
brwsr.select_form(nr = 0)
# street number
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreenNumber"] = '523' # just an example
# street name
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreetName"] = 'SIXTH' # just an example
response = brwsr.submit()
print(response.read())

This raises no errors at all, but some of the HTML returned is the following:

<img src="/RealProperty/egov/img/ajax-loader.gif" id="imgLoader" alt="Loading... Please Wait." title="Loading... Please Wait." />\r\n 

Hmmmmmmm... "Loading"? "Please wait"? The second page seems to be dynamically generated, i.e., maybe it runs scripts that haven't finished by the time the response comes back. I've been reading and poking around, and people seem to say that mechanize is the wrong tool for this kind of job. But what is the right tool? I feel like it ought to be straightforward to parse the data on that second page... if I can just get my hands on it!

Thank you for your time!

Best,
Susan

Comments (1)

薄凉少年不暖心 2025-02-09 04:51:29

Try brwsr.response().read() (response is a method of the Browser object, so it has to be called):

import mechanize

brwsr = mechanize.Browser()
brwsr.open("https://sdat.dat.maryland.gov/RealProperty/Pages/default.aspx")
brwsr.select_form(nr = 0)
# <option selected="selected" value="02">ANNE ARUNDEL COUNTY</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlCounty"] = ['02']
# STREET ADDRESS
# <option selected="selected" value="01">STREET ADDRESS</option>
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucSearchType$ddlSearchType"] = ['01']
brwsr.submit()
# this works wonderfully!

# this is now the second page
brwsr.select_form(nr = 0)
# street number
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreenNumber"] = '523' # just an example
# street name
brwsr.form["ctl00$cphMainContentArea$ucSearchType$wzrdRealPropertySearch$ucEnterData$txtStreetName"] = 'SIXTH' # just an example
brwsr.submit()
# response() returns a copy of the browser's current response
print(brwsr.response().read())
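
One way to check what actually came back is to test the returned HTML for the loading-spinner image quoted in the question and for text you expect on a results page. The sketch below uses only the standard library. The names `LoaderFinder` and `summarize_response` are hypothetical helpers, not part of mechanize. Finding the spinner tag does not by itself prove the results are missing, since the tag may simply be hidden by the page's scripts, which is why the expected text is checked as well.

```python
from html.parser import HTMLParser


class LoaderFinder(HTMLParser):
    """Records whether an <img id="imgLoader"> tag appears in the document."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also reach this method,
        # because the default handle_startendtag delegates to it.
        if tag == "img" and dict(attrs).get("id") == "imgLoader":
            self.found = True


def summarize_response(html, expected_text):
    """Report whether the spinner tag and the expected text are present.

    `html` may be bytes (as returned by response.read()) or str.
    `expected_text` is any string you expect on a real results page,
    for example the street name that was searched for (an assumption).
    """
    if isinstance(html, bytes):
        html = html.decode("utf-8", errors="replace")
    finder = LoaderFinder()
    finder.feed(html)
    return {
        "has_loader": finder.found,
        "has_expected_text": expected_text in html,
    }
```

With the code above, this could be called as `summarize_response(brwsr.response().read(), "SIXTH")`. If `has_loader` is true and `has_expected_text` is false, the results are probably filled in by a later request that the browser's scripts make.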