python mechanize._html.ParseError

Posted on 2024-12-27 18:06:10

When I run the code below, I get a mechanize._html.ParseError exception.
How do I make it shut up? I know it's invalid HTML; I wouldn't need to parse it if it were a nice website. I googled around and was told to replace br = mechanize.Browser() with br = mechanize.Browser(factory=mechanize.RobustFactory()), but that didn't work.

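import mechanize

# RobustFactory is meant to tolerate malformed HTML better than the default.
# br = mechanize.Browser()
br = mechanize.Browser(factory=mechanize.RobustFactory())
br.set_handle_robots(False)  # do not fetch or obey robots.txt
br.open("http://journeyplanner.irishrail.ie/bin/query.exe")

# Print every form found on the page, separated by a blank line.
for form in br.forms():
    print(form)
    print("")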

Comments (1)

亢潮 2025-01-03 18:06:11

Why are you opening a .exe file with mechanize? You're supposed to open web pages using that. If you want to download the .exe file, use br.retrieve() instead.

Edit:

BTW, your code generated this output for me:

<formular POST http://journeyplanner.irishrail.ie/bin/query.exe/dn?ld=1.1&OK#focus application/x-www-form-urlencoded
    <HiddenControl(queryPageDisplayed=yes) (readonly)>
    <HiddenControl(HWAI=JS!ajax=yes) (disabled, readonly)>
    <HiddenControl(HWAI=JS!js=yes) (disabled, readonly)>
    <HiddenControl(outwardConDetails=) (readonly)>
    <ImageControl(start=Verbindung suchen)>
    <TextControl(REQ0JourneyStopsS0A=255)>
    <TextControl(REQ0JourneyStopsS0G=)>
    <HiddenControl(REQ0JourneyStopsS0ID=) (readonly)>
    <TextControl(REQ0JourneyStopsZ0A=255)>
    <TextControl(REQ0JourneyStopsZ0G=)>
    <HiddenControl(REQ0JourneyStopsZ0ID=) (readonly)>
    <RadioControl(journey_mode=[*single, return])>
    <TextControl(REQ0JourneyDate=17/01/2012)>
    <SelectControl(REQ0JourneyTime=[*0, 00, 9, 14, 18])>
    <HiddenControl(REQ0HafasPeriodToSearch=1440) (readonly)>
    <HiddenControl(REQ0HafasPeriodSearch=2) (readonly)>
    <HiddenControl(REQ0HafasSearchForw=1) (readonly)>
    <CheckboxControl(special_search_both=[1])>
    <TextControl(REQ1JourneyDate=)>
    <SelectControl(REQ1JourneyTime=[*0, 00, 9, 14, 18])>
    <HiddenControl(REQ1HafasPeriodToSearch=1440) (readonly)>
    <HiddenControl(REQ1HafasPeriodSearch=2) (readonly)>
    <HiddenControl(REQ1HafasSearchForw=1) (readonly)>
    <SubmitControl(start=Go) (readonly)>
    <SubmitControl(start=Go) (readonly)>>

Edit:

Oh, I was wrong... it's not a .exe file at all. I downloaded it and opened it with a text editor, and it's nothing but an .html file! It also works with br = mechanize.Browser().
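
As the correction above suggests, a URL's extension says little about what the server actually returns. Here is a minimal sketch of checking the response itself instead, using only the standard library. The helper name looks_like_html and the sample header and body values are made up for illustration; they are not part of mechanize and were not captured from the site in question.

```python
def looks_like_html(content_type, body):
    """Guess whether a response is HTML, ignoring the URL's extension.

    Hypothetical helper: content_type is the Content-Type header value
    (or None), body is the raw response as bytes.
    """
    # Strip parameters such as "; charset=utf-8" from the header value.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return True
    # Fall back to sniffing the first bytes of the body.
    head = body[:512].lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


# Illustrative inputs only: an HTML page and the start of a binary file.
print(looks_like_html("text/html; charset=ISO-8859-1", b"<html><body></body></html>"))
print(looks_like_html("application/octet-stream", b"MZ\x90\x00"))
```

The header and the body would come from whatever HTTP client you use. The check deliberately never looks at the URL, so a path ending in .exe that serves a web page is still treated as HTML.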
