Running HtmlUnit with Jython - problem when launching from the command line

Posted on 2024-12-09 20:17:41

I tried to run HtmlUnit with Jython following this tutorial:

http://blog.databigbang.com/web-scraping-ajax-and-javascript-sites/

but it does not work for me. I am unable to import the com.gargoylesoftware packages. The HtmlUnit folder I have contains only some HTML files; do I need to import those somehow?

The tutorial says to run the Python script like this:

/opt/jython/jython -J-classpath "htmlunit-2.8/lib/*" gartner.py

and I tried to run:

java -jar /Users/adam/jython/jython.jar -J-classpath "htmlunit-2.8/lib/*" gartner.py

My problem is that I get "Unknown option: J-classpath", and I cannot find any mention of the -J-classpath parameter on Jython.org. I would be very glad for any advice. I am running Jython standalone 2.5.2 on Snow Leopard.

Comments (3)

征棹 2024-12-16 20:17:41

Your entire command line is being processed by the java command (as it should be), and -J-classpath is indeed not a valid command-line option for java. You should really try to follow the exact steps of the tutorial, because you are skipping several important steps and making up some of your own.
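
For the standalone jar, one alternative worth trying is to drop `-jar` altogether: `java` ignores any classpath setting when `-jar` is used, and the `-J` prefix is only understood by the `jython` launcher script. The sketch below builds such a command in Python. The helper name `build_java_command` is made up for illustration, the paths are the ones from the question, and it assumes `org.python.util.jython` is the main class inside `jython.jar`.

```python
import os

def build_java_command(jython_jar, lib_dir, script):
    """Return a java command line that runs `script` with Jython plus every jar in lib_dir."""
    # "dir/*" is the JVM's own classpath wildcard (Java 6+); quote it if you
    # paste the resulting command into a shell.
    classpath = os.pathsep.join([jython_jar, os.path.join(lib_dir, '*')])
    # Assumption: org.python.util.jython is the entry point of jython.jar.
    # It is named explicitly because -jar would make java ignore -cp.
    return ['java', '-cp', classpath, 'org.python.util.jython', script]

command = build_java_command('/Users/adam/jython/jython.jar',
                             'htmlunit-2.8/lib', 'gartner.py')
print(' '.join(command))
```

Passing the list to `subprocess.call(command)` would launch it without a shell, so the `*` reaches the JVM unexpanded.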

分分钟 2024-12-16 20:17:41

A Jython script can be run simply as jython myscript.py if the script itself uses sys.path.append to add the full path of every jar it needs to the Python path.

Here is the script I'm currently working on.

#!/opt/jython/jython
'''
Created on Dec 7, 2011
@author: chris
'''
import sys

jarpath = '/usr/share/java/htmlunit/'  # directory holding the jar files to import
jars = ['apache-mime4j-0.6.jar','commons-codec-1.4.jar',
    'commons-collections-3.2.1.jar','commons-io-1.4.jar',
    'commons-lang-2.4.jar','commons-logging-1.1.1.jar',
    'cssparser-0.9.5.jar','htmlunit-2.8.jar',
    'htmlunit-core-js-2.8.jar','httpclient-4.0.1.jar',
    'httpcore-4.0.1.jar','httpmime-4.0.1.jar',
    'nekohtml-1.9.14.jar','sac-1.3.jar',
    'serializer-2.7.1.jar','xalan-2.7.1.jar',
    'xercesImpl-2.9.1.jar','xml-apis-1.3.04.jar']  # jars shipped with HtmlUnit 2.8

def loadjars():
    # Append each jar to sys.path so Jython can import the Java packages inside it
    for jar in jars:
        container = jarpath + jar
        print(container)
        sys.path.append(container)

loadjars()

# This import only works once the jars above are on sys.path
import com.gargoylesoftware.htmlunit.WebClient as WebClient
webclient = WebClient()

def gotopage():
    # Fetch a page with HtmlUnit and print the resulting page object
    print('hello, I will visit Google')
    url = 'http://google.com'
    page = webclient.getPage(url)
    print(page)

if __name__ == "__main__":
    gotopage()
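
A possible variation on the script above, offered as a sketch rather than a drop-in replacement: instead of listing every jar by name, collect whatever `.jar` files sit in the library directory. The helper name `add_jars_to_path` is hypothetical, and the directory is the one used in the script above.

```python
import glob
import os
import sys

def add_jars_to_path(jar_dir):
    """Append every .jar file found in jar_dir to sys.path and return the ones added."""
    added = []
    for jar in sorted(glob.glob(os.path.join(jar_dir, '*.jar'))):
        if jar not in sys.path:  # avoid duplicate entries on repeated calls
            sys.path.append(jar)
            added.append(jar)
    return added

jars = add_jars_to_path('/usr/share/java/htmlunit/')
print('%d jar(s) added to sys.path' % len(jars))
```

This keeps the script working when the HtmlUnit version (and therefore the jar file names) changes, at the cost of picking up any unrelated jar that happens to be in the same directory.
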
哭泣的笑容 2024-12-16 20:17:41

I have run into this error before and solved it with the following steps.

  1. Download Jython and run java -jar jython-installer-xxx.jar to install it, then add the jython/bin folder to your system path. Run jython on the command line to make sure it works.
  2. Download the HtmlUnit jar files from SourceForge and note where you put them, because you will need to specify that location.
  3. Write your .py file and run

    jython -J-classpath "/Users/crabime/Development Folder/htmlunit-2.23/lib/*" /Users/crabime/PycharmProjects/scrapimage/crabime/gartner.py

Everything should then work. If you still get a module-not-found error, check the command you typed for typos.
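
If the import still fails, a quick check like the one below can help narrow things down. It is only a diagnostic sketch: `can_import` is a hypothetical helper, and the package name is the one provided by the HtmlUnit jars discussed above.

```python
import sys

def can_import(module_name):
    """Return True if module_name can be imported in the current interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False

if can_import('com.gargoylesoftware.htmlunit'):
    print('HtmlUnit is visible to this interpreter')
else:
    print('HtmlUnit not found; entries on sys.path:')
    for entry in sys.path:
        print('  ' + entry)
```

Jars passed with -J-classpath end up on the JVM classpath rather than on sys.path, so under Jython it may also be worth printing `java.lang.System.getProperty('java.class.path')` to see what the JVM actually received.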
