Running Mahout from the command line (CLASSPATH)
I compiled Mahout successfully under Windows using Maven.
I'm trying to run one of the examples from the command line, and I don't understand what I'm doing wrong. It looks like a CLASSPATH problem.
Let's say I want to run the GroupLensRecommenderEvaluatorRunner example. I go to the folder that contains the GroupLensRecommenderEvaluatorRunner.class file and execute:
java -cp C:/mahout/core/target/classes;. org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner
It gives me a NoClassDefFoundError for the GroupLensRecommenderEvaluatorRunner class.
Is the path for -cp wrong?
By the way, for those who aren't familiar with Mahout, org.apache.mahout.cf.taste.example.grouplens is the package of the GroupLensRecommenderEvaluatorRunner class (see the javadoc).
Thanks, guys.
P.S. Before asking this question, I looked through previous Stack Overflow questions on CLASSPATH and followed the solutions given there.
Comments (3)
This is better asked at [email protected].
Your classpath is missing compiled code in Mahout's examples module, which is where this class lives.
Better yet, have a look at this walkthrough: https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
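As a rough sketch of what that fix could look like, the snippet below builds a classpath that contains both the core and the examples output directories and prints the resulting java command. It assumes the C:/mahout location from the question and Maven's default target/classes layout; join_classpath is a hypothetical helper, not part of Mahout. Note that a classpath entry has to point at the root of the package hierarchy, so running from the folder that holds the .class file with "." on the classpath does not help. The examples may also need third-party dependency jars, which are not shown here.

```shell
# Hypothetical helper: join entries into one classpath string.
# The first argument is the separator (";" on Windows, ":" on Unix-like systems).
join_classpath() {
  sep="$1"; shift
  cp=""
  for entry in "$@"; do
    cp="${cp:+$cp$sep}$entry"
  done
  printf '%s\n' "$cp"
}

# Assumed install location, taken from the question.
MAHOUT_HOME="C:/mahout"

# Both module output directories go on the classpath, not just core.
CP=$(join_classpath ";" \
  "$MAHOUT_HOME/core/target/classes" \
  "$MAHOUT_HOME/examples/target/classes")

# Print the command instead of running it, so the sketch works without Mahout installed.
echo "java -cp \"$CP\" org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner"
```

The printed command can be pasted into a Windows command prompt from any directory, since both entries are absolute paths.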
If you put $MAHOUT_HOME/examples/target/classes on the Java CLASSPATH (as Sean mentions), this will work when running locally, but you'll probably have to try the method below for a Hadoop cluster deployment. I found the following post very illuminating about how to get the right classes onto the classpath in various configurations of Mahout/Hadoop.
http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
The mahout script does not accept Hadoop job parameters (like -libjars) in all cases, although I hope it will in the future, especially where a parameter to the job is a class name (seq2sparse, for instance).
What I had to do was copy my custom jar into $HADOOP_HOME/lib on the master node. Evidently a symlink does not work; it appears you have to copy each jar you want into that directory. Then don't forget to stop and start Hadoop because, as the Cloudera reference says, it packages the libs at startup.
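A minimal sketch of that copy step, assuming a POSIX-style shell on the master node. copy_jars is a hypothetical helper, and the paths in the usage comment are placeholders. It copies each jar as a regular file so that the lib directory never ends up holding a symlink.

```shell
# Hypothetical helper: copy each jar into a lib directory as a real file.
# cp -L dereferences symlinks, so the destination holds copies, not links.
copy_jars() {
  dest="$1"; shift
  mkdir -p "$dest"
  for jar in "$@"; do
    cp -L "$jar" "$dest/"
  done
}

# Usage (assumed paths; run on the master node):
#   copy_jars "$HADOOP_HOME/lib" /path/to/my-custom.jar
# Afterwards, stop and start Hadoop so the new jars are picked up,
# for example with the stop-all.sh and start-all.sh scripts in Hadoop 1.x.
```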
What I did was set HADOOP_CLASSPATH to include my jar and all the Mahout jar files, as shown below.
export HADOOP_CLASSPATH=/home/xxx/my.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-integration-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-math-0.7-cdh4.3.0.jar
Then I was able to run:
hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq
So you have to include all your jars and the Mahout jars in HADOOP_CLASSPATH, and then you can just run your class with
hadoop <classname>
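Typing that long export by hand is error-prone, so here is a sketch of building it from the directory contents instead. jars_in is a hypothetical helper, and the two paths are the ones from the export above; adjust them for your install. One difference to be aware of: this picks up every .jar in the Mahout lib directory, which may be more than the six jars listed above.

```shell
# Hypothetical helper: join every *.jar in a directory with ":".
jars_in() {
  dir="$1"
  cp=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the unexpanded pattern when nothing matches
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# Assumed locations, taken from the export above.
MAHOUT_LIB="/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout"
MY_JAR="/home/xxx/my.jar"

MAHOUT_JARS=$(jars_in "$MAHOUT_LIB")
# Put the custom jar first, then the Mahout jars if any were found.
export HADOOP_CLASSPATH="$MY_JAR${MAHOUT_JARS:+:$MAHOUT_JARS}"
echo "$HADOOP_CLASSPATH"
```

With HADOOP_CLASSPATH exported this way in the same shell session, the hadoop command for the class can be run as before.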