Bash script command problem
When I type the following command into Cygwin:
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
then the binary works fine. When I place the exact same line into my bash script:
#!/bin/bash/
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
I get an error saying some files don't exist. This may be specific to Nutch, which is the program I'm running, but I think it has more to do with how I'm calling the command in the script. Any ideas about what's wrong and how to fix this? (Yes, I'm using tab completion.)
EDIT:
Script:
#!/bin/bash
/home/Dan/apache-nutch-1.2/bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
I run the command:
$ pwd
/home/Dan/apache-nutch-1.2
$ ./nutch.sh
The output I'm getting is:
Indexer: starting at 2010-11-29 15:15:44
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_fetch
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_parse
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_data
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_text
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
at org.apache.nutch.indexer.Indexer.run(Indexer.java:97)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.Indexer.main(Indexer.java:106)
Regards,
~DS
Comments (1)
Two things:
1. The shebang line should be #!/bin/bash, with no trailing slash. Also double-check that there is a bash in /bin.
2. Your script runs nutch from a bin directory in your current folder. So if you're in $HOME, and assuming you've got a path $HOME/bin/nutch, then you'll be okay. But if you change to /tmp, it will fail, because there is no such path as /tmp/bin/nutch. You're better off giving the full absolute path name to nutch in the first place.
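The advice about absolute paths can be sketched roughly as below. This is only an illustration, and it assumes the script is saved in the Nutch install directory (as nutch.sh appears to be in the question). The names script_dir, NUTCH_HOME and NUTCH_BIN are made up for this example and are not part of Nutch. The crawl arguments are copied from the question and stay relative, so the script changes into the install directory before running the command.

```shell
#!/bin/bash
# Hypothetical helper (not part of Nutch): print the absolute directory
# that contains the given script path.
script_dir() {
    # Run cd in a subshell so the caller's working directory is untouched.
    (cd "$(dirname "$1")" && pwd -P)
}

# Assumption: this script sits in the Nutch install directory.
NUTCH_HOME="$(script_dir "${BASH_SOURCE[0]}")"
NUTCH_BIN="$NUTCH_HOME/bin/nutch"

if [ -x "$NUTCH_BIN" ]; then
    # The crawl/... arguments are relative, so run from the install directory.
    cd "$NUTCH_HOME" || exit 1
    # The glob is left unquoted on purpose so the shell expands it.
    "$NUTCH_BIN" index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
else
    echo "nutch not found or not executable at $NUTCH_BIN" >&2
fi
```

With something like this, the script should behave the same whether it is started from the install directory, from /tmp, or from anywhere else, because the path to nutch no longer depends on the caller's current folder.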