How to use Sqoop in a Java program?
I know how to use Sqoop through the command line, but I don't know how to call Sqoop commands from a Java program.
Can anyone share some sample code?
Comments (5)
You can run Sqoop from inside your Java code by including the Sqoop jar in your classpath and calling the Sqoop.runTool() method. You have to create the parameters Sqoop requires programmatically, as if you were on the command line (e.g. --connect etc.).
Please pay attention to the following: the advantage of Sqoop.runTool() as opposed to Sqoop.main() is that runTool() returns the error code of the execution.
Hope that helps.
RL
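A minimal sketch of the argument-building step, assuming Sqoop 1.4.x and its import tool. The class name SqoopArgs, the JDBC URL, the table name, and the target directory are hypothetical placeholders. The actual Sqoop.runTool() call is left as a comment so that the snippet compiles without the Sqoop jar.

```java
import java.util.ArrayList;
import java.util.List;

public class SqoopArgs {

    /** Builds the argument array in the same order as on the command line. */
    public static String[] importArgs(String jdbcUrl, String table, String targetDir) {
        List<String> args = new ArrayList<>();
        args.add("import");        // tool name, as in "sqoop import ..."
        args.add("--connect");     // each flag and each value is its own element
        args.add(jdbcUrl);
        args.add("--table");
        args.add(table);
        args.add("--target-dir");
        args.add(targetDir);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Placeholder connection details, for illustration only.
        String[] args = importArgs(
                "jdbc:mysql://localhost/testdb", "employees", "/user/demo/employees");
        System.out.println(String.join(" ", args));

        // With the Sqoop jar on the classpath, the call would look like:
        // int exitCode = org.apache.sqoop.Sqoop.runTool(args);
        // if (exitCode != 0) { throw new IllegalStateException("Sqoop failed: " + exitCode); }
    }
}
```

Keeping the argument construction in its own method means it can be unit-tested without starting a Sqoop job.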
Below is sample code for using Sqoop in a Java program to import data from MySQL into HDFS/HBase. Make sure you have the Sqoop jar in your classpath:
As suggested by Harel, we can use the return value of the run() method for error handling. Hope this helps.
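The error handling on the return value of run() might be sketched as follows. This is only an illustration: SqoopRunner and runOrThrow are hypothetical names, and an IntSupplier stands in for the real call (for example () -> new ImportTool().run(options)) so that the snippet compiles without the Sqoop jar. It assumes that a status of 0 means success.

```java
import java.util.function.IntSupplier;

public class SqoopRunner {

    /** Runs the job and turns a non-zero status into an exception. */
    public static void runOrThrow(String label, IntSupplier job) {
        int status = job.getAsInt();
        if (status != 0) {
            throw new IllegalStateException(label + " failed with status " + status);
        }
    }

    public static void main(String[] args) {
        // Stand-in job that always succeeds; with Sqoop available this could be
        // () -> new ImportTool().run(options)
        runOrThrow("demo import", () -> 0);
        System.out.println("import finished");
    }
}
```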
There is a trick that worked out pretty well for me: via SSH, you can execute the Sqoop command directly. All you have to use is an SSH library for Java.
This is independent of Java. You just need any SSH library in your project, and Sqoop installed on the remote system where you want to perform the import. Connect to that system via SSH and execute the commands that export data from MySQL to Hive.
Follow these steps:
Download the sshxcute Java library: https://code.google.com/p/sshxcute/
and add it to the build path of the Java project that contains the following Java code
If you know the location of the executable and the command-line arguments, you can use a ProcessBuilder. It can then run a separate Process that Java can monitor for completion and return code.
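A minimal sketch of the ProcessBuilder approach, using only the standard library. The class name SqoopProcess is hypothetical, and the command to run is taken from the program arguments so that no install path is assumed.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class SqoopProcess {

    /** Starts the command as a separate process, waits for it, and returns its exit code. */
    public static int run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.inheritIO(); // forward the child's stdout/stderr to this JVM's console
        Process process = builder.start();
        return process.waitFor(); // blocks until the process completes
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java SqoopProcess <executable> [arguments...]");
            return;
        }
        int exitCode = run(Arrays.asList(args));
        System.out.println("process exited with code " + exitCode);
    }
}
```

It could be invoked as, for example, java SqoopProcess /usr/bin/sqoop import --connect jdbc:mysql://localhost/testdb --table employees, where the executable path and connection details are placeholders for your own environment.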
Please follow the code given by Vikas; it worked for me. Include these jar files in the classpath and import these classes:
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.tool.ImportTool;
Referenced Libraries
JRE System Library
1. resources.jar jdk/jre/lib
2. rt.jar jdk/jre/lib
3. jsse.jar jdk/jre/lib
4. jce.jar jdk/jre/lib
5. charsets.jar jdk/jre/lib
6. jfr.jar jdk/jre/lib
7. dnsns.jar jdk/jre/lib/ext
8. sunec.jar jdk/jre/lib/ext
9. zipfs.jar jdk/jre/lib/ext
10. sunpkcs11.jar jdk/jre/lib/ext
11. localedata.jar jdk/jre/lib/ext
12. sunjce_provider.jar jdk/jre/lib/ext
Sometimes you get an error if your Eclipse project uses JDK 1.6 while the libraries you add are from JDK 1.7. In that case, configure the JRE when creating the project in Eclipse.
Vikas, if I want to put the imported files into Hive, should I use options.parameter("--hive-import")?