Connecting to EMR HBase from Java

Published 2025-02-06 03:25:14


I am trying to connect to HBase from inside a Spark program running on EMR 5.35 (Hadoop 2.10, Spark 2.4.8, HBase 1.4.13). When not trying to connect to HBase, my Spark programs run perfectly.

However, as soon as I add my HBase code, the Spark program dies when creating the configuration:

    import java.util.Iterator;
    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.*;

    // The driver dies on this line:
    Configuration conf = HBaseConfiguration.create();

    // Dump the resolved configuration for debugging.
    for (Iterator<Map.Entry<String, String>> it = conf.iterator(); it.hasNext(); ) {
        Map.Entry<String, String> e = it.next();
        System.out.println(e);
    }

    Connection connection = ConnectionFactory.createConnection(conf);
    Admin admin = connection.getAdmin();
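
If the failure really is a missing class rather than bad configuration, a quick way to confirm is to probe for the HBase entry class by name before touching it. A minimal sketch:

    // Sketch: check whether hbase-client is on the driver classpath.
    // Class.forName only loads the class; it opens no connection.
    try {
        Class.forName("org.apache.hadoop.hbase.HBaseConfiguration");
        System.out.println("hbase-client found on the driver classpath");
    } catch (ClassNotFoundException e) {
        System.out.println("hbase-client is NOT on the driver classpath");
    }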

I tried adding resources:

    conf = HBaseConfiguration.create();
    // Path here is org.apache.hadoop.fs.Path, not java.nio.file.Path.
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));

without success.
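
For reference, the same properties can also be set programmatically instead of through the XML files. A minimal sketch, where the quorum hostname is a placeholder for the EMR master node:

    // Sketch: configure the client in code instead of via hbase-site.xml.
    // "ip-10-0-0-1.ec2.internal" is a placeholder, not a value from the post.
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.quorum", "ip-10-0-0-1.ec2.internal");
    conf.set("hbase.zookeeper.property.clientPort", "2181");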

I have commented out all lines after HBaseConfiguration.create(), but the program dies anyway, so I believe the problem lies on that line. I get no useful stack trace; the driver dies immediately upon hitting it.
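
A missing jar at this point typically surfaces as a NoClassDefFoundError, which is an Error rather than an Exception, so an ordinary catch block never sees it and the driver appears to die silently. A minimal sketch that wraps the line in catch (Throwable) to force a stack trace out:

    try {
        conf = HBaseConfiguration.create();
    } catch (Throwable t) {
        // NoClassDefFoundError extends Error, so catch (Exception e)
        // would miss it; Throwable covers both Errors and Exceptions.
        t.printStackTrace();
        throw new RuntimeException(t);
    }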

The POM:

<properties>
    <spark.version>2.4.8</spark.version>
    <hbase.version>1.4.13</hbase.version>
    <hadoop.version>2.10.1</hadoop.version>
    <jackson.version>2.13.2</jackson.version>
    <!-- Maven stuff -->
    <java.build.version>1.8</java.build.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>bom</artifactId>
            <version>2.17.103</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
            <version>4.1.77.Final</version>
        </dependency>
        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty</artifactId>
            <version>3.9.9.Final</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>${hadoop.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>${hbase.version}</version>
        <scope>provided</scope>
    </dependency>

    <!--  AWS -->
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>s3</artifactId>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>athena</artifactId>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>auth</artifactId>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>opensearch</artifactId>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>apache-client</artifactId>
    </dependency>

    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpcore</artifactId>
        <version>4.4.15</version>
    </dependency>

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-client</artifactId>
        <version>5.6.16</version>
    </dependency>
</dependencies>

Comments (1)

深海蓝天 2025-02-13 03:25:15


Solved the issue by removing the "provided" scope, thus including the dependency in the uber-jar, plus adding the HBase POM dependency (not 100% sure whether this is strictly necessary):

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>${hbase.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase</artifactId>
        <version>${hbase.version}</version>
        <type>pom</type>
    </dependency>
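
For context, removing the provided scope only pulls the HBase classes into the application jar if the build actually produces an uber-jar. A minimal maven-shade-plugin sketch of the kind of build section this assumes (the plugin version and signature-file filters are illustrative, not taken from the original POM):

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <!-- Strip signature files that would otherwise
                             invalidate the shaded jar at runtime. -->
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>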