Spark cannot detect ES version - AWS OpenSearch
I'm trying to read my data from an AWS OpenSearch domain and getting this error: "Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'".
When I connect to a domain running Elasticsearch (version 7.10), everything works fine.
My sample Scala code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.sql.types.{MapType, StringType, StructField, StructType}
import org.apache.spark.sql.functions.{from_json, col}
import org.elasticsearch.spark._
object SparkContextApp {

  def main(args: Array[String]): Unit = {
    val appName = "App"
    val master = "local[*]"
    val conf = new SparkConf().setAppName(appName)
      .setMaster(master)
      .set("es.nodes", "https://*************************.us-east-1.es.amazonaws.com")
      .set("es.port", "***")
      .set("es.http.timeout", "5m")
      .set("es.nodes.wan.only", "true")
      .set("es.net.ssl", "true")
      .set("es.net.http.auth.user", "********")
      .set("es.net.http.auth.pass", "********")
    val sc = new SparkContext(conf)
    val data = sc.esRDD("***/***")
  }
}
The library dependencies:
libraryDependencies += "org.elasticsearch" % "elasticsearch-spark-30_2.12" % "8.2.3"
1 Answer
You need to configure your OpenSearch domain to run in compatibility mode. There is a flag during domain setup/creation which enables this.
This can also be done via an API call:
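The original API snippet did not survive the page scrape. As a hedged sketch: on AWS OpenSearch Service, compatibility mode corresponds to the `override_main_response_version` advanced option, which can be toggled on an existing domain with the AWS CLI (`my-domain` is a placeholder for your domain name):

```shell
# Sketch, not captured from the original answer: enable compatibility
# mode on an existing OpenSearch Service domain. The
# override_main_response_version advanced option makes the domain report
# version 7.10 in its root-endpoint response.
aws opensearch update-domain-config \
  --domain-name my-domain \
  --advanced-options '{"override_main_response_version":"true"}'
```

The change takes a few minutes to apply while the domain's configuration updates; no data is reindexed.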
Compatibility mode simply tells OpenSearch to report its Elasticsearch version number as 7.10 instead of the 'newer' OpenSearch 1.2.0 version. This will allow your Spark connector to correctly identify the version number and connect successfully.
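To confirm what the connector will see, you can query the cluster root endpoint directly; this is a sketch with a placeholder endpoint and credentials, matching the masked values in the question:

```shell
# Placeholder endpoint and credentials. With compatibility mode enabled,
# the "version.number" field in the JSON response should read 7.10.x
# rather than the domain's actual OpenSearch version.
curl -u "********:********" \
  "https://*************************.us-east-1.es.amazonaws.com/"
```

If the response still reports an OpenSearch 1.x version number, the advanced option has not been applied yet and the Spark connector will continue to fail version detection.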