Unable to create a dynamic frame after successfully crawling a MongoDB table into the AWS Glue Data Catalog

Posted 2025-01-19 03:43:54 · 1794 characters · 5 views · 0 comments

I created a MongoDB connection successfully, the connection test passes, and I was able to use a Crawler to create metadata in the Glue Data Catalog. However, when I pass my MongoDB database name and collection name in the additional_options parameter, as shown below, I get an error:

data_catalog_database = 'tinkerbell'
data_catalog_table = 'tinkerbell_funds'
glueContext.create_dynamic_frame_from_catalog(
    database=data_catalog_database,
    table_name=data_catalog_table,
    additional_options={"database": "tinkerbell", "collection": "funds"},
)

The error is as follows:
An error was encountered: An error occurred while calling o177.getDynamicFrame.
: java.lang.NoSuchMethodError: com.mongodb.internal.connection.DefaultClusterableServerFactory.<init>(Lcom/mongodb/connection/ClusterId;Lcom/mongodb/connection/ClusterSettings;Lcom/mongodb/connection/ServerSettings;Lcom/mongodb/connection/ConnectionPoolSettings;Lcom/mongodb/connection/StreamFactory;Lcom/mongodb/connection/StreamFactory;Lcom/mongodb/MongoCredential;Lcom/mongodb/event/CommandListener;Ljava/lang/String;Lcom/mongodb/MongoDriverInformation;Ljava/util/List;)V

When I call it without additional_options:

glueContext.create_dynamic_frame_from_catalog(database = data_catalog_database,table_name = data_catalog_table)

I get the following error:
An error was encountered:
Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property
Traceback (most recent call last):
  File "/home/glue_user/aws-glue-libs/PyGlue.zip/awsglue/context.py", line 179, in create_dynamic_frame_from_catalog
    return source.getFrame(**kwargs)
  File "/home/glue_user/aws-glue-libs/PyGlue.zip/awsglue/data_source.py", line 36, in getFrame
    jframe = self._jsource.getDynamicFrame()
  File "/home/glue_user/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/glue_user/spark/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property

Can someone please help me pass these parameters correctly?

I have described above what I tried. I expected the dynamic frame to be created from the catalog table.

Comments (1)

赤濁 2025-01-26 03:43:54

You are getting that error because the MongoDB connector expects its connection to be configured through Spark, and it needs the input and output properties to be set.

Please refer to the link below:
https://www.mongodb.com/docs/spark-connector/master/python-api/#std-label-pyspark-shell
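
As a rough illustration of the "input and output property" mentioned above, the following sketch builds the two property names that appear in the error message ('spark.mongodb.input.uri' and a matching 'spark.mongodb.output.uri'). The helper names mongo_spark_conf and build_session, the host localhost, and the port 27017 are assumptions made for this example and do not come from the question. Whether setting these properties is enough inside a Glue job is not confirmed by this thread, and the NoSuchMethodError in the first attempt may point to a separate driver version mismatch.

```python
from urllib.parse import quote_plus


def mongo_spark_conf(host, database, collection,
                     username=None, password=None, port=27017):
    """Build the input/output URI properties named in the connector's error.

    Hypothetical helper: host, port and credentials are placeholders.
    """
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode credentials so characters such as '@' or ':'
        # do not break the URI.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    uri = f"mongodb://{credentials}{host}:{port}/{database}.{collection}"
    return {
        "spark.mongodb.input.uri": uri,
        "spark.mongodb.output.uri": uri,
    }


def build_session(conf, app_name="mongo-read-sketch"):
    """Create a SparkSession carrying the properties (requires pyspark)."""
    # Imported lazily so mongo_spark_conf can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Database and collection names are taken from the question.
    conf = mongo_spark_conf("localhost", "tinkerbell", "funds")
    for key, value in conf.items():
        print(f"{key}={value}")
```

With a session built this way, the linked connector documentation reads a collection through the Spark DataFrame reader. The property names used here match the ones in the error message, which suggests an older connector release; newer releases use different property names, so check the version installed in your environment.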
