df.show() in PySpark raises "UnauthorizedException: User my_user has no SELECT permission on <table system.size_estimates> or any of its parents"
I'm trying to read records from a Cassandra table.
This code works fine:
df = spark.read \
    .format("org.apache.spark.sql.cassandra") \
    .option("spark.cassandra.connection.host", "my_host") \
    .option("spark.cassandra.connection.port", "9042") \
    .option("spark.cassandra.auth.username", "my_user") \
    .option("spark.cassandra.auth.password", "my_pass") \
    .option("keyspace", "my_keyspace") \
    .option("table", "my_table") \
    .load()
But when I try to show the records
df.show(3)
I get this exception:
com.datastax.oss.driver.api.core.servererrors.UnauthorizedException: User my_user has no SELECT permission on <table system.size_estimates> or any of its parents
The point is that I only have permissions on my_keyspace.
However, I can connect with cqlsh to the same Cassandra host:port with the same user/password and do whatever I need in my_keyspace.
Please advise: what's wrong with the Spark code, and how should I handle this situation?
Comments (2)
The Spark Cassandra connector estimates the size of Cassandra tables using the values stored in system.size_estimates. The connector needs an estimate of the table size in order to calculate the number of Spark partitions. See my answer in this post for details.
If you've enabled the authorizer in Cassandra, authenticated users/roles are automatically given read access to some system tables, but size_estimates is not one of them. You will need to explicitly authorize your Spark user so it can access the size_estimates table. Note that the role only needs read access (SELECT permission) to the table. Cheers!
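As a minimal sketch of the grant described above, the CQL below could be run from cqlsh. It assumes the Spark role is my_user (the name used in the question) and that the session is logged in as a superuser or as a role allowed to grant permissions on the table. The LIST statement is only an optional check and is not part of the original answer.

```cql
-- Run as a superuser (or a role with AUTHORIZE permission on the table).
-- my_user is the role from the question; substitute your own Spark role.
GRANT SELECT ON system.size_estimates TO my_user;

-- Optional check: list the permissions now held by the role.
LIST ALL PERMISSIONS OF my_user;
```

Once the grant is in place, re-running df.show(3) should get past the size estimation step, since the connector only reads from this table.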
You need to grant that user read access to system.size_estimates.