AWS Glue 3.0 - PySpark casting issue
I'm using Glue 3.0
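from pyspark.sql import SparkSession, functions as f

# Reuse the session the job already has, or create one when running locally
spark = SparkSession.builder.getOrCreate()
data = [("Java", "6241499.16943521594684385382059800664452")]
rdd = spark.sparkContext.parallelize(data)
df = rdd.toDF()
df.show()
# Cast the string column to a decimal with precision 15 and scale 2
df.select(f.col("_2").cast("decimal(15,2)")).show()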
I get the following result
+----+--------------------+
|  _1|                  _2|
+----+--------------------+
|Java|6241499.169435215...|
+----+--------------------+

+----+
|  _2|
+----+
|null|
+----+
Locally, with pyspark==3.2.1, there is no issue casting the string to a decimal, but the Glue job is not able to do so.
Comments (1)
The problem is with AWS Glue. To work around it, I converted my string before doing the cast.
Output
Note: Obviously, we can use Spark SQL functions instead.
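
As a minimal sketch of what "converting the string before the cast" with Spark SQL functions could look like (not necessarily what this answer used): the value in the question has 39 digits (7 integer, 32 fractional), one more than the 38 that Spark's decimal type can hold. Assuming that is what trips the Glue 3.0 runtime, which the thread does not confirm, trimming the fractional part before casting keeps the string within range. The names TRIM_PATTERN, trim_fraction and cast_trimmed are hypothetical, and the choice of 4 fractional digits is arbitrary (anything above the target scale of 2 should round the same way under half-up rounding).

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical pattern: optional sign, integer part, and at most 4 fractional digits.
TRIM_PATTERN = r"^-?\d+(?:\.\d{0,4})?"


def trim_fraction(value: str) -> str:
    """Pure-Python equivalent of the trim, handy for checking the pattern."""
    match = re.match(TRIM_PATTERN, value)
    return match.group(0) if match else ""


def cast_trimmed(df, column="_2", target="decimal(15,2)"):
    """Trim the numeric string with regexp_extract, then cast it.

    pyspark is imported lazily so the helper above works without Spark.
    """
    from pyspark.sql import functions as f

    # Group index 0 returns the whole regex match.
    trimmed = f.regexp_extract(f.col(column), TRIM_PATTERN, 0)
    return df.select(trimmed.cast(target).alias(column))


if __name__ == "__main__":
    raw = "6241499.16943521594684385382059800664452"
    short = trim_fraction(raw)
    print(short)
    # What a decimal(15,2) cast with half-up rounding would be expected to give
    print(Decimal(short).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
```

With the DataFrame from the question, the call would be cast_trimmed(df).show().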