Deriving a StructType schema from a list of column definitions in PySpark

Posted 2025-02-03 08:17:53


In PySpark, I don't want to hardcode the schema definition; I want to derive the schema from the variable below.

mySchema=[("id","IntegerType()", True),
          ("name","StringType()", True),
          ("InsertDate","TimestampType()", True)
         ]

result = mySchema.map(lambda l: StructField(l[0],l[1],l[2]))

How can I implement this logic so that structTypeSchema is generated from mySchema?

Expected output:

structTypeSchema = StructType(fields=[
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("InsertDate", TimestampType(), True),
])


吃素的狼 2025-02-10 08:17:53


You can try something along these lines:

from pyspark.sql import types as T

structTypeSchema = T.StructType(
    [T.StructField(f[0], eval(f'T.{f[1]}'), f[2]) for f in mySchema]
)

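Alternatively, with a wildcard import so that the type names in mySchema resolve without the T. prefix:

from pyspark.sql.types import *
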
structTypeSchema = StructType(
    [StructField(f[0], eval(f[1]), f[2]) for f in mySchema]
)
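
Both snippets above rely on eval, which executes whatever string is stored in mySchema, so they are only appropriate when that list comes from a trusted source. A possible eval-free variant is sketched below. The helper names `type_class_name` and `build_schema` are hypothetical (they are not part of PySpark), and the sketch assumes every type in the list can be constructed without arguments, as in the question; parameterized types such as ArrayType would need extra handling.

```python
import re

# Accepts "IntegerType()" or "IntegerType"; anything else is rejected.
_TYPE_SPEC = re.compile(r"^([A-Za-z_]\w*)(?:\(\))?$")


def type_class_name(spec):
    """Return the bare class name from a spec such as 'IntegerType()'."""
    match = _TYPE_SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"Unsupported type spec: {spec!r}")
    return match.group(1)


def build_schema(fields):
    """Build a StructType from (name, type_spec, nullable) tuples without eval."""
    # Imported inside the function so the parsing helper above can be used
    # (and tested) on a machine where PySpark is not installed.
    from pyspark.sql import types as T

    struct_fields = []
    for name, type_spec, nullable in fields:
        class_name = type_class_name(type_spec)
        type_cls = getattr(T, class_name, None)
        # Only accept real Spark data type classes, not arbitrary module attributes.
        if not (isinstance(type_cls, type) and issubclass(type_cls, T.DataType)):
            raise ValueError(f"{class_name!r} is not a pyspark.sql.types data type")
        # Assumption: the type takes no constructor arguments.
        struct_fields.append(T.StructField(name, type_cls(), nullable))
    return T.StructType(struct_fields)
```

With the list from the question, `structTypeSchema = build_schema(mySchema)` should produce the same schema as the eval-based versions, while a malformed or malicious type string raises ValueError instead of being executed.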
