TypeError when trying to set the tzinfo parameter for datetime
I'm trying to use the faker package to generate fake dates of birth in pyspark.
My code is as below:
from faker import *
from pyspark.sql.types import *
from pyspark.sql import Row
from datetime import *

fake = Faker("en_GB")
fake.seed_locale("en_GB", 0)

df = spark.createDataFrame([
    Row(BIRTH_DT = datetime(2000, 1, 1, 12, 0)),
    Row(BIRTH_DT = datetime(2000, 2, 1, 12, 0)),
    Row(BIRTH_DT = datetime(2000, 3, 1, 12, 0))
])

class anonymise:
    def BIRTH_DT():
        def BirthDt_values():
            return fake.date_of_birth(datetime.tzinfo == None)
        BirthDt_udf = udf(BirthDt_values, TimestampType())
        return BirthDt_udf()

df = df \
    .withColumn("BIRTH_DT", anonymise.BIRTH_DT())

df.display()
However I'm getting this error:
PythonException: 'TypeError: tzinfo argument must be None or of a tzinfo subclass, not type 'bool''
I don't understand how it thinks that my parameter value is a boolean? I must be formatting this incorrectly but I can't figure out what should be done. Any help would be appreciated!
Thanks,
Carolina
Comments (1)
Solved! The datatype should be DateType(), not TimestampType(), because it's a date of birth and not a timestamp.
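On the "how does it think my parameter is a boolean" part: `datetime.tzinfo == None` is a comparison expression, so Python evaluates it to `False` before the call and passes that value as the first positional argument. Here is a minimal sketch using only the standard library that reproduces the same kind of error. The helper `now_in` is hypothetical; it stands in for a function that forwards a `tzinfo` argument to `datetime.now()`, which I am assuming is roughly what `fake.date_of_birth` does with its first parameter.

```python
from datetime import datetime

# The expression from the question is a comparison, not a keyword argument.
# It is evaluated first, and its result (a bool) is what gets passed on.
arg = datetime.tzinfo == None  # noqa: E711 (kept as written in the question)
print(type(arg).__name__, arg)


def now_in(tzinfo=None):
    """Hypothetical stand-in for a function that forwards tzinfo to datetime.now()."""
    return datetime.now(tzinfo)


# Passing the comparison result reproduces a TypeError about the tzinfo argument.
try:
    now_in(datetime.tzinfo == None)  # noqa: E711
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Using a keyword argument (or omitting it entirely) passes None as intended.
ok = now_in(tzinfo=None)
print(type(ok).__name__)
```

If that assumption holds, writing `fake.date_of_birth(tzinfo=None)` or just `fake.date_of_birth()` in the UDF should avoid the TypeError, and `DateType()` is then the matching return type since a date of birth has no time component.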