How do I handle "Object of type 'Timestamp' is not JSON serializable" in Python / pandas?

Posted on 2025-02-05 18:31:17

Preface: Extremely new at Python, but thankful for the SO help!

Below is a code snippet where I'm trying to execute a SQL query against a MSSQL server table, and post it back to Google Sheets. I am able to retrieve data and headers, and I think I almost have it figured out. However, I'm having some trouble with the datetime format that some of the columns have. The error I'm receiving is:

Traceback (most recent call last):
  File "modelhome.py", line 153, in <module>
    valueInputOption=value_input_option, insertDataOption=insert_data_option, body=value_range_body)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\discovery.py", line 785, in method
    actual_path_params, actual_query_params, body_value)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\model.py", line 151, in request
    body_value = self.serialize(body_value)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\model.py", line 260, in serialize
    return json.dumps(body_value)
  File "C:\ProgramData\Anaconda3\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'Timestamp' is not JSON serializable

Code Snippet

"""Execute SQL Statement, create table, and append back to Google Sheet"""
# SQL Server Connection
server = '[SQLServerIP]'
database = '[SQLServerDatabase]'
username = '[SQLServerUsername]'
password = '[SQLServerPassword]'
cnxn = pyodbc.connect('Driver={ODBC Driver 13 for SQL Server};SERVER=' +
                      server+';DATABASE='+database+';UID='+username+';PWD='+password)

# Sample SQL Query to get Data
sql = 'select * from tblName'
cursor = cnxn.cursor()
cursor.execute(sql)
list(cursor.fetchall())

# Pandas reading values from SQL query, and building table
sqlData = pandas.read_sql_query(sql, cnxn)

# Pandas building dataframe, and exporting .xlsx copy of table
df = DataFrame(data=sqlData)

df.to_excel('tblName.xlsx',
            header=True, index=False)
dfHeaders = df.columns.values.tolist()
dfHeadersArray = [dfHeaders]
dfData = df.values.tolist()
dfDataFormatted = [dfData]
"""Writing to Google Sheet Range"""
print(dfHeaders)
print(dfData)

# How the input data should be interpreted.
value_input_option = 'USER_ENTERED'  # TODO: Update placeholder value.

# How the input data should be inserted.
insert_data_option = 'OVERWRITE'  # TODO: Update placeholder value.

value_range_body = {
    "majorDimension": "ROWS",
    "values":
    dfHeadersArray + dfDataFormatted
}

request = service.spreadsheets().values().append(spreadsheetId=spreadsheetId, range=SQLRangeName,
                                                 valueInputOption=value_input_option, insertDataOption=insert_data_option, body=value_range_body)
response = request.execute()

Within dfData, normal strings appear like this:
[screenshot not available]

datetime entries appear like this:
[screenshot not available]

My understanding is that JSON has no native way to handle this data type, so it has to be handled as a special case. Is there a way to serialize all Timestamp values in the dataset without having to specify which columns are datetime?

Any help/advice you all could provide would be greatly appreciated.

Thanks!

Final Solution Update - Credit: @chrisheinze

Adding the following dataframe conversions for the datetime columns worked perfectly.

# Pandas reading values from SQL query, and building table
sqlData = pandas.read_sql_query(sql, cnxn)

# Pandas building dataframe, and exporting .xlsx copy of table
df = DataFrame(data=sqlData)

# Google Sheets API can't handle date/time. Below converts certain headers to formatted text strings.
df['Date'] = df['Date'].dt.strftime('%m/%d/%Y')
df['DateTime'] = df['DateTime'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['RDD'] = df['RDD'].dt.strftime('%m/%d/%Y')
df['DateTimeErrorTable'] = df['DateTimeErrorTable'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['DateTimeSuccessTable'] = df['DateTimeSuccessTable'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['WorkedOn'] = df['WorkedOn'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['EmailSentOn'] = df['EmailSentOn'].dt.strftime('%m/%d/%Y %H:%M:%S')

Hope it helps someone else!
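For anyone who wants to check the conversion before calling the Sheets API, here is a minimal sketch. It reproduces the error with json.dumps (the call shown at the bottom of the traceback) and then confirms that the converted frame serializes. The sample frame and its values are made up for illustration, and building the payload as `[headers] + rows` is an assumption about the intended layout (one inner list per sheet row).

```python
import json

import pandas as pd

# Hypothetical sample data standing in for the SQL query result.
df = pd.DataFrame({
    "Name": ["a", "b"],
    "DateTime": pd.to_datetime(["2018-03-09 15:23:03", "2018-03-10 08:00:00"]),
})

# Before conversion the rows contain pandas Timestamp objects,
# which json.dumps cannot serialize.
try:
    json.dumps(df.values.tolist())
    raised = False
except TypeError:
    raised = True

# After conversion the column holds plain strings.
df["DateTime"] = df["DateTime"].dt.strftime("%m/%d/%Y %H:%M:%S")
headers = df.columns.values.tolist()
rows = df.values.tolist()

# Assumed layout: the header row followed by one list per data row.
value_range_body = {"majorDimension": "ROWS", "values": [headers] + rows}
payload = json.dumps(value_range_body)
```

If `raised` is true and the final json.dumps call succeeds, the body should no longer trip the serializer inside googleapiclient.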

Comments (2)

过气美图社 2025-02-12 18:31:17

The Sheets API doesn't know what to do with a Python datetime/timestamp. You'll need to convert it - most likely to a str.

For converting a pandas Series use pd.Series.dt.strftime()

If it's just for a single value that needs to be converted then use datetime's strftime()

Edit to answer your question in the comments:

# To convert a datetime column to a str. 

df['date_column'] = df['date_column'].dt.strftime('%Y%m%d%H%M%S')

To give a bit more info, strftime means "string format time". It lets you format your datetime/timestamp value into a str. The '%Y%m%d%H%M%S' is the format you want the output to be in. In my example, the result would be "20180309152303" for your date. Another example would be '%m/%d/%Y %H:%M:%S', which would give you "03/09/2018 15:23:03". So replace 'date_column' in my example with the name of your date column and it'll be converted to a str that's compatible with the API and is also understood format-wise in Google Sheets.
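As a small illustration of the two format strings above, the sketch below applies them to a single datetime value and to a one-element pandas Series. The variable names and the sample timestamp are only for demonstration.

```python
from datetime import datetime

import pandas as pd

# Single value: use datetime's own strftime().
moment = datetime(2018, 3, 9, 15, 23, 3)
compact = moment.strftime("%Y%m%d%H%M%S")
us_style = moment.strftime("%m/%d/%Y %H:%M:%S")

# Whole column: use the .dt accessor on a datetime Series.
series = pd.Series(pd.to_datetime(["2018-03-09 15:23:03"]))
as_text = series.dt.strftime("%m/%d/%Y %H:%M:%S").tolist()

print(compact)   # 20180309152303
print(us_style)  # 03/09/2018 15:23:03
print(as_text)   # ['03/09/2018 15:23:03']
```

Note that the format codes are case-sensitive: %m is the month and %M is the minute, %Y is the four-digit year and %y the two-digit one.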

看轻我的陪伴 2025-02-12 18:31:17

In case you can't tell which column is a date, use this function:

import numpy as np

def cast_for_gsheets(df):
    # Cast every column that is not a plain int64/float64/bool to str
    for column, dt in zip(df.columns, df.dtypes):
        if dt.type not in [
            np.int64,
            np.float64,  # np.float_ was removed in NumPy 2.0
            np.bool_,
        ]:
            # Assign the column directly so its dtype is replaced
            df[column] = df[column].astype(str)
    return df

df = cast_for_gsheets(DataFrame(data=sqlData))
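If you would rather leave the other columns untouched and only reformat the datetime ones, a possible alternative is to let pandas find them with select_dtypes. This is only a sketch: the helper name `stringify_datetime_columns`, the default format, and the sample frame are my own and not part of any library.

```python
import pandas as pd


def stringify_datetime_columns(df, fmt="%m/%d/%Y %H:%M:%S"):
    """Return a copy of df with every datetime column formatted as text."""
    out = df.copy()
    # Pick up both naive and timezone-aware datetime columns.
    datetime_columns = out.select_dtypes(include=["datetime", "datetimetz"]).columns
    for column in datetime_columns:
        out[column] = out[column].dt.strftime(fmt)
    return out


# Hypothetical sample frame standing in for the SQL result.
sample = pd.DataFrame({
    "id": [1, 2],
    "WorkedOn": pd.to_datetime(["2018-03-09 15:23:03", "2018-03-10 08:00:00"]),
})
converted = stringify_datetime_columns(sample)
```

Unlike casting everything non-numeric to str, this lets you choose the date format that Google Sheets should see, and it leaves the original frame unmodified because it works on a copy.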