Dealing with "view vs. copy" in pandas

Posted 2025-02-11 06:11:03

I'm trying to decode my dataframe through the following code:

 df = pd.read_sql_table('mytable', con)

 for column in df.columns:
     for i in range(len(df[column])):
         if type(df[column][i]) == bytearray or type(df[column][i]) == bytes:
             df[column][i] = str(df[column][i], 'utf-8')

but I keep getting SettingWithCopyWarning no matter what I try.

Does anyone know how to deal with this warning?

UPDATE:

I ended up settling on this:

for column in df.columns:
    if df[column].dtype == 'object':
        df[column] = df[column].apply(lambda x: x.decode('utf-8') if isinstance(x, bytes) else x)

Thanks for the help!

橪书 2025-02-18 06:11:03

A few ways to improve this:

It looks like you are converting the whole column to string, so you don't need to loop through each value of the column.
You can use the built-in pd.Series.astype() method, which is more efficient than calling str() on each value because it operates on the whole Series at once.
Use .loc to avoid the SettingWithCopyWarning.

So your code will look like:

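 for column in df.columns:
     # .loc takes [rows, columns]; ':' selects every row of this column
     # (plain df[column] = ... also works and replaces the column's dtype)
     df.loc[:, column] = df[column].astype(str)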
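
To illustrate the .loc point on a single cell, here is a minimal sketch. The sample DataFrame is made up for illustration and stands in for the table read from SQL. The warning in the question comes from chained indexing (df[column][i] = ...), where the first lookup may return a temporary object; passing the row and column labels to one .loc call avoids that.

```python
import pandas as pd

# Hypothetical sample data standing in for the SQL table.
df = pd.DataFrame({"name": [b"alice", b"bob"], "age": [30, 41]})

# Chained indexing evaluates df["name"] first and then indexes the result,
# which is the pattern that can trigger SettingWithCopyWarning:
#     df["name"][0] = "alice"

# One .loc call receives the row label and the column label together,
# so the assignment is made directly on df.
df.loc[0, "name"] = df.loc[0, "name"].decode("utf-8")

print(df)
```

Only the first row is decoded here; the second row is left as bytes to show that the assignment touched exactly one cell.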

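Note that in Python 3, str is already a Unicode text type, so no separate encoding step is needed; on Python 2.x you can use df[column].astype('unicode') instead. Be aware that astype(str) casts values rather than decoding them: plain str() on a bytes object in Python 3 returns its repr (for example "b'abc'"), and how astype(str) treats bytes may depend on the pandas version, so check the result on your data.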
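
If the goal is decoding rather than a string cast, a minimal sketch along the lines of the apply-based approach from the question's update might look like the following. The helper name decode_bytes_columns and the sample data are assumptions made for illustration, not part of pandas.

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes/bytearray values decoded to str.

    Hypothetical helper for illustration; not a pandas API.
    """
    out = df.copy()  # work on an explicit copy so the input stays untouched
    for column in out.columns:
        # Raw bytes values are stored in object-dtype columns.
        if out[column].dtype == object:
            # Assigning a whole column at once avoids chained indexing.
            out[column] = out[column].apply(
                lambda x: x.decode(encoding)
                if isinstance(x, (bytes, bytearray))
                else x
            )
    return out


# Made-up sample data: UTF-8 bytes, a bytearray, and a missing value.
df = pd.DataFrame(
    {"name": [b"caf\xc3\xa9", bytearray(b"bob"), None], "age": [1, 2, 3]}
)
decoded = decode_bytes_columns(df)
print(decoded)
```

Values that are not bytes or bytearray pass through unchanged, so numeric columns and missing values are left alone.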
