Handling "view vs copy" in pandas
I'm trying to decode my dataframe through the following code:
df = pd.read_sql_table('mytable', con)
for column in df.columns:
    for i in range(len(df[column])):
        if type(df[column][i]) == bytearray or type(df[column][i]) == bytes:
            df[column][i] = str(df[column][i], 'utf-8')
but I keep getting SettingWithCopyWarning no matter what I try.
Does anyone know how to deal with this warning?
UPDATE:
I ended up settling for this:
if df[column].dtype == 'object':
    df[column] = df[column].apply(lambda x: x.decode('utf-8') if isinstance(x, bytes) else x)
Thanks for the help!
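For anyone landing here later, this is a self-contained sketch of the approach from the update. The helper name decode_bytes_columns and the sample DataFrame are made up for illustration (the original reads from a SQL table). Assigning a whole column with df[column] = ... is a single assignment on the DataFrame itself, unlike the chained df[column][i] = ... in the loop above, which is the pattern SettingWithCopyWarning is meant to flag.

```python
import pandas as pd


def decode_bytes_columns(frame, encoding="utf-8"):
    """Return a copy of frame with bytes/bytearray values decoded to str.

    Hypothetical helper: only object-dtype columns are inspected, and
    values that are not bytes-like are left untouched.
    """
    out = frame.copy()
    for column in out.columns:
        if out[column].dtype == "object":
            # Whole-column assignment: no chained indexing involved.
            out[column] = out[column].apply(
                lambda x: x.decode(encoding)
                if isinstance(x, (bytes, bytearray))
                else x
            )
    return out


# Invented sample data standing in for pd.read_sql_table('mytable', con)
df = pd.DataFrame({"name": [b"caf\xc3\xa9", "plain"], "n": [1, 2]})
decoded = decode_bytes_columns(df)
print(decoded)
```

Working on a copy keeps the raw bytes available in df in case a column turns out to need a different encoding.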
Comments (1)
A few ways to improve this:
- Use the pd.Series.astype() method, which is more efficient than str() as it is vectorized (i.e. you can call it on the whole Series).
- Use .loc to avoid the SettingWithCopy warning.
So your code will look like:
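A minimal sketch of the .loc idea, not necessarily the code the answer had in mind. It assumes a small hand-built DataFrame in place of the SQL table (column names and values are invented). It also decodes explicitly rather than relying on astype(str), because depending on the pandas version astype(str) may turn bytes into their b'...' text representation instead of decoded text.

```python
import pandas as pd

# Invented sample data standing in for pd.read_sql_table('mytable', con)
df = pd.DataFrame({
    "name": [b"alice", b"bob", "carol"],
    "age": [30, 41, 25],
})

for column in df.columns:
    # Boolean mask marking the rows whose value is raw bytes.
    is_bytes = df[column].map(lambda x: isinstance(x, (bytes, bytearray)))
    if is_bytes.any():
        # A single .loc call selects rows and column together, so the
        # assignment targets df directly instead of an intermediate object.
        df.loc[is_bytes, column] = df.loc[is_bytes, column].map(
            lambda x: bytes(x).decode("utf-8")
        )

print(df)
```

The mask keeps non-bytes values (such as the plain string and the integers here) untouched.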
Note that the str type will be encoded as utf-8 in all but very old versions of Python. However, if you are using 2.x you can do df[column].astype('unicode').