How to use Polars with Plotly without converting to Pandas?
I would like to replace Pandas with Polars but I was not able to find out how to use Polars with Plotly without converting to Pandas. I wonder if there is a way to completely cut Pandas out of the process.
Consider the following test data:
import polars as pl
import numpy as np
import plotly.express as px
df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)
fig = px.bar(df, x='names', y='random')
fig.show()
I would like this code to show the bar chart in a Jupyter notebook but instead it returns an error:
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/polars/internals/frame.py:1483: UserWarning: accessing series as Attribute of a DataFrame is deprecated
warnings.warn("accessing series as Attribute of a DataFrame is deprecated")
It is possible to transform the Polars data frame to a Pandas data frame with df = df.to_pandas(). Then, it works. However, is there another, simpler and more elegant solution?
Comments (3)
Yes, no need for converting to a Pandas dataframe. Someone (sa-) has requested better support for this and included a workaround for it.
For the OP's code example, the approach of specifying the dataframe columns explicitly works.
I find that, in addition to specifying the dataframe columns with px.bar(x=df["names"], y=df["random"]) or px.bar(df, x=df["names"], y=df["random"]), casting the columns to lists can also work.
Knowing Polars better, you may see some other options once you see the idea of the workaround.
The example posted with that request is simpler: instead of px.line(df, x="a", y="b"), as you could use for a Pandas dataframe, you use px.line(x=df["a"], y=df["b"]) with Polars.
(Note that using plotly.express requires Pandas to be installed. I used plotly.express in my answer because it was closer to the OP. The code could be adapted to use plotly.graph_objects if there was a desire to not have Pandas installed and involved at all.)
FYI: plotly-express has just merged generic DataFrame support (via narwhals) in https://github.com/plotly/plotly.py/pull/4790, meaning that Polars will be natively supported, so no more conversions to Pandas under the hood (and, as you might suspect, this comes with a nice plotting performance boost when using a Polars frame).
I am currently making the switch to Polars from Pandas. From my research, indexing with [] will work but is considered an anti-pattern in Polars. The author of the article below suggests using the .to_series method instead.
https://towardsdatascience.com/visualizing-polars-dataframes-using-plotly-express-8da4357d2ee0
When it comes to visualizing a Polars dataframe, it seems you can't entirely get rid of the conversion to a Pandas dataframe.
Hope this helps.