How to use Polars with Plotly without converting to Pandas?

Posted on 2025-01-19 05:53:27

I would like to replace Pandas with Polars but I was not able to find out how to use Polars with Plotly without converting to Pandas. I wonder if there is a way to completely cut Pandas out of the process.

Consider the following test data:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

fig = px.bar(df, x='names', y='random')
fig.show()

I would like this code to show the bar chart in a Jupyter notebook, but instead it produces the following message (a UserWarning rather than a traceback):

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/polars/internals/frame.py:1483: UserWarning: accessing series as Attribute of a DataFrame is deprecated
  warnings.warn("accessing series as Attribute of a DataFrame is deprecated")

It is possible to convert the Polars DataFrame to a Pandas DataFrame with df = df.to_pandas(), and then it works. However, is there a simpler and more elegant solution?

Comments (3)

半步萧音过轻尘 2025-01-26 05:53:27

Yes, no need for converting to a Pandas dataframe. Someone (sa-) has requested supporting a better option here and included a workaround for it.

"The workaround that I use right now is px.line(x=df["a"], y=df["b"]), but it gets unwieldy if the name of the data frame is too big"

For the OP's code example, specifying the DataFrame columns explicitly works.
In addition to px.bar(x=df["names"], y=df["random"]) or px.bar(df, x=df["names"], y=df["random"]), converting the columns to lists also works:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

px.bar(df, x=list(df["names"]), y=list(df["random"]))

If you know Polars better, you may see other options once you understand the idea behind the workaround.

The example posted there is simpler: instead of px.line(df, x="a", y="b"), as you could use for a Pandas DataFrame, you use px.line(x=df["a"], y=df["b"]). With Polars, that is:

import polars as pl
import plotly.express as px

df = pl.DataFrame({"a":[1,2,3,4,5], "b":[1,4,9,16,25]})

px.line(x=df["a"], y=df["b"])

(Note that using plotly.express requires Pandas to be installed, see here and here. I used plotly.express in my answer because it was closer to the OP. The code could be adapted to using plotly.graph_objects if there was a desire to not have Pandas installed & involved at all.)

白昼 2025-01-26 05:53:27

FYI: plotly-express has just merged generic DataFrame support (via narwhals), meaning that Polars will be natively supported, so no more transforms to Pandas under the hood (and, as you might suspect, this comes with a nice plotting performance boost when using a Polars frame).

哥,最终变帅啦 2025-01-26 05:53:27

I am currently switching from pandas to pola.rs. From my research, your [] indexing will work but is considered an anti-pattern in Polars. This author suggests using the .to_series method instead.

px.pie(
    df,  # Polars DataFrame; assumes it has 'Model' and 'Sales' columns
    names=df.select('Model').to_series(),
    values=df.select('Sales').to_series(),
    hover_name=df.select('Model').to_series(),
    color_discrete_sequence=px.colors.sequential.Plasma_r,
)

https://towardsdatascience.com/visualizing-polars-dataframes-using-plotly-express-8da4357d2ee0

When it comes to visualizing Polars DataFrames, it seems you can't entirely avoid conversion to a Pandas DataFrame.

Hope this helped
