How do I change the x/y axis scale to plot outliers in a pandas DataFrame?

Posted on 2025-02-11 04:43:54

In a set of data points I am trying to graph on a scatter plot, there are a couple of huge anomalous points. For reference, most values range between 0 and 100, but occasionally there is an anomalous point of 100000. Because of this, when I graph a scatter plot, box plot, or any other plot, it zooms out so far to fit all the points that the 99% of points between 0 and 100 look like a tiny dot. Is there any way to scale the axes so that the first 99% of the points are scaled accordingly and the scale then skips to the anomalous point's value so that it still fits in the graph?

Here is how the graphs look:

[Image: box plot]

[Image: scatter plot]

Comments (3)

岁月流歌 2025-02-18 04:43:54

Use the plt.axis() function with your limits.

plt.axis([x_min, x_max, y_min, y_max])
where x_min, x_max, y_min, and y_max are the coordinate limits for both axes.
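
As a rough sketch of how those limits might be chosen from the data itself, the example below takes the 99th percentile of each column as the upper limit. The DataFrame `df`, its column names, and the random sample data are assumptions made up for illustration; only the `[x_min, x_max, y_min, y_max]` argument order comes from the answer above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 999 values between 0 and 100 plus one anomaly at 100000.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": np.append(rng.uniform(0, 100, 999), 100_000),
    "y": np.append(rng.uniform(0, 100, 999), 100_000),
})

# Use the 99th percentile of each column as the upper axis limit, so the
# view covers the bulk of the data and the anomaly falls outside it.
x_max = df["x"].quantile(0.99)
y_max = df["y"].quantile(0.99)

fig, ax = plt.subplots()
ax.scatter(df["x"], df["y"])
ax.axis([0, x_max, 0, y_max])  # same argument order as plt.axis
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Note that this only changes the visible range: the anomalous point is still in the data, it is just drawn outside the view.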

沦落红尘 2025-02-18 04:43:54

You can set the x/y axis scale to log, or simply set limits on x/y (with plt.xlim(0, 200), for example) to hide the anomalies from your chart:

import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('whitegrid')

plt.figure(figsize=(20,12))
data = [1,2,3,4,5,55,1,6,7,24,67,33,41,75,100_000,1_000_000]
plt.subplot(2,2,1)
plt.title('basic boxplot')
sns.boxplot(x=data, flierprops=dict(markerfacecolor='green', markersize=7))
plt.subplot(2,2,2)
plt.title('log x axis')
b = sns.boxplot(x=data, flierprops=dict(markerfacecolor='red', markersize=7))
b.set_xscale('log')
plt.subplot(2,2,3)
plt.title('basic scatter')
hue = ['outlier' if i > 100 else 'basic sample' for i in data]
sns.scatterplot(x=data, y=data, hue=hue)
plt.subplot(2,2,4)
plt.title('log x/y scatter')
s = sns.scatterplot(x=data, y=data, hue=hue)
s.set_xscale('log')
s.set_yscale('log')
plt.show()
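
The code above only demonstrates the log-scale option. A minimal sketch of the other option mentioned, limiting the axes, might look like the following. It uses plain matplotlib rather than seaborn to keep the example short, and the 0 to 200 range is simply the illustrative value from the text.

```python
import matplotlib.pyplot as plt

data = [1, 2, 3, 4, 5, 55, 1, 6, 7, 24, 67, 33, 41, 75, 100_000, 1_000_000]

plt.figure()
plt.title('scatter with axis limits')
plt.scatter(data, data)
# Restrict both axes to 0-200; the two large values remain in the data
# but fall outside the visible range.
plt.xlim(0, 200)
plt.ylim(0, 200)
```

Unlike the log scale, this hides the anomalies entirely, so it suits cases where only the bulk of the data needs to be readable.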


篱下浅笙歌 2025-02-18 04:43:54

The outset library can help make multiscale visualizations to account for large-magnitude outliers.

Example

Plot all the data, including outliers, and then a zoom panel showing only the non-outliers.

[Image: example with outset axes]

# adapted from https://stackoverflow.com/a/72778992/17332200
import matplotlib.pyplot as plt
import numpy as np
import outset as otst
import pandas as pd
import seaborn as sns

# Data Preparation
xdata = [1, 2, 3, 4, 5, 55, 1, 6, 7, 24, 67, 33, 41, 75, 100_000, 1_000_000]
ydata = xdata[1:] + xdata[:1]  # slightly vary from x coords for nice plot
outlier_thresh = 200

data = pd.DataFrame({"x": xdata, "y": ydata})
data["outlier"] = (data["x"] > outlier_thresh) | (data["y"] > outlier_thresh)

# Initialize an OutsetGrid object
outset_grid = otst.OutsetGrid(
    data=data,
    x="x",
    y="y",
    col="outlier",  # split plots based on outlier status
    col_order=[False],  # only magnify non-outlier data
    marqueeplot_source_kws={  # style zoom indicator elements
        "leader_stretch": 0.5,
        "leader_stretch_unit": "inches",
    },
)

outset_grid.map_dataframe(sns.scatterplot, x="x", y="y")

otst.inset_outsets(outset_grid)  # rearrange to move outset axes on top of source
outset_grid.marqueeplot()  # render marquee annotations

This two-axes grid can easily be rearranged into an inset.

[Image: example plot with inset]

Just add the following line before showing or saving:

# rearrange to move outset axes on top of source
otst.inset_outsets(outset_grid)

outset.inset_outsets provides keyword-argument options to fine-tune inset placement, if desired.

Installation

To install the outset library, run python3 -m pip install outset.

Additional Features

In addition to the data-oriented zoom area selection shown above, the library also provides an explicit API for manually specifying zoom areas, as well as many styling and layout options.

Refer to the outset quickstart guide and gallery for more info.

Disclosure: I am the library author.
