Scatter plot with a colormap that depends on cluster membership

Posted 2025-01-16 16:48:22

I'm conducting soft clustering on a data set and want to create a graphic similar to the image posted. I want to show, in graphical form, each data point's degree of membership in two (or more) clusters, but I'm not sure how to go about it. I've used criteria to assign a colour to each data point, but I'm unsure how to create the more dynamic sort of graphic seen below. Any help appreciated.

enter image description here

Comments (2)

请帮我爱他 2025-01-23 16:48:23

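I think markers are just the thing you're looking for:

import numpy as np
import matplotlib.pyplot as plt

# Centers of two synthetic clusters
x1 = y1 = 1
x2 = y2 = 2

# Random offsets in [0, 1), shared by both clusters
dx = np.random.rand(10)
dy = np.random.rand(10)

x = np.array([x1 + dx, x2 + dx]).ravel()
y = np.array([y1 + dy, y2 + dy]).ravel()

# Choose the marker by which side of x + y = threshold a point falls on
threshold = 4
markers = np.array(["o" if xy > threshold else "h" for xy in x + y])

# scatter accepts one marker per call, so draw each marker group separately
fig, ax = plt.subplots()
for marker in np.unique(markers):
    index = markers == marker
    ax.scatter(x[index], y[index], marker=marker)
plt.show()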

enter image description here

Adding some additional code to control color and transparency (alpha):

import numpy as np
import matplotlib.pyplot as plt


x1 = y1 = 1
x2 = y2 = 2

dx = np.random.rand(10)
dy = np.random.rand(10)

x = np.array([x1 + dx, x2 + dx]).ravel()
y = np.array([y1 + dy, y2 + dy]).ravel()

threshold = 4
markers = np.array(["o" if xy > threshold else "h" for xy in x + y])

blue_color = "midnightblue"  # predefined matplotlib color names
pink_color = "orchid"
colors = [blue_color if marker == "o" else pink_color for marker in markers]

# Opacity falls off with distance from the threshold:
# 1 at the boundary, 0 (invisible) for the farthest point
alphas = np.abs(x + y - threshold)
alphas = 1 - alphas / np.max(alphas)


# Draw point by point so each one gets its own marker, color and alpha
fig, ax = plt.subplots()
for i in range(len(x)):
    ax.scatter(x[i], y[i], marker=markers[i], color=colors[i], alpha=alphas[i])
plt.show()

enter image description here
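
As a possible variation on the loop above, the per-point transparency can be folded into an RGBA array so that one scatter call per marker is enough. This is only a sketch: the helper names rgba_with_alpha and scatter_by_marker are made up for illustration and are not part of matplotlib.

```python
import numpy as np
from matplotlib.colors import to_rgba_array


def rgba_with_alpha(colors, alphas):
    """Return an (n_points, 4) RGBA array with a per-point alpha channel.

    colors: one matplotlib color per point (names, hex strings, ...).
    alphas: one opacity per point; values are clipped to [0, 1].
    """
    rgba = to_rgba_array(colors)            # shape (n_points, 4)
    rgba[:, 3] = np.clip(alphas, 0.0, 1.0)  # overwrite the alpha column
    return rgba


def scatter_by_marker(ax, x, y, markers, rgba):
    """Draw one scatter call per distinct marker, keeping per-point colors."""
    markers = np.asarray(markers)
    for marker in np.unique(markers):
        index = markers == marker
        ax.scatter(x[index], y[index], marker=marker, c=rgba[index])


# Tiny hand-made example: two points with different opacity.
demo_rgba = rgba_with_alpha(["midnightblue", "orchid"], [0.25, 1.0])
```

With the arrays from the answer, this would be called as scatter_by_marker(ax, x, y, markers, rgba_with_alpha(colors, alphas)).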

萌化 2025-01-23 16:48:23

GaussianMixture in scikit-learn does something close to what the question asks.

Specifically, predict_proba(X) returns an array of shape (n_samples, n_components) holding the probability that each point in X belongs to each mixture component. In the example below we fit two components, so the two columns sum to 1 for every point and the last two plots should be complements of each other:

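from sklearn.mixture import GaussianMixture
from sklearn.datasets import make_moons
import matplotlib.pyplot as plt

# Synthetic two-moons data; the true labels are not needed
X, _ = make_moons(noise=0.05)

# Fit a two-component mixture and get per-point membership probabilities
mix = GaussianMixture(n_components=2).fit(X)
probs = mix.predict_proba(X)

fig, ax = plt.subplots(1, 3, sharey=True)
ax[0].scatter(X[:, 0], X[:, 1])                  # raw data
ax[1].scatter(X[:, 0], X[:, 1], c=probs[:, 0])   # membership in component 0
ax[2].scatter(X[:, 0], X[:, 1], c=probs[:, 1])   # membership in component 1
plt.show()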

Three scatter plots of the synthetic moons data set. Left shows the original data, middle shows probability of being in cluster 0, right is the exact opposite of the middle.
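
To get a single plot in which each point's color reflects its membership in two or more clusters, one option is to mix the cluster colors weighted by the membership probabilities. The sketch below assumes probs has one row per point and one column per cluster, as returned by predict_proba; blend_colors and plot_memberships are hypothetical helper names, and the base colors are arbitrary choices.

```python
import numpy as np
from matplotlib.colors import to_rgb


def blend_colors(probs, base_colors):
    """Mix one RGB color per point from its membership weights.

    probs: array of shape (n_points, n_clusters), each row summing to 1.
    base_colors: one matplotlib color per cluster (an arbitrary choice).
    """
    probs = np.asarray(probs, dtype=float)
    rgb = np.array([to_rgb(c) for c in base_colors])  # (n_clusters, 3)
    # Weighted average of the cluster colors; clip guards against rounding.
    return np.clip(probs @ rgb, 0.0, 1.0)


def plot_memberships(ax, X, probs, base_colors):
    """Scatter X on ax, coloring each point by its blended membership color."""
    colors = blend_colors(probs, base_colors)
    return ax.scatter(X[:, 0], X[:, 1], c=colors)


# Tiny hand-made example: three points, two clusters.
example_probs = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
example_colors = blend_colors(example_probs, ["red", "blue"])
```

With X and probs from the code above, plot_memberships(ax, X, probs, ["midnightblue", "orchid"]) would draw points that are firmly in one cluster in that cluster's color and uncertain points in an in-between shade. The same function works for more than two clusters as long as one color is given per column of probs.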
