Dimensionality reduction in self-organizing maps (SOM)

Posted 2024-12-15 06:50:35

Self-organizing maps (SOMs) are said to visualize and cluster high-dimensional data in a lower-dimensional space. I have some difficulty understanding this claim.

Consider a six-dimensional data set. The codebook (reference) vectors are then also six-dimensional, and according to the SOM algorithm these reference vectors are updated in the six-dimensional vector space as well. If we are using a two-dimensional map, how should I understand the mapping between the six-dimensional data space and the two-dimensional map space?

Comments (3)

雪花飘飘的天空 2024-12-22 06:50:35

The mapping between the N-dimensional input space and the 2D SOM space is a non-linear projection that preserves as much of the topology as possible.
This means that information about distances and angles is lost in the process, but proximity relationships between points are preserved (i.e. two points that are close to each other in the input space should also be close in the SOM space).
I got my best insight into "what does a SOM do?" by using it on the 3D RGB color space: the work of the SOM can easily be visualized in this case, which should help in grasping the concept.
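
To make the RGB experiment concrete, here is a minimal pure-Python sketch of a SOM trained on random colors. It is not taken from any particular library: the names `train_som` and `best_matching_unit`, the linear decay schedules, the Gaussian neighborhood, and the 6 x 6 grid size are all assumptions chosen for illustration. The point it tries to show is that the codebook vectors live and move in the input space (3D here, 6D in the question), while each one is attached to a fixed cell of the 2D grid, so "mapping a sample" simply means looking up the grid cell of its closest codebook vector.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose codebook vector is closest to x."""
    best, best_d2 = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d2 = sum((a - b) ** 2 for a, b in zip(x, w))
            if d2 < best_d2:
                best, best_d2 = (r, c), d2
    return best


def train_som(data, rows, cols, epochs=10, lr0=0.5, seed=0):
    """Train a rows x cols map; weights[r][c] is a vector in the input space."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate (assumed schedule)
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood (assumed schedule)
            br, bc = best_matching_unit(weights, x)  # search happens in input space
            for r in range(rows):
                for c in range(cols):
                    # The neighborhood is measured on the 2D grid, not in input space.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for k in range(dim):
                        # The update itself moves w in the input space.
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


rng = random.Random(1)
colors = [[rng.random() for _ in range(3)] for _ in range(100)]  # random RGB triples
som = train_som(colors, rows=6, cols=6)
red_cell = best_matching_unit(som, [1.0, 0.0, 0.0])  # a 2D grid coordinate
print("pure red lands on grid cell", red_cell)
```

Nothing in the sketch depends on the input having three components, so feeding it six-dimensional samples would give six-dimensional codebook vectors on the same 2D grid.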

内心激荡 2024-12-22 06:50:35

The 2D self-organizing map (SOM) distributes the input vectors onto a 2D plane. Mathematically, the SOM is a 3D array whose third dimension has the length of one input vector, i.e. the dimensionality of the data. To visualize the SOM, it is usual to compute the U-matrix. For each neuron of the SOM, the U-matrix gives the mean Euclidean distance between that neuron's weight vector and those of its neighbors.
[Figure: U-matrix]
The resulting 2D matrix allows the high-dimensional space to be visualized on a 2D plane. High values form barriers between clusters, represented as deep blue valleys in the following figure:
[Figure: U-matrix example]
This U-matrix comes from training on this 3D data set:
[Figure: 3D data set]
And here is the U-matrix in the original 3D space:
[Figure: U-matrix in the original 3D space]
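
The U-matrix computation described above can be sketched in a few lines of pure Python. The answer only says "its neighbors", so the 4-connected neighborhood used here is an assumption (hexagonal or 8-connected grids are also common), and `u_matrix` and `toy_map` are hypothetical names for illustration. The map is stored as a nested list `weights[row][col]`, each entry being one codebook vector.

```python
import math


def u_matrix(weights):
    """Mean Euclidean distance from each node's codebook vector to its grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            # 4-connected neighborhood on the grid (an assumption, see above).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1 x 1 map has no neighbors; report 0.0 in that case.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result


# A 1 x 3 toy map with 2D codebook vectors: two identical nodes and one far away.
toy_map = [[[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]]
print(u_matrix(toy_map))  # [[0.0, 2.5, 5.0]]
```

In the toy map the value rises toward the third node, which is the kind of "barrier" between groups of similar codebook vectors that the U-matrix is meant to expose.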

何必那么矫情 2024-12-22 06:50:35

You cannot understand it, but it is possible to use it, so you can try to think of it as a discrete function that maps, for example, a 4D vector space to a 1D vector. The most important point is that the function is some sort of recursion. An L-system, for example, uses recursion or repetition a lot. A better description of monster curves can be found at Nick's spatial index Hilbert curve blog.
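
As an illustration of the space-filling-curve idea this answer points to, here is a small sketch of the widely used iterative 2D Hilbert curve index. It is not a SOM and not taken from the blog mentioned above; it works in 2D rather than 4D, and `hilbert_index` is a hypothetical name. It only shows how a discrete function can send grid cells to positions on a 1D line while keeping consecutive positions adjacent on the grid.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along a Hilbert curve filling an n x n grid.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next, finer level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Visit the cells of a 4 x 4 grid in curve order.
order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda p: hilbert_index(4, *p))
print(order[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Unlike a SOM, this mapping is fixed in advance and does not adapt to the data.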
