Java MDSJ produces NaN
Does anyone have experience with MDSJ? The following input produces only NaN results and I can't figure out why. The documentation is pretty sparse.
import mdsj.Data;
import mdsj.MDSJ;

public class MDSJDemo {
    public static void main(String[] args) {
        double[][] input = {
            {78.0, 60.0, 30.0, 25.0, 24.0, 7.125, 1600.0, 1.4953271028037383, 15.0, 60.0, 0.0, 0.0, 50.0},
            {63.1578947368421, 51.81818181818182, 33.0, 30.0, 10.714285714285715, 6.402877697841727, 794.2857142857143, 0.823045267489712, 15.0, 20.0, 2.8571428571428568, 0.0, 75.0},
            {55.714285714285715, 70.0, 16.363636363636363, 27.5, 6.666666666666666, 5.742574257425742, 577.1428571428571, 0.6542056074766355, 12.857142857142856, 10.0, 17.142857142857142, 0.0, 25.0}
        };
        int n = input[0].length; // number of data objects
        double[][] output = MDSJ.classicalScaling(input); // apply MDS
        System.out.println(Data.format(output));
        for (int i = 0; i < n; i++) { // output all coordinates
            System.out.println(output[0][i] + " " + output[1][i]);
        }
    }
}
This is the output:
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
Perhaps I'm using MDS incorrectly. Each subarray of length 13 in input is intended to represent one object, yet MDSJ is returning 13 points.
It also fails for this input:
double[][] input = {
    {3, 4, 3},
    {5, 6, 1},
    {0, 1, 2}
};
EDIT: It appears that I have been using it wrong. I had been giving it an input like this:
Object A: {30d, 1d, 0d, 4.32, 234.1}
Object B: {45d, 3.21, 45, 91.2, 9.9}
Object C: {7.7, 93.1, 401, 0d, 0d}
But what it actually wants is a distance matrix like this:
    A  B  C
A   0  3  1
B   3  0  5
C   1  5  0
Not exactly, though, because for this input:
double[][] input = {
    {0, 3, 1},
    {3, 0, 5},
    {1, 5, 0}
};
I get this result:
0.8713351726043931 -2.361724203891451 2.645016918006963
NaN NaN NaN
0.8713351726043931 NaN
-2.361724203891451 NaN
2.645016918006963 NaN
But if it does want an array of distances, what is the point of using MDS in the first place? I thought it was supposed to boil an array of attributes down into coordinates.
Comments (1)
Multidimensional scaling (MDS) turns distances into coordinates. If you already have coordinates in a high-dimensional space and want them embedded optimally in a low-dimensional space, principal component analysis (PCA) is probably the technique you are looking for.
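If you still want to run an MDS routine on attribute vectors, they first have to be turned into pairwise distances. The following is a minimal sketch of that step, assuming plain Euclidean distance is a sensible dissimilarity for your attributes and that all rows have the same length. The class and method names are hypothetical and are not part of MDSJ.

```java
import java.util.Arrays;

final class DistanceMatrixSketch {

    /** Returns the symmetric matrix of pairwise Euclidean distances between the rows of objects. */
    static double[][] euclideanDistances(double[][] objects) {
        int n = objects.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < objects[i].length; k++) {
                    double diff = objects[i][k] - objects[j][k];
                    sum += diff * diff;
                }
                d[i][j] = Math.sqrt(sum);
                d[j][i] = d[i][j]; // distances are symmetric; the diagonal stays 0
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // The three attribute vectors from the question's edit (objects A, B, C).
        double[][] objects = {
            {30, 1, 0, 4.32, 234.1},
            {45, 3.21, 45, 91.2, 9.9},
            {7.7, 93.1, 401, 0, 0}
        };
        for (double[] row : euclideanDistances(objects)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For n objects this yields an n-by-n matrix (3-by-3 here), which is the shape of input the question's edit arrived at. Attributes measured on very different scales will dominate a Euclidean distance, so some normalization beforehand may be worth considering.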
Classical MDS and PCA are closely related: First, MDS converts input distances into preliminary high-dimensional coordinates (the dimension being as high as the number of objects described); second, the dimensionality of these coordinates is reduced in a PCA-like step by getting rid of the least important axes.
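The first step can be sketched with the textbook (Torgerson) formulation of classical MDS, in which the squared distances are double-centered into a Gram-like matrix B = -1/2 * J * D2 * J. This is an illustration of the general technique under the assumption of a symmetric input matrix, not a description of MDSJ's internal code; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class DoubleCenteringSketch {

    /**
     * Double-centers the squared distances: B = -1/2 * J * D2 * J,
     * where D2[i][j] = dist[i][j]^2 and J = I - (1/n) * ones.
     * Assumes dist is square and symmetric.
     */
    static double[][] doubleCenter(double[][] dist) {
        int n = dist.length;
        double[][] sq = new double[n][n];
        double[] rowMean = new double[n];
        double grandMean = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                sq[i][j] = dist[i][j] * dist[i][j];
                rowMean[i] += sq[i][j] / n;
            }
            grandMean += rowMean[i] / n;
        }
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                // With symmetric input, the mean of column j equals the mean of row j.
                b[i][j] = -0.5 * (sq[i][j] - rowMean[i] - rowMean[j] + grandMean);
            }
        }
        return b;
    }

    public static void main(String[] args) {
        // The 3x3 dissimilarity matrix from the question.
        double[][] dist = {
            {0, 3, 1},
            {3, 0, 5},
            {1, 5, 0}
        };
        for (double[] row : doubleCenter(dist)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In the textbook method, the coordinates are then the leading eigenvectors of B, each scaled by the square root of its eigenvalue. For the question's matrix the entry B[0][0] comes out negative (about -0.56), which already shows that B is not positive semidefinite and therefore has at least one negative eigenvalue.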
The point of using MDS is that in some settings the input distances are not derived from existing coordinates, but from something else, which is non-geometric, for example, dissimilarity ratings made by people.
Your 3x3 dissimilarity matrix does not obey the triangle inequality required in metric spaces (because d[1][0]+d[0][2]<d[1][2], i.e. 3+1<5) and thus cannot be embedded exactly in a Euclidean space. Technically, the NaN values in the second dimension are due to the negative second eigenvalue of the modified dissimilarity matrix.
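A quick way to catch this kind of input before calling an MDS routine is to test every triple of objects against the triangle inequality. The sketch below does that by brute force; the class and method names are hypothetical, and the comparison is exact, so for noisy floating-point distances a small tolerance may be needed.

```java
import java.util.Arrays;

final class TriangleInequalityCheck {

    /**
     * Returns {i, j, k} for the first triple with d[i][k] + d[k][j] < d[i][j],
     * or null if no triple violates the triangle inequality.
     */
    static int[] findViolation(double[][] d) {
        int n = d.length;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    if (d[i][k] + d[k][j] < d[i][j]) {
                        return new int[] {i, j, k};
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The 3x3 dissimilarity matrix from the question.
        double[][] d = {
            {0, 3, 1},
            {3, 0, 5},
            {1, 5, 0}
        };
        int[] v = findViolation(d);
        System.out.println(v == null ? "no violation" : "violation at " + Arrays.toString(v));

        // Math.sqrt returns NaN for a negative argument, which is how a negative
        // eigenvalue can turn into NaN coordinates when it is square-rooted.
        System.out.println(Double.isNaN(Math.sqrt(-1.0)));
    }
}
```

For the question's matrix the first violating triple found is i=1, j=2, k=0, which is the same inequality quoted above.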