Meaning of max and argmax for a scipy.sparse.csr.csr_matrix

Posted on 2025-01-19 20:11:34

I have this tf-idf matrix:

type(dt)  # output: scipy.sparse.csr.csr_matrix
pd.DataFrame(dt.toarray())

# output:

        0          1            2           3        4          5
0   0.000000    0.000000    0.500000    0.500000    0.5    0.50000
1   0.707107    0.707107    0.000000    0.000000    0.0    0.00000
2   0.000000    0.000000    0.000000    0.000000    0.0    0.00000
3   0.000000    0.000000    0.707107    0.707107    0.0    0.00000
4   0.000000    0.000000    0.000000    0.000000    0.0    0.00000
5   0.000000    0.000000    0.000000    0.000000    0.0    0.00000
6   0.577350    0.577350    0.000000    0.000000    0.0    0.57735
7   0.000000    0.000000    0.000000    0.000000    0.0    0.00000
8   0.000000    0.000000    0.000000    0.000000    0.0    0.00000
9   0.000000    0.000000    0.000000    0.000000    1.0    0.00000

I ran this code to understand what max and argmax return for the matrix:

test = np.dot(dt, np.transpose(dt))
test[test > 0.9999] = np.nan
ind = np.unravel_index(np.argmax(test), test.shape)
print('shape of test', test.shape)
print(f'max of test: {test.max()}')
print(f'argmax of test: {np.argmax(test)}')
print('location of max value:', ind)
print('value at the location:', test[ind])
print(pd.DataFrame(test.toarray()))

It produced this output:

shape of test (10, 10)
max of test: nan
argmax of test: 1
location of max value: (0, 1)
value at the location: 0.0
          0         1    2         3    4    5         6    7    8    9
0       NaN  0.000000  0.0  0.707107  0.0  0.0  0.288675  0.0  0.0  0.5
1  0.000000       NaN  0.0  0.000000  0.0  0.0  0.816497  0.0  0.0  0.0
2  0.000000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  0.0
3  0.707107  0.000000  0.0       NaN  0.0  0.0  0.000000  0.0  0.0  0.0
4  0.000000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  0.0
5  0.000000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  0.0
6  0.288675  0.816497  0.0  0.000000  0.0  0.0       NaN  0.0  0.0  0.0
7  0.000000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  0.0
8  0.000000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  0.0
9  0.500000  0.000000  0.0  0.000000  0.0  0.0  0.000000  0.0  0.0  NaN

I don't understand three lines of this output: max of test: nan, argmax of test: 1, and location of max value: (0, 1). I expected the max to be 0.816497 rather than nan, and I expected the location of the max value to be (6, 1) or (1, 6), where 0.816497 appears in the table, so argmax should not be 1 either.

Could someone please explain what the code for max of test, argmax of test, and location of max value actually does?

Comments (1)

维持三分热 2025-01-26 20:11:34

If ndarray.max encounters a NaN, NaN is what it returns. That's described in the documentation. You should look at np.nanmax.
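A minimal sketch of that difference, assuming a small dense array with made-up values (not the matrix from the question). For the sparse matrix in the question, a dense copy such as test.toarray() could be passed to np.nanmax in the same way.

```python
import numpy as np

# Hypothetical 2x2 array with NaN on the diagonal, for illustration only.
a = np.array([[np.nan, 0.2],
              [0.8, np.nan]])

plain_max = a.max()       # NaN propagates through the comparison
safe_max = np.nanmax(a)   # NaN entries are ignored

print(plain_max)  # nan
print(safe_max)   # 0.8
```

Masking the unwanted entries with NaN only works if every later reduction is a NaN-aware one.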

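np.argmax returns the index of the maximum value. When no axis is given, that index refers to the flattened array, which is why np.unravel_index is needed to turn it into a (row, column) pair.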
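One way to get the location the question expected is the NaN-aware counterpart np.nanargmax on a dense array. The sketch below rebuilds the tf-idf values shown in the question as a dense array; the names dt_dense, sim, loc and best are illustrative and not from the original code.

```python
import numpy as np

# Dense copy of the tf-idf values printed in the question (rounded as shown).
dt_dense = np.array([
    [0.0, 0.0, 0.5, 0.5, 0.5, 0.5],
    [0.707107, 0.707107, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.707107, 0.707107, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.57735, 0.57735, 0.0, 0.0, 0.0, 0.57735],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
])

# Pairwise dot products between rows, then mask the self-similarities.
sim = dt_dense @ dt_dense.T
sim[sim > 0.9999] = np.nan

# nanargmax ignores NaN and returns an index into the flattened array.
flat_index = np.nanargmax(sim)
loc = tuple(int(i) for i in np.unravel_index(flat_index, sim.shape))
best = float(sim[loc])

print('location of max value:', loc)
print('value at the location:', round(best, 6))
```

Because the similarity matrix is symmetric, the maximum appears at both (1, 6) and (6, 1); which of the two is reported depends on which one is found first.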
