Can I use a dictionary as a matrix in Python?
I am just a beginner in Python. I have recently been learning to use dictionaries, but my knowledge of them is still limited. An idea popped into my head, but I am not sure whether it is workable in Python.
I have 3 documents that look like this:
DOCNO= 5
nanofluids :0.6841
introduction:0.2525
module :0.0000
to :0.0000
learning :0.0000
DOCNO= 1
nanofluids :0.0000
introduction:0.2372
module :0.0000
to :0.0000
learning :0.1185
DOCNO= 12
nanofluids :0.0000
introduction:0.0000
module :0.5647
to :0.0000
learning :0.2084
I know how to store a single value in a dictionary. For example:
data={5: 0.67884, 1:0.1567, 12:3455}
But what I want to do now is store an array with its corresponding document number, like this:
import array
data={ 5:array([0.6841,0.2525,0.0000.0000,0.0000]), 1:array([0.0000,0.2372,0.0000,0.0000,0.1185]), 12:array([0.0000,0.0000,0.5647,0.0000,0.2084])}
*My Python v2.6.5 doesn't seem to let me do this.*
Assuming the above operation works, I want to perform a dot product or matrix product to find the similarity between pairs of documents. My idea is to arrange the arrays in a 3x5 matrix and multiply it by its transpose, which is 5x3. This will return a 3x3 matrix that tells me the relationship between each pair of documents. For example, take:
[ 5:[0.6841,0.2525,0.0000,0.0000,0.0000],
1:[0.0000, 0.2372,0.0000,0.0000,0.1185],
12:[0.0000,0.0000,0.5647,0.0000,0.2084] ]
and multiply it by its transpose (I am not sure how to do that); the result would be a 3x3 matrix whose rows and columns both correspond to "DOCNO".
The bottom line is that I need to be able to retrieve the DOCNO. For example, (5,1) shows the relationship between documents 5 and 1, and (1,12) shows the relationship between documents 1 and 12. I am not sure whether this is possible in Python, but any similar solution would be appreciated. Thanks for your time.
Comments (2)
First, you should look at the Python documentation for arrays. There are three things wrong with your sample code:
1. You've imported the array module, but not the array class. Try this:
from array import array
2. You've got 0.0000.0000 as a float in your list.
3. array takes two arguments: a typecode and the initialization values. Change your array([...]) calls to array('f', [...]) calls, and it should work.
But truth be told, Python doesn't have many basic tools for this built in (you can always write your own). If you're doing matrix algebra, you should probably use NumPy. It can handle both arrays and matrices, along with all the relevant transforms.
To fix your data assignment, try something like this:
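A minimal sketch of what a working assignment could look like, assuming the goal is to keep the dictionary keyed by DOCNO. It imports the array class rather than the module, separates the two zeros that were fused into 0.0000.0000, and passes a typecode. The name data comes from the question; the typecode 'f' is only one possible choice.

```python
from array import array

# Keys are the document numbers; values hold the per-term weights in the
# order nanofluids, introduction, module, to, learning.
data = {
    5: array('f', [0.6841, 0.2525, 0.0000, 0.0000, 0.0000]),
    1: array('f', [0.0000, 0.2372, 0.0000, 0.0000, 0.1185]),
    12: array('f', [0.0000, 0.0000, 0.5647, 0.0000, 0.2084]),
}

# Look up a document's weights by its DOCNO.
print(list(data[12]))
```

Note that 'f' stores single-precision floats, so the printed values may show small rounding differences; 'd' would keep double precision, and plain lists would work here as well.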
One way or another, I would use NumPy for the rest of the calculations.
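The NumPy step could be sketched roughly as follows, assuming NumPy is installed. The names docnos, weights, similarity, index and relation are hypothetical, chosen for illustration. The idea is to stack the three documents into a 3x5 matrix, multiply it by its transpose, and keep a DOCNO-to-row mapping so that a pair such as (5, 1) can be looked up by document number.

```python
import numpy as np

# Document numbers in a fixed order; row i of the matrix belongs to docnos[i].
docnos = [5, 1, 12]
weights = np.array([
    [0.6841, 0.2525, 0.0000, 0.0000, 0.0000],
    [0.0000, 0.2372, 0.0000, 0.0000, 0.1185],
    [0.0000, 0.0000, 0.5647, 0.0000, 0.2084],
])

# A 3x5 matrix times its 5x3 transpose gives a 3x3 matrix of dot products.
similarity = np.dot(weights, weights.T)

# Map each DOCNO to its row/column index so pairs can be looked up by number.
index = dict((docno, i) for i, docno in enumerate(docnos))

def relation(a, b):
    """Return the dot product of documents a and b (hypothetical helper)."""
    return similarity[index[a], index[b]]

print(relation(5, 1))
print(relation(1, 12))
```

These are raw dot products; the rows are not normalized to unit length here, so the values are not cosine similarities unless that step is added.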