How to load a cell array of strings from a Matlab .mat file into a Python list or tuple using scipy.io.loadmat
I am a Matlab user new to Python. I would like to write a cell array of strings in Matlab to a .mat file, and load this file in Python (maybe with scipy.io.loadmat) into some similar type (e.g. a list or tuple of strings). But loadmat reads things into an array, and I am not sure how to convert it into a list. I tried the "tolist" function, which does not work as I expected (I have a poor understanding of Python arrays and numpy arrays). For example:
Matlab code:
cell_of_strings = {'thank', 'you', 'very', 'much'};
save('my.mat', 'cell_of_strings');
Python code:
from scipy.io import loadmat
matdata = loadmat('my.mat', chars_as_strings=1, matlab_compatible=1)
array_of_strings = matdata['cell_of_strings']
Then, the variable array_of_strings is:
array([[[[u't' u'h' u'a' u'n' u'k']], [[u'y' u'o' u'u']],
[[u'v' u'e' u'r' u'y']], [[u'm' u'u' u'c' u'h']]]], dtype=object)
I am not sure how to convert this array_of_strings into a Python list or tuple so that it looks like
list_of_strings = ['thank', 'you', 'very', 'much'];
I am not familiar with the array object in Python or numpy. Your help will be highly appreciated.
Comments (2)
Have you tried this:
This looks like a job for a list comprehension. Repeating your example, I created the same cell array in MATLAB and saved it with save('my.mat', 'cell_of_strings', '-v7').
I'm using a newer version of MATLAB, which saves .mat files in HDF5 format by default. loadmat can't read HDF5 files, so the '-v7' flag forces MATLAB to save an older-version .mat file, which loadmat can understand.
In Python, I loaded the cell array just as you did, with loadmat and the same chars_as_strings and matlab_compatible options.
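For reference, here is a minimal sketch of that loading step that runs without MATLAB. It assumes scipy.io.savemat can stand in for MATLAB's save (it writes a numpy object array as a cell array); the temporary path and the names compat and plain are only illustrative. According to the scipy documentation, matlab_compatible=True implies chars_as_strings=False, which would explain why the strings come back as arrays of single letters even though chars_as_strings=1 was passed.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

words = ['thank', 'you', 'very', 'much']

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'my.mat')

    # Assumption: savemat stands in for MATLAB's save(..., '-v7').
    # A numpy object array is written out as a cell array.
    savemat(path, {'cell_of_strings': np.array(words, dtype=object)})

    # Same options as in the question.
    compat = loadmat(path, chars_as_strings=True,
                     matlab_compatible=True)['cell_of_strings']

    # Without matlab_compatible, chars_as_strings is allowed to take effect.
    plain = loadmat(path)['cell_of_strings']

print(compat.shape, compat.dtype)  # outer cell array
print(compat[0, 0])                # first cell, loaded with the question's options
print(plain[0, 0])                 # first cell, loaded with the defaults
```

Comparing the last two printed values shows whether dropping matlab_compatible is enough for your file; if it is, the nested letter arrays never appear in the first place.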
Printing array_of_strings gives the same nested array shown in your question. The variable array_of_strings is a (1,4) numpy object array, but there are arrays nested within each object. For example, the first element of array_of_strings is a (1,5) array containing the letters of 'thank'. To get at the first letter 't', you have to index through every level, something like array_of_strings[0][0][0][0].
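The nesting can be reproduced with plain numpy. In this sketch array_of_strings is built by hand as a hypothetical stand-in for matdata['cell_of_strings'], assuming the (1,4) object array of (1,N) letter arrays described above; first_cell and first_letter are illustrative names.

```python
import numpy as np

words = ['thank', 'you', 'very', 'much']

# Hand-built stand-in for matdata['cell_of_strings']:
# a (1, 4) object array whose cells are (1, N) arrays of single letters.
array_of_strings = np.empty((1, len(words)), dtype=object)
for i, word in enumerate(words):
    array_of_strings[0, i] = np.array([list(word)])

first_cell = array_of_strings[0][0]          # letter array for 'thank'
first_letter = array_of_strings[0][0][0][0]  # row 0 -> cell 0 -> row 0 -> letter 0

print(array_of_strings.shape)  # (1, 4)
print(first_cell.shape)        # (1, 5)
print(first_letter)            # t
```

Each pair of [0] indices strips one two-dimensional layer: the first pair selects a cell, the second pair selects a letter inside it.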
Since we are dealing with nested arrays, we need nested iteration to extract the data, i.e. nested for loops. But first, here is how to extract the first word with a list comprehension: loop through each letter in the row array_of_strings[0][0][0] and concatenate the letters using the ''.join method. The str() function converts the unicode strings into regular strings. Then, to get the list of strings you want, do the same for each array of letters in turn.
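Put together, the two comprehensions might look like the sketch below. Again array_of_strings is a hand-built stand-in for matdata['cell_of_strings'] (an assumption, so the example runs without a .mat file), and first_word and list_of_strings are illustrative names.

```python
import numpy as np

words = ['thank', 'you', 'very', 'much']

# Hand-built stand-in for matdata['cell_of_strings']:
# a (1, 4) object array whose cells are (1, N) arrays of single letters.
array_of_strings = np.empty((1, len(words)), dtype=object)
for i, word in enumerate(words):
    array_of_strings[0, i] = np.array([list(word)])

# First word: join the letters in the single row of the first cell.
first_word = ''.join(str(letter) for letter in array_of_strings[0][0][0])

# All words: repeat the join for every cell in the outer row.
list_of_strings = [''.join(str(letter) for letter in cell[0])
                   for cell in array_of_strings[0]]

print(first_word)       # thank
print(list_of_strings)  # ['thank', 'you', 'very', 'much']
```

Wrapping the result in tuple(...) gives a tuple instead of a list.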
List comprehensions take some getting used to, but they are extremely useful. Hope this helps.