How to load a cell array of strings from a Matlab .mat file into a Python list or tuple using scipy.io.loadmat

Posted on 2024-10-14 15:31:36

I am a Matlab user new to Python. I would like to write a cell array of strings in Matlab to a .mat file, and load this file using Python (maybe scipy.io.loadmat) into some similar type (e.g. a list of strings or a tuple of strings). But loadmat reads things into an array, and I am not sure how to convert it into a list. I tried the "tolist" function, which does not work as I expected (I have a poor understanding of Python arrays and numpy arrays). For example:

Matlab code:

cell_of_strings = {'thank',  'you', 'very', 'much'};
save('my.mat', 'cell_of_strings');

Python code:

from scipy.io import loadmat

matdata = loadmat('my.mat', chars_as_strings=1, matlab_compatible=1)
array_of_strings = matdata['cell_of_strings']

Then, the variable array_of_strings is:

array([[[[u't' u'h' u'a' u'n' u'k']], [[u'y' u'o' u'u']],
    [[u'v' u'e' u'r' u'y']], [[u'm' u'u' u'c' u'h']]]], dtype=object)

I am not sure how to convert this array_of_strings into a Python list or tuple so that it looks like

list_of_strings = ['thank',  'you', 'very', 'much'];

I am not familiar with the array object in Python or numpy. Your help will be highly appreciated.

Comments (2)

晚雾 2024-10-21 15:31:36

Have you tried this:

import scipy.io as si

a = si.loadmat('my.mat')
b = a['cell_of_strings']                # type(b) <type 'numpy.ndarray'>
list_of_strings  = b.tolist()           # type(list_of_strings ) <type 'list'>

print(list_of_strings)
# output as reported (Python 2): [u'thank', u'you', u'very', u'much']
放手` 2024-10-21 15:31:36

This looks like a job for a list comprehension. Repeating your example, I did this in MATLAB:

cell_of_strings = {'thank',  'you', 'very', 'much'};
save('my.mat', 'cell_of_strings','-v7'); 

I'm using a newer version of MATLAB, which in my setup saves .mat files in the HDF5-based format (v7.3) by default. loadmat can't read HDF5 files, so the '-v7' flag forces MATLAB to save an older-version .mat file, which loadmat can read.
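If you are not sure which format a given file was saved in, a rough check is to look at the text at the start of the file. This is only a sketch: the function name mat_file_flavor is made up for illustration, and the header strings it looks for are an assumption based on how MAT-files usually begin, not something taken from the loadmat API.

```python
import os
import tempfile


def mat_file_flavor(filename):
    """Guess the MAT-file flavor from the descriptive text at the start of the file.

    Assumption: v7.3 files start with 'MATLAB 7.3 MAT-file' and
    v5/v6/v7 files start with 'MATLAB 5.0 MAT-file'.
    """
    with open(filename, 'rb') as f:
        header = f.read(19)
    if header.startswith(b'MATLAB 7.3 MAT-file'):
        return 'v7.3 (HDF5-based)'
    if header.startswith(b'MATLAB 5.0 MAT-file'):
        return 'v5/v6/v7'
    return 'unknown'


# Demo on two fake files that contain only a header line (not real .mat data).
tmp_dir = tempfile.mkdtemp()
for name, text in [('new.mat', b'MATLAB 7.3 MAT-file, Platform: demo'),
                   ('old.mat', b'MATLAB 5.0 MAT-file, Platform: demo')]:
    fake_path = os.path.join(tmp_dir, name)
    with open(fake_path, 'wb') as f:
        f.write(text)
    print(name, '->', mat_file_flavor(fake_path))
```

If the check reports the HDF5-based flavor, re-saving from MATLAB with '-v7' as above is the simplest route.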

In Python, I loaded the cell array just like you did:

import scipy.io as sio
# path is the directory that contains my.mat
matdata = sio.loadmat('%s/my.mat' % path, chars_as_strings=1, matlab_compatible=1)
array_of_strings = matdata['cell_of_strings']

Printing array_of_strings gives:

[[array([[u't', u'h', u'a', u'n', u'k']], 
          dtype='<U1')
      array([[u'y', u'o', u'u']], 
          dtype='<U1')
      array([[u'v', u'e', u'r', u'y']], 
          dtype='<U1')
      array([[u'm', u'u', u'c', u'h']], 
          dtype='<U1')]]

The variable array_of_strings is a (1,4) numpy object array, and each of its elements is itself an array. For example, the first element of array_of_strings is a (1,5) array containing the letters of 'thank'. That is,

array_of_strings[0,0]
array([[u't', u'h', u'a', u'n', u'k']], 
      dtype='<U1')

To get at the first letter 't', you have to do something like:

array_of_strings[0,0][0,0]
u't'
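To experiment with this layout without a .mat file, the same nesting can be built by hand. This is only a sketch that mimics what I got from loadmat above; the array is constructed manually here, so the variable words_in and the construction loop are assumptions for illustration.

```python
import numpy as np

words_in = ['thank', 'you', 'very', 'much']

# Outer (1, 4) object array, one cell per word.
array_of_strings = np.empty((1, 4), dtype=object)
for i, word in enumerate(words_in):
    # Each cell holds a (1, n) array with one letter per element.
    array_of_strings[0, i] = np.array([list(word)])

print(array_of_strings.shape)        # (1, 4)
print(array_of_strings[0, 0].shape)  # (1, 5)
print(array_of_strings[0, 0][0, 0])  # t
```

The first index pair picks the cell and the second pair picks the letter inside that cell.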

Since we are dealing with nested arrays, we need to iterate at more than one level to extract the data, i.e. use nested for loops. But first, I'll show you how to extract the first word:

first_word = [str(''.join(letter)) for letter in array_of_strings[0][0]]
first_word
['thank']

Here I am using a list comprehension. Basically, I am looping through each row of array_of_strings[0][0] (there is only one row, holding all five letters) and concatenating its letters using the ''.join method. The str() function converts the unicode string into a regular string. Note that the result is a one-element list, not a bare string.

Now, to get the list of strings you want, we just need to loop through each array of letters:

words = [str(''.join(letter)) for letter_array in array_of_strings[0] for letter in letter_array]
words
['thank', 'you', 'very', 'much']

List comprehensions take some getting used to, but they are extremely useful. Hope this helps.
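The same idea can be wrapped in a small helper that does not care whether each cell holds separate letters or a whole string. As far as I can tell from the SciPy documentation, matlab_compatible=1 implies chars_as_strings=False, which would explain the one-letter-per-element arrays above, so both layouts can show up depending on the options. This is only a sketch: the name cell_to_list is made up, and the test arrays are built by hand to imitate loadmat output rather than read from a file.

```python
import numpy as np


def cell_to_list(cell_array):
    """Flatten a loadmat-style cell array of strings into a list of str.

    Works whether a cell holds one letter per element, e.g. a (1, 5)
    array of 't', 'h', 'a', 'n', 'k', or the whole word as one element.
    """
    return [''.join(np.asarray(cell).ravel().tolist())
            for cell in cell_array.ravel()]


words_in = ['thank', 'you', 'very', 'much']

# Layout 1: one letter per element, as in the output shown above.
per_letter = np.empty((1, 4), dtype=object)
# Layout 2: each cell holds the whole word as a single element.
whole_word = np.empty((1, 4), dtype=object)
for i, word in enumerate(words_in):
    per_letter[0, i] = np.array([list(word)])
    whole_word[0, i] = np.array([word])

print(cell_to_list(per_letter))         # ['thank', 'you', 'very', 'much']
print(cell_to_list(whole_word))         # ['thank', 'you', 'very', 'much']
print(tuple(cell_to_list(per_letter)))  # ('thank', 'you', 'very', 'much')
```

With a real file you would pass matdata['cell_of_strings'] to the helper instead of the hand-built arrays.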
