scipy.io.loadmat nested structures (i.e. dictionaries)

Posted on 11-28 17:17

Using the given routines (see "How to load Matlab .mat files with scipy"), I could not access the deeper nested structures to recover them as dictionaries.

To present the problem in more detail, here is a toy example:

import scipy.io as spio
a = {'b':{'c':{'d': 3}}}
# my dictionary: a['b']['c']['d'] = 3
spio.savemat('xy.mat',a)

Now I want to read the .mat file back into Python. I tried the following:

vig=spio.loadmat('xy.mat',squeeze_me=True)

If I now try to access the fields, I get:

>> vig['b']
array(((array(3),),), dtype=[('c', '|O8')])
>> vig['b']['c']
array(array((3,), dtype=[('d', '|O8')]), dtype=object)
>> vig['b']['c']['d']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/<ipython console> in <module>()

ValueError: field named d not found.

However, with the option struct_as_record=False the field can be accessed:

v=spio.loadmat('xy.mat',squeeze_me=True,struct_as_record=False)

Now it can be accessed with:

>> v['b'].c.d
array(3)


Comments (6)

初懵2024-12-05 17:17:50

Here are functions that reconstruct the dictionaries. Just use this loadmat instead of scipy.io's loadmat:

import scipy.io as spio

def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)

def _check_keys(d):
    '''
    checks if entries in dictionary are mat-objects. If yes
    todict is called to change them to nested dictionaries
    '''
    # Newer SciPy releases expose the same class as spio.matlab.mat_struct.
    for key in d:
        if isinstance(d[key], spio.matlab.mio5_params.mat_struct):
            d[key] = _todict(d[key])
    return d

def _todict(matobj):
    '''
    A recursive function which constructs from matobjects nested dictionaries
    '''
    # 'd' is used instead of 'dict' to avoid shadowing the built-in type.
    d = {}
    for strg in matobj._fieldnames:
        elem = matobj.__dict__[strg]
        if isinstance(elem, spio.matlab.mio5_params.mat_struct):
            d[strg] = _todict(elem)
        else:
            d[strg] = elem
    return d
不如归去2024-12-05 17:17:50

Just an enhancement to mergen's answer, which unfortunately stops recursing if it reaches a cell array of objects. The following version makes lists of them instead and continues the recursion into the cell array elements where possible.

import scipy.io as spio
import numpy as np


def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    def _check_keys(d):
        '''
        checks if entries in dictionary are mat-objects. If yes
        todict is called to change them to nested dictionaries
        '''
        for key in d:
            if isinstance(d[key], spio.matlab.mat_struct):
                d[key] = _todict(d[key])
        return d

    def _todict(matobj):
        '''
        A recursive function which constructs from matobjects nested dictionaries
        '''
        d = {}
        for strg in matobj._fieldnames:
            elem = matobj.__dict__[strg]
            if isinstance(elem, spio.matlab.mat_struct):
                d[strg] = _todict(elem)
            elif isinstance(elem, np.ndarray):
                d[strg] = _tolist(elem)
            else:
                d[strg] = elem
        return d

    def _tolist(ndarray):
        '''
        A recursive function which constructs lists from cellarrays
        (which are loaded as numpy ndarrays), recursing into the elements
        if they contain matobjects.
        '''
        elem_list = []
        for sub_elem in ndarray:
            if isinstance(sub_elem, spio.matlab.mat_struct):
                elem_list.append(_todict(sub_elem))
            elif isinstance(sub_elem, np.ndarray):
                elem_list.append(_tolist(sub_elem))
            else:
                elem_list.append(sub_elem)
        return elem_list
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)
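
To try the cell-array case, here is a condensed round-trip sketch of the same idea. The helper name to_python and the file name cells.mat are made up for illustration, and the sketch assumes that savemat writes a NumPy object array as a cell array and a dict as a struct. Unlike _tolist above, it only turns object arrays into lists and leaves numeric arrays untouched.

```python
import os
import tempfile

import numpy as np
import scipy.io as spio

try:
    from scipy.io.matlab import mat_struct  # public name in newer SciPy
except ImportError:
    from scipy.io.matlab.mio5_params import mat_struct  # older SciPy


def to_python(obj):
    """Hypothetical helper: recursively turn mat_struct objects into dicts
    and object arrays (loaded cell arrays) into lists."""
    if isinstance(obj, mat_struct):
        return {name: to_python(getattr(obj, name)) for name in obj._fieldnames}
    if isinstance(obj, np.ndarray) and obj.dtype == object and obj.ndim > 0:
        return [to_python(item) for item in obj]
    return obj


# A cell array holding two structs, next to the nested dict from the question.
cells = np.empty(2, dtype=object)
cells[0] = {'x': 1}
cells[1] = {'x': 2}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'cells.mat')
    spio.savemat(path, {'b': {'c': {'d': 3}}, 'cells': cells})
    raw = spio.loadmat(path, struct_as_record=False, squeeze_me=True)

# Skip the metadata keys that loadmat adds (they start with '__').
result = {k: to_python(v) for k, v in raw.items() if not k.startswith('__')}
print(result['b']['c']['d'], [item['x'] for item in result['cells']])
```

The version-dependent import is only there so the sketch runs on both older and newer SciPy releases.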
无语#2024-12-05 17:17:50

As of scipy 1.5.0, this functionality is built in via the simplify_cells argument.

from scipy.io import loadmat

mat_dict = loadmat(file_name, simplify_cells=True)
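
As a quick check against the toy example from the question, the round trip could look like the following sketch. The temporary directory is only there to keep the example self-contained, and the filtering of the '__' keys is my own addition to hide the metadata entries loadmat returns.

```python
import os
import tempfile

import scipy.io as spio

# Toy nested dictionary from the question.
a = {'b': {'c': {'d': 3}}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy.mat')
    spio.savemat(path, a)
    # simplify_cells=True requires scipy >= 1.5.0.
    mat_dict = spio.loadmat(path, simplify_cells=True)

# Drop the metadata keys (__header__, __version__, __globals__).
data = {k: v for k, v in mat_dict.items() if not k.startswith('__')}
print(data['b']['c']['d'])
```

Plain dictionary indexing works at every level here, so no wrapper function is needed.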
你是年少的欢喜2024-12-05 17:17:50

I was advised on the scipy mailing list (https://mail.python.org/pipermail/scipy-user/) that there are two more ways to access this data.

This works:

import scipy.io as spio
vig=spio.loadmat('xy.mat')
print(vig['b'][0, 0]['c'][0, 0]['d'][0, 0])

Output on my machine:
3

The reason for this kind of access: "For historic reasons, in Matlab everything is at least a 2D array, even scalars."
So scipy.io.loadmat mimics the Matlab behavior by default.
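
The "everything is at least 2D" point can be seen by inspecting the shapes at each level. This is a small sketch using the toy file from the question; the variable names are illustrative only.

```python
import os
import tempfile

import scipy.io as spio

a = {'b': {'c': {'d': 3}}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy.mat')
    spio.savemat(path, a)
    # Defaults: squeeze_me=False, struct_as_record=True.
    vig = spio.loadmat(path)

b = vig['b']
print(b.shape)        # shape of the outer struct array
print(b.dtype.names)  # field names of the struct

# Each level needs [0, 0] to step into the single element of a 1x1 array.
d = b[0, 0]['c'][0, 0]['d']
print(d.shape)
value = d[0, 0]
print(value)
```

Passing squeeze_me=True removes these unit dimensions, which is why the other answers do not need the [0, 0] indexing.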

撧情箌佬2024-12-05 17:17:50

Found a solution: the contents of the "scipy.io.matlab.mio5_params.mat_struct object" can be accessed via:

v['b'].__dict__['c'].__dict__['d']
℉絮湮2024-12-05 17:17:50

Another method that works:

import scipy.io as spio
vig=spio.loadmat('xy.mat',squeeze_me=True)
print(vig['b']['c'].item()['d'])

Output:

3

I learned this method on the scipy mailing list, too. I don't (yet) understand why '.item()' has to be added, or why:

print(vig['b']['c']['d'])

throws an error instead:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

I'll come back and add the explanation when I know it. The description of numpy.ndarray.item (from the NumPy reference):
Copy an element of an array to a standard Python scalar and return it.

(Please note that this answer is basically the same as hpaulj's comment on the original question, but I felt that the comment was not 'visible' or understandable enough. I certainly did not notice it when I first searched for a solution some weeks ago.)
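
A possible explanation, sketched with plain NumPy: the repr shown in the question suggests that vig['b']['c'] is a zero-dimensional array of dtype object that merely wraps the inner struct. Such a wrapper has no named fields of its own, so indexing it by field name fails, while .item() hands back the wrapped object. The arrays below are built by hand to mirror that repr; they are not loaded from a .mat file, and the names inner and wrapped are made up.

```python
import numpy as np

# Innermost struct: a 0-d record array with a single object field 'd'.
inner = np.array((3,), dtype=[('d', 'O')])

# Wrap it in a 0-d object array, like the repr of vig['b']['c'] in the question.
holder = np.empty(1, dtype=object)
holder[0] = inner
wrapped = holder.reshape(())

print(wrapped.dtype, wrapped.shape)

# The wrapper has dtype object and no fields, so a field name is not a valid index.
try:
    wrapped['d']
    failed = False
except (IndexError, ValueError):
    failed = True

# .item() returns the wrapped record array, which does have the field 'd'.
value = wrapped.item()['d'].item()
print(failed, value)
```

Whether current SciPy versions still return this wrapper with squeeze_me=True may depend on the release, so treat this as an illustration of the mechanism only.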
