Is it possible to do "stateless" programming in MATLAB / how to avoid checking data integrity?
My day-to-day workflow is something like this:
- acquire raw data (~50GB)
- parse raw data timing-information and build raw data structure (struct / object) from timing-information (what event occurred when, in which order, in what file, what other events occurred at the same time, etc ...)
- load only the necessary parts of raw data into struct / object as selected from previous timing information (basically this is a way to sub-select data)
- for each raw data chunk, calculate / extract certain metrics like RMS of signal, events where data > threshold, d' / z-score, and save them with struct / object
- given the previously calculated metrics, load some raw data of the same time episodes from a different data channel and compare certain things, etc ...
- visualize results x, y, z
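The metric step in the list above could be sketched roughly as follows. This is only an illustration: the function name chunkMetrics, its arguments, and the output field names are assumptions, and d' is left out because it needs a second (noise) distribution that the post does not describe.

```matlab
function m = chunkMetrics(x, threshold)
%CHUNKMETRICS Per-chunk metrics for one channel (hypothetical helper).
%   x         : numeric vector holding one raw data chunk
%   threshold : scalar; samples strictly above it count as events
x = double(x(:));                      % force a double column vector
m.rms      = sqrt(mean(x.^2));         % root mean square of the signal
m.eventIdx = find(x > threshold);      % sample indices where data > threshold
m.zscore   = (x - mean(x)) ./ std(x);  % per-sample z-score (NaN if the chunk is constant)
end
```

Because the function only reads its inputs and returns a new struct, it carries no hidden state between calls.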
I have two ways of dealing with this kind of data / workflow:
- use struct()
- use objects
There are certain advantages / disadvantages to both cases:
struct:
- can add properties / fields on the fly
- have to check for state of struct every single time that I pass a struct to a function
- I keep rewriting certain functions, because every time I change the struct slightly I either a) forget that a function already exists for it or b) write a new version that handles a special case of the struct state.
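The per-call state check mentioned in the struct list might look something like this. It is a sketch only; the function name assertChunkStruct and the field names data and fs are assumptions made for illustration.

```matlab
function assertChunkStruct(s)
%ASSERTCHUNKSTRUCT Error out unless s looks like a valid chunk struct.
%   The required field names 'data' and 'fs' are assumptions of this sketch.
required = {'data', 'fs'};
missing  = required(~isfield(s, required));   % isfield accepts a cell array of names
if ~isempty(missing)
    error('chunk:missingField', 'Missing field(s): %s', strjoin(missing, ', '));
end
validateattributes(s.data, {'numeric'}, {'nonempty'});           % some samples present
validateattributes(s.fs,   {'numeric'}, {'scalar', 'positive'}); % sampling rate
end
```

Every function that receives the struct would have to call this first, which is exactly the repetition the post complains about.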
objects:
- using 'get.property()' methods, I can check the state of a property before it gets accessed inside a function / method -> this allows me to do data consistency checks.
- I always know which methods work with my object, since they are part of the object definition.
- need to call clear classes every time I add a new property or method - very annoying!
Now my question is: how do other people deal with this kind of situation? How do you organize your data? In structs? In objects? How do you handle state checks? Is there a way to do 'stateless' programming in MATLAB?
Comments (1)
I like to use objects. You don't need to call clear classes on every change. It is enough to delete all instances of the "old" object.
Two very powerful additions I inherit often are handle and dynamicprops.
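As a rough illustration of what inheriting from these gives you, here is a minimal sketch. The class name Recording and its property are hypothetical; the sketch lists only dynamicprops as the superclass because dynamicprops is itself a subclass of handle.

```matlab
classdef Recording < dynamicprops
    % Recording is a hypothetical name. dynamicprops already derives from
    % handle, so instances have reference semantics as well.
    %
    % Usage sketch:
    %   r = Recording;
    %   r.data = rand(10, 1);
    %   addprop(r, 'rms');               % add a property on the fly, like a struct field
    %   r.rms = sqrt(mean(r.data.^2));
    %   q = r;  q.data = [];             % handle semantics: r.data is now empty too
    properties
        data   % raw samples
    end
end
```

This keeps the fixed, documented properties in the class definition while still allowing ad hoc fields on individual instances.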
About the consistency checks - why not do them when you use set.property?
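A minimal sketch of such a check in a property set method, assuming a hypothetical class Chunk whose data property must be a finite numeric column vector:

```matlab
classdef Chunk
    % Chunk and its property name are hypothetical, chosen for this sketch.
    properties
        data = zeros(0, 1)   % samples as a numeric column vector
    end
    methods
        function obj = set.data(obj, value)
            % Runs on every assignment to obj.data, so invalid state is
            % rejected at the point where it would be introduced.
            validateattributes(value, {'numeric'}, {'column', 'finite'});
            obj.data = value;
        end
    end
end
```

With the check at assignment time, functions that later read obj.data do not need to re-validate it.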
Edit 1:
a simplified class that uses the database:
Edit 2: Example for Event - Observer
and a listener:
The example:
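A minimal sketch of an observable property with a listener that replots on every assignment. This is not the answerer's original listing: the class name ObservedData, the properties ax and nUpdates, and the method refresh are assumptions. With it, a could be created as a = ObservedData;

```matlab
classdef ObservedData < handle
    % Hypothetical class: replots whenever its data property is assigned.
    properties (SetObservable)
        data = zeros(0, 1)
    end
    properties (SetAccess = private)
        ax             % axes the data is drawn into
        nUpdates = 0   % number of redraws so far
    end
    methods
        function obj = ObservedData()
            obj.ax = axes('Parent', figure);
            % PostSet fires after each assignment to data. The callback uses
            % evt.AffectedObject so the anonymous function does not capture obj.
            addlistener(obj, 'data', 'PostSet', ...
                @(src, evt) refresh(evt.AffectedObject));
        end
        function refresh(obj)
            plot(obj.ax, obj.data);   % one line per column of data
            drawnow;
            obj.nUpdates = obj.nUpdates + 1;
        end
    end
end
```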
You can then observe the effect when you run
a.data=rand(100,3)
The plot will change immediately.
Edit 3: a simple saving class
Example:
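A minimal sketch of what such a saving class could look like, not the answerer's original code. The class name DiskArray, the file name in the usage comments, and the variable name stored in the MAT-file are assumptions.

```matlab
classdef DiskArray < handle
    % Hypothetical class: keeps the array in a MAT-file and only the file
    % name in memory; the array is read back on each access to obj.data.
    %
    % Usage sketch:
    %   a = DiskArray('a_data.mat');
    %   a.data = rand(1000, 1000);   % written to disk, not held by a
    %   b = rand(1000, 1000);        % ordinary in-memory array
    %   d = a.data - b;              % a.data is loaded only for this expression
    %   whos a b
    properties (SetAccess = private)
        filename   % path of the MAT-file, assumed to end in .mat
    end
    properties (Dependent)
        data
    end
    methods
        function obj = DiskArray(filename)
            obj.filename = filename;
        end
        function set.data(obj, value)
            save(obj.filename, 'value');       % write the array to disk
        end
        function value = get.data(obj)
            s = load(obj.filename, 'value');   % read it back on demand
            value = s.value;
        end
    end
end
```

The trade-off is a disk read on every access to a.data, so this suits data that is touched rarely relative to its size.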
Look at whos:
Although you can do calculations like
d = a.data-b
a takes just 60 bytes in memory, as opposed to the ~8 MB of b.
Edit 4: a trick for functions that change often. When you put the logic in external commands, MATLAB will not complain when you change the function definition there.