Matlab: How to handle abnormal data files

Posted 2024-09-16 20:19:36 · 800 words · 8 views · 0 comments

I am trying to import a large number of files into Matlab for processing. A typical file would look like this:

    mass      intensity
 350.85777         238
 350.89252        3094
 350.98688        2762
 351.87899         468
 352.17712         569
 352.28449         426
Some text and numbers here, describing the experimental setup, eg  
Scan 3763 @ 81.95, contains 1000 points:

The numbers in the two columns are separated by 8 spaces. However, sometimes the experiment will go wrong and the machine will produce a datafile like this one:

mass      intensity

Some text and numbers here, describing the experimental setup, eg  
Scan 3763 @ 81.95, contains 1000 points:

I found that treating the files as space-separated with a single header row, i.e.

importdata(path_to_file,' ',  1);

works best for the normal files. However, it fails on every abnormal file. What would be the easiest way to fix this? Should I stick with importdata (I have already tried all the settings I could find, and none of them work) or should I try writing my own parser? Ideally, I would like to get the values in an Nx2 matrix for normal files and [0 0] for abnormal files.

Thanks.

他是夢罘是命 2024-09-23 20:19:36

I don't think you need to create your own parser, nor is this all that abnormal. Using textscan is your best option here.

fid = fopen('input.txt', 'rt');
data = textscan(fid, '%f %u', 'Headerlines', 1);
fclose(fid);

mass = data{1};
intensity = data{2};

Yields:

mass =
  350.8578
  350.8925
  350.9869
  351.8790
  352.1771
  352.2845

intensity =
         238
        3094
        2762
         468
         569
         426

for your first file, and:

    mass =
       Empty matrix: 0-by-1

    intensity =
       Empty matrix: 0-by-1

for your empty one.

By default, textscan treats whitespace as a delimiter, and it reads only what the format specifier describes until it can no longer match; that is why it ignores the final lines in your file. If you want to pick up those additional fields, you can run a second textscan on the same file identifier after the first one:

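fid = fopen('input.txt', 'rt');
% First pass: skip the column header, then read the mass/intensity rows.
data = textscan(fid, '%f %u', 'Headerlines', 1);

mass = data{1};
intensity = data{2};

% Second pass: skip the descriptive text line, then parse the "Scan ..." line.
% %*s and %*c skip a word or a single character; %u and %f keep a value.
data = textscan(fid, '%*s %u %*c %f %*c %*s %u %*s', 'Headerlines', 1);

scan = data{1};    % scan number
level = data{2};   % value after the "@"
points = data{3};  % number of points

fclose(fid);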

Along with your mass and intensity data, this gives:

    scan =
            3763

    level =
       81.9500

    points =
            1000
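
The question asked for a single Nx2 matrix for normal files and [0 0] for abnormal ones. A minimal sketch of one way to get there with textscan follows; it is an illustration under stated assumptions, not something tested against the real instrument files. The names sample_path, cols and values are hypothetical, and the sample file is written to a temporary location only so the snippet runs on its own. Both columns are read with %f here, because concatenating a double column with an integer column in MATLAB produces an integer array, which would round the mass values.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical data).
sample_path = [tempname '.txt'];
fid = fopen(sample_path, 'wt');
fprintf(fid, '    mass      intensity\n');
fprintf(fid, ' 350.85777         238\n');
fprintf(fid, ' 350.89252        3094\n');
fprintf(fid, 'Some text and numbers here\n');
fprintf(fid, 'Scan 3763 @ 81.95, contains 1000 points:\n');
fclose(fid);

% Read both columns as doubles so they can share one numeric matrix.
fid = fopen(sample_path, 'rt');
cols = textscan(fid, '%f %f', 'HeaderLines', 1);
fclose(fid);
delete(sample_path);

% Guard against columns of unequal length by trimming to the shorter one.
n = min(numel(cols{1}), numel(cols{2}));
if n == 0
    values = [0 0];                          % abnormal file: no numeric rows
else
    values = [cols{1}(1:n), cols{2}(1:n)];   % normal file: N-by-2 matrix
end
```

For a real batch, replacing sample_path with each file's path and dropping the file-writing lines should be enough, assuming the files follow the layout shown in the question.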

深海夜未眠 2024-09-23 20:19:36

What do you mean by 'totally fails on abnormal files'?

You can check whether importdata found any data using, e.g.,

>> imported = importdata(path_to_file,' ',  1);
>> isfield(imported, 'data')
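
Building on that check, here is a rough sketch of how it might be turned into the [0 0] fallback the question asks for. What importdata returns for a file without numeric rows is not confirmed here, so the code assumes nothing about it: it accepts the result only if it is a struct with a non-empty, two-column data field, and treats anything else (including an error) as "no data". The names sample_path and values are hypothetical, and the temporary file only makes the snippet runnable on its own.

```matlab
% Write a sample "abnormal" file: header only, no numeric rows (hypothetical data).
sample_path = [tempname '.txt'];
fid = fopen(sample_path, 'wt');
fprintf(fid, 'mass      intensity\n\n');
fprintf(fid, 'Some text and numbers here\n');
fprintf(fid, 'Scan 3763 @ 81.95, contains 1000 points:\n');
fclose(fid);

values = [0 0];   % fallback for files without usable data
try
    imported = importdata(sample_path, ' ', 1);
    % Accept the result only if it is a struct with a non-empty two-column data field.
    if isstruct(imported) && isfield(imported, 'data') ...
            && ~isempty(imported.data) && size(imported.data, 2) == 2
        values = imported.data;
    end
catch
    % Assumption: treat any import error the same as "no data found".
end
delete(sample_path);
```

The try/catch is a defensive choice: it keeps a batch loop running even if one file makes importdata throw.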
