Matlab: How to handle abnormal data files

Posted on 2024-09-16 20:19:36

I am trying to import a large number of files into Matlab for processing. A typical file would look like this:

        mass      intensity
     350.85777         238
     350.89252        3094
     350.98688        2762
     351.87899         468
     352.17712         569
     352.28449         426
    Some text and numbers here, describing the experimental setup, eg
    Scan 3763 @ 81.95, contains 1000 points:

The numbers in the two columns are separated by 8 spaces. However, sometimes the experiment goes wrong and the machine produces a data file like this one:

    mass      intensity

    Some text and numbers here, describing the experimental setup, eg
    Scan 3763 @ 81.95, contains 1000 points:

I found that treating the files as space-delimited with a single header row, i.e.

    importdata(path_to_file, ' ', 1);

works best for the normal files. However, it fails on all the abnormal files. What would be the easiest way to fix this? Should I stick with importdata (I have already tried every setting I could find, and none of them works), or should I try writing my own parser? Ideally, I would like to get the values in an Nx2 matrix for normal files and [0 0] for abnormal files.

Thanks.

Comments (2)

他是夢罘是命 2024-09-23 20:19:36

I don't think you need to create your own parser, nor is this all that abnormal. Using textscan is your best option here.

    fid = fopen('input.txt', 'rt');
    % Skip the header row, then read one double and one unsigned integer per row.
    data = textscan(fid, '%f %u', 'Headerlines', 1);
    fclose(fid);

    mass = data{1};
    intensity = data{2};

For your first file, this yields:

    mass =
      350.8578
      350.8925
      350.9869
      351.8790
      352.1771
      352.2845

    intensity =
             238
            3094
            2762
             468
             569
             426

And for your empty one:

    mass =
       Empty matrix: 0-by-1

    intensity =
       Empty matrix: 0-by-1

By default, textscan treats whitespace as the delimiter, and it reads only what the format string describes, stopping as soon as the input no longer matches; that is why it ignores the final lines in your file. If you want to pick up those additional fields as well, you can run a second textscan after the first one:

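    fid = fopen('input.txt', 'rt');

    % First pass: skip the header row and read the two numeric columns.
    data = textscan(fid, '%f %u', 'Headerlines', 1);
    mass = data{1};
    intensity = data{2};

    % Second pass: skip the free-text line, then parse
    % "Scan 3763 @ 81.95, contains 1000 points:".
    data = textscan(fid, '%*s %u %*c %f %*c %*s %u %*s', 'Headerlines', 1);
    scan = data{1};
    level = data{2};
    points = data{3};

    fclose(fid);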

Along with your mass and intensity data, this gives:

    scan =
            3763

    level =
       81.9500

    points =
            1000
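
To get from these cell arrays to the output the question asks for (an Nx2 matrix, or [0 0] for an abnormal file), one possible sketch follows. It reads both columns with %f rather than %u, because concatenating a double column with a uint32 column in MATLAB produces an integer matrix and would round the mass values. The demo files, the variable names (files, results, values) and the row-count guard are illustrative assumptions, not part of the answer above.

```matlab
% Build two small demo files in the temp folder (illustrative only).
header = '    mass      intensity';
rows   = {' 350.85777         238', ' 350.89252        3094'};
footer = {'Some text and numbers here, describing the experimental setup, eg', ...
          'Scan 3763 @ 81.95, contains 1000 points:'};

normalText   = sprintf('%s\n', header, rows{:}, footer{:});
abnormalText = sprintf('%s\n\n%s\n%s\n', header, footer{:});

files = {[tempname '_normal.txt'], [tempname '_abnormal.txt']};
texts = {normalText, abnormalText};
for k = 1:numel(files)
    fid = fopen(files{k}, 'wt');
    fprintf(fid, '%s', texts{k});
    fclose(fid);
end

% Read each file into an Nx2 double matrix, falling back to [0 0].
results = cell(size(files));
for k = 1:numel(files)
    fid = fopen(files{k}, 'rt');
    cols = textscan(fid, '%f %f', 'HeaderLines', 1);  % both columns as double
    fclose(fid);

    % Guard (assumption): keep only complete rows if the columns differ in length.
    n = min(numel(cols{1}), numel(cols{2}));
    values = [cols{1}(1:n, 1), cols{2}(1:n, 1)];

    if isempty(values)
        values = [0 0];  % abnormal file: no numeric rows were found
    end
    results{k} = values;
    delete(files{k});    % remove the demo file
end
```

Here results{1} holds the matrix for the normal demo file and results{2} the fallback for the abnormal one. In real use, the demo-file section would be dropped and files would list the actual data files.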
深海夜未眠 2024-09-23 20:19:36

What do you mean by "totally fails on abnormal files"?

You can check whether importdata found any data using, e.g.,

    >> imported = importdata(path_to_file, ' ', 1);
    >> isfield(imported, 'data')
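
Building on that check, a minimal sketch of a fallback could look like the following. It assumes, without this having been verified for every MATLAB release, that on a file with no numeric rows importdata either throws an error or returns something without a non-empty data field. The try/catch, the [0 0] fallback and the temporary demo file that path_to_file points to are additions for illustration only.

```matlab
% Create a demo "abnormal" file: header, blank line, trailing text only.
path_to_file = [tempname '.txt'];
fid = fopen(path_to_file, 'wt');
fprintf(fid, '%s\n\n%s\n%s\n', 'mass      intensity', ...
    'Some text and numbers here, describing the experimental setup, eg', ...
    'Scan 3763 @ 81.95, contains 1000 points:');
fclose(fid);

values = [0 0];  % fallback used whenever no numeric data is found
try
    imported = importdata(path_to_file, ' ', 1);
    % Only accept the result if it is a struct with a non-empty data field.
    if isstruct(imported) && isfield(imported, 'data') && ~isempty(imported.data)
        values = imported.data;
    end
catch
    % importdata could not parse the file; keep the [0 0] fallback.
end

delete(path_to_file);  % remove the demo file
```

The isstruct test is there because importdata does not always return a struct, so calling isfield alone would not distinguish all cases.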