Retrieve specific data from a text file and save it to a new file in MATLAB

Posted on 2024-12-24 20:34:12

I am trying to retrieve particular data from a txt file and would like to save it in a new txt file.

Here is a sample of the data that I would like to retrieve from the text file:

//
FORMAT CDDF1.0
DOMAIN 1cuk003
VERSION   3.1.0
VERDATE   20-Jan-2007
NAME   Ruva protein. Chain: null. Engineered: yes
SOURCE Escherichia coli. Strain: 12 bl21 (de3).  Expressed in: escherichia co
SOURCE li.
CATHCODE  1.10.8.10
CLASS  Mainly Alpha
ARCH   Orthogonal Bundle
TOPOL  Helicase, Ruva Protein; domain 3 
HOMOL  DNA helicase RuvA subunit, C-terminal domain
DLENGTH   48
DSEQH  >pdb|1cuk003
DSEQS  TDDAEQEAVARLVALGYKPQEASRMVSKIARPDASSETLIREALRAAL
NSEGMENTS 1
SEGMENT   1cuk003:1:1
SRANGE START=156  STOP=203
SLENGTH   48
SSEQH  >pdb|1cuk003:1:1
SSEQS  TDDAEQEAVARLVALGYKPQEASRMVSKIARPDASSETLIREALRAAL
ENDSEG
//

From those details, I am trying to retrieve DOMAIN and the SRANGE START and STOP values, but I only want the values themselves, not the field names. For example, the DOMAIN line reads "DOMAIN 1cuk003", and I only want "1cuk003" from it.

Do I need to store this data in an array, or is there another way to solve this problem? Also, I have over 10,000 more entries like this with different values.

The other part is that, once I have retrieved the data, I would like to format it using "sprintf", e.g. sprintf('INSERT INTO postgres VALUES %d,%d.',array1,array2);

Is this possible?

Basically, in the end, I would like to have a text file containing SQL INSERT statements for all of the stored data, so that I can easily execute them in PostgreSQL.

I wrote some test code that opens a text file, copies the data and saves it in a new text file.

fid = fopen('sample.txt');    
readfile = fread(fid, '*char');    
fclose(fid);                        
output = fopen('output_sample.txt', 'wt');
fprintf(output,'%s \n', readfile);
fclose(output);

Thank you.


Comments (1)

〆一缕阳光ご 2024-12-31 20:34:12

There are several MATLAB functions which come in handy:

  • fgetl: Read a single line from a file
  • strtok: Split a string
  • switch: Choose from different actions
  • regexp: Match regular expressions
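As a small illustration of strtok and regexp, here is a sketch that applies them to two lines copied from the sample record. The variable names (rest, tok, startPos, stopPos) are only illustrative. Two details are easy to miss: strtok leaves the leading delimiter in its second output, so strtrim is needed, and regexp with 'tokens' returns a nested cell array unless 'once' is also passed.

```matlab
% Split a sample line into its keyword and the remaining text.
[key, rest] = strtok('DOMAIN 1cuk003');
domain = strtrim(rest);   % strtok leaves the leading space in rest

% Pull both numbers out of an SRANGE line in a single call.
% With 'once', tok is a 1x2 cell array of character vectors.
tok = regexp('SRANGE START=156  STOP=203', ...
             'START=(\d+)\s+STOP=(\d+)', 'tokens', 'once');
startPos = str2double(tok{1});   % convert only if numeric values are needed
stopPos  = str2double(tok{2});
```

If the values are only going to be pasted into a text file, the str2double conversion can be skipped and the tokens printed with %s.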

Armed with these, the basic workflow looks like this:

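domain = '';
fin = fopen('sample.txt', 'r');
if fin == -1
    error('Could not open sample.txt for reading.');
end
fout = fopen('output.txt', 'w');
if fout == -1
    fclose(fin);
    error('Could not open output.txt for writing.');
end
while true
    tline = fgetl(fin); % Get the next line from the file
    if ~ischar(tline)
        % End of file
        break;
    end
    [key, value] = strtok(tline); % Split line at the first whitespace
    switch key
        case 'DOMAIN'
            % Store domain; strtok leaves the leading whitespace in value
            domain = strtrim(value);
        case 'SRANGE'
            % Retrieve start and stop values as strings.
            % 'once' makes regexp return a flat cell array of tokens.
            m = regexp(value, 'START=(\d+)\s*STOP=(\d+)', 'tokens', 'once');
            if isempty(m)
                continue; % Skip SRANGE lines that do not match
            end

            % Print result: domain as quoted text, start and stop as numbers
            fprintf(fout, 'INSERT INTO postgres VALUES (''%s'', %s, %s);\n', domain, m{1}, m{2});
    end
end
fclose(fin);
fclose(fout);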

I currently do not have access to a MATLAB installation, so the code above is not tested. It should get you going, though.
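On the question of whether the data needs to be stored in an array: one possible alternative to the line-by-line loop is to read the whole file at once, let regexp collect every record in one call, and build the statements with sprintf. The sketch below is only a rough outline under several assumptions: each record has exactly one SRANGE line after its DOMAIN line, the table is called postgres as in the question, its columns are (text, integer, integer), and inserts.sql is a made-up output name. To keep the example self-contained, txt is built from three lines of the sample record; in practice it would come from fileread('sample.txt').

```matlab
% Stand-in for: txt = fileread('sample.txt');
txt = sprintf('DOMAIN 1cuk003\nSEGMENT   1cuk003:1:1\nSRANGE START=156  STOP=203\n//\n');

% One match per record: the DOMAIN id, then the nearest SRANGE after it.
% In MATLAB, '.' matches newlines by default; '.*?' is non-greedy.
pattern = 'DOMAIN\s+(\S+).*?SRANGE\s+START=(\d+)\s+STOP=(\d+)';
tok = regexp(txt, pattern, 'tokens');   % cell array, one 1x3 cell per record

% Build one INSERT statement per record (assumed column layout).
nRec = numel(tok);
stmts = cell(nRec, 1);
for k = 1:nRec
    stmts{k} = sprintf('INSERT INTO postgres VALUES (''%s'', %s, %s);', ...
                       tok{k}{1}, tok{k}{2}, tok{k}{3});
end

% Write all statements, one per line, to a hypothetical output file.
fout = fopen('inserts.sql', 'w');
if fout == -1
    error('Could not open inserts.sql for writing.');
end
fprintf(fout, '%s\n', stmts{:});
fclose(fout);
```

With roughly 10,000 records the cell array stays small, so holding everything in memory should be fine. If a record can contain several SRANGE lines, or none at all, the line-by-line loop is the safer choice, because this pattern only pairs each DOMAIN with the first SRANGE that follows it.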
