How to process/store multiple lines into a single field read from a file in Perl?
I am trying to process a text file in Perl. I need to store the data from the file into a database.
The problem that I'm having is that some fields contain a newline, which throws me off a bit.
What would be the best way to contain these fields?
Example data.txt file:
ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011
The current (broken) Perl script (example):
#!/usr/bin/perl -w
use strict;
open (MYFILE, 'data.txt');
while (<MYFILE>) {
    chomp;
    my ($id, $title, $description, $date) = split(/\|/);
    if ($id ne 'ID') {
        # processing certain fields (...)
        # insert into the database (example)
        $sqlInsert->execute($id, $title, $description, $date);
    }
}
close (MYFILE);
As you can see from the example, the record with ID 2 is split across several lines, which causes errors when the script references the resulting undefined variables. How would you group those lines into the correct field?
Thanks in advance! (I hope the question was clear enough, difficult to define the title)
Comments (3)
I would just count the number of separators before splitting the line. If you don't have enough, read the next line and append it. The tr operator is an efficient way to count characters.
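A minimal sketch of the separator-counting idea, assuming every complete record has exactly three pipes and that no field contains a literal pipe. The helper name `read_record` and the in-memory sample data are illustrative assumptions, not part of the original answer:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pipes expected in one complete record (4 fields => 3 separators).
my $EXPECTED_SEPARATORS = 3;

# Hypothetical helper: returns the next complete record from $fh as an
# array ref of fields, or undef at end of file.
sub read_record {
    my ($fh) = @_;
    my $record = <$fh>;
    return unless defined $record;

    # tr/|// with an empty replacement list counts the pipes
    # without modifying the string.
    while (($record =~ tr/|//) < $EXPECTED_SEPARATORS) {
        my $next = <$fh>;
        last unless defined $next;    # truncated final record
        $record .= $next;
    }
    chomp $record;
    return [ split /\|/, $record, -1 ];
}

# Sample data from the question, read through an in-memory filehandle.
my $data = <<'END';
ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records;
while (my $fields = read_record($fh)) {
    next if $fields->[0] eq 'ID';    # skip the header row
    push @records, $fields;
}
close $fh;

printf "%s: %s\n", $_->[0], $_->[3] for @records;
```

With a real file, the in-memory handle would be replaced by `open my $fh, '<', 'data.txt'`, and each array ref could be passed on to the `execute` call from the question. Blank lines in the input would be merged into the following record, so they would need separate handling.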
Read the next line until the number of fields is what you need. Something like this (I haven't tested this code):
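One possible shape for this approach, as a sketch rather than the answerer's own code: split each line, and keep appending lines until the split yields four fields. The name `parse_records` and the constant `$FIELDS_PER_RECORD` are assumptions made for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $FIELDS_PER_RECORD = 4;    # ID, Title, Description, Date

# Hypothetical helper: reads every record from $fh and returns a list of
# array refs, one per record, with the header row skipped.
sub parse_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        # The -1 limit keeps trailing empty fields so the count is reliable.
        my @fields = split /\|/, $line, -1;

        # Keep pulling lines in until the record has all of its fields.
        while (@fields < $FIELDS_PER_RECORD and not eof $fh) {
            $line .= <$fh>;
            @fields = split /\|/, $line, -1;
        }
        chomp $fields[-1];    # drop the newline that ends the record
        push @records, [@fields] unless $fields[0] eq 'ID';
    }
    return @records;
}

# Sample data from the question, read through an in-memory filehandle.
my $data = <<'END';
ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = parse_records($fh);
close $fh;

print join(' / ', @$_), "\n" for @records;
```

Re-splitting after every appended line is simple but repeats work on long records. Counting separators instead avoids that, at the cost of a slightly less direct loop condition.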
If you could change your data.txt file to include the pipe separator as the last character in every line/record, you could slurp in the whole file, splitting directly into the raw fields. This code would then do what you want:
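A sketch of what such code might look like, assuming every record (including the header) ends with a trailing pipe and has exactly four fields. The helper name `split_records` and the modified sample data are illustrative assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $FIELDS_PER_RECORD = 4;    # ID, Title, Description, Date

# Hypothetical helper: takes the whole file as one string, in which every
# record ends with '|', and returns a list of array refs (header skipped).
sub split_records {
    my ($content) = @_;

    # Split on pipes only, so newlines inside a field are left alone.
    my @fields = split /\|/, $content, -1;

    my @records;
    while (@fields >= $FIELDS_PER_RECORD) {
        my @record = splice @fields, 0, $FIELDS_PER_RECORD;
        # The newline after the previous record's final pipe ends up at
        # the start of this record's first field; remove it.
        $record[0] =~ s/\A\n//;
        push @records, \@record unless $record[0] eq 'ID';
    }
    return @records;    # a leftover partial group (the final "\n") is ignored
}

# The question's sample data with a trailing pipe added to every record.
my $data = <<'END';
ID|Title|Description|Date|
1|Example 1|Example Description|10/11/2011|
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011|
3|Example 3|Short description|10/13/2011|
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $content = do { local $/; <$fh> };    # undefined $/ slurps the whole file
close $fh;

my @records = split_records($content);
print join(' / ', @$_), "\n" for @records;
```

Slurping keeps the parsing logic short, but it holds the whole file in memory, and it only works if the export can be changed to emit the trailing pipe.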