Parsing multiple lines in a Perl regex and extracting a value
I am a beginner in Perl. I have a text file with content similar to the sample below, and I need to extract VALUE="<NEEDED VALUE>". For example, for SPINACH I should get SALAD alone.

    $ cat check.txt
    #ifonly APPLE CARROT SPINACH
    VALUE="SALAD" REQUIRED="yes"
    QW RETEWRT OIOUR
    #endifonly
    #ifonly APPLE MANGO ORANGE CARROT
    VALUE="JUICE" REQUIRED="yes"
    as df fg
    #endifonly

How can I use a Perl regex to get the value? I need to match across multiple lines, i.e. between each #ifonly --- #endifonly pair. This is what I have tried:

    while (<$file>)
    {
    if (m/#ifonly .+ SPINACH .+ VALUE=(")([\w]*)(") .+ #endifonly/g)
    {
    my $chosen = $2;
    }
    }
This uses a small trick described by brian d foy. As his write-up explains, it relies on the scalar range operator (..), also known as the flip-flop operator.
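The code from this answer did not survive on this page, so here is a minimal sketch of the idea rather than the answerer's exact script. It assumes the sample data from the question (held in a string so the sketch runs on its own) and hard-codes SPINACH as the keyword. The trick is that the flip-flop returns a sequence number, with "E0" appended on the last line of the range, which lets you skip the two delimiter lines.

```perl
use strict;
use warnings;

# Sample input from the question, kept in a string so the sketch is
# self-contained; a real script would open check.txt instead.
my $data = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $chosen;
while (my $line = <$fh>) {
    # In scalar context, .. is the flip-flop: it becomes true on a line
    # matching the left test and stays true through the line matching the
    # right test. Its value is a sequence number; the final one ends in "E0".
    my $seq = ($line =~ /^#ifonly\b.*\bSPINACH\b/) .. ($line =~ /^#endifonly\b/);
    next unless $seq;                       # not inside a SPINACH block
    next if $seq == 1 or $seq =~ /E0\z/;    # skip the two delimiter lines
    $chosen = $1 if $line =~ /VALUE="(\w*)"/;
}
close $fh;

print "$chosen\n";    # prints SALAD
```

The loop deliberately runs to the end of the input instead of leaving early, so the flip-flop is back in its "off" state when the loop finishes.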
If your file is very big (or you want to read it line by line for some other reason), you can process it one line at a time.
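The original code for this answer is missing here, so the following is only a sketch of one way to do the line-by-line scan. It assumes an explicit flag that remembers whether the current line sits inside a matching block; value_for is a hypothetical helper name, and the sample data is kept in a string so the example runs as-is.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a filehandle line by line and return the VALUE
# of the first #ifonly block whose header names $wanted (undef if none).
sub value_for {
    my ($fh, $wanted) = @_;
    my $in_block = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^#ifonly\b(.*)/) {
            # Enter the block only if the header lists the wanted keyword.
            my $header = $1;
            $in_block = $header =~ /\b\Q$wanted\E\b/ ? 1 : 0;
        }
        elsif ($line =~ /^#endifonly\b/) {
            $in_block = 0;
        }
        elsif ($in_block and $line =~ /\bVALUE="(\w*)"/) {
            return $1;    # stop reading as soon as the value is found
        }
    }
    return;
}

# Sample input from the question; a real script would open check.txt.
my $data = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $chosen = value_for($fh, 'SPINACH');
close $fh;

print "$chosen\n";    # prints SALAD
```

Only one line is held in memory at a time, and the helper returns as soon as it has the value, so the rest of a large file is never read.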
You can read the file contents into a string and then search that string for the pattern.
Your original regex needs some tweaking: the quantifiers should be non-greedy, and it needs the s modifier so that . also matches newlines.
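The linked code is not available on this page, so here is a sketch of what those two tweaks could look like, not the answerer's exact regex. It assumes the file has already been read into a string, and value_in_string is a hypothetical helper name. Restricting the header part to a single line with [^\n]* is an extra assumption of this sketch, added so the keyword is only looked for on the #ifonly line itself.

```perl
use strict;
use warnings;

# Hypothetical helper: search the whole text at once and return the VALUE
# of the first #ifonly block whose header names $wanted (undef if none).
sub value_in_string {
    my ($content, $wanted) = @_;
    # .+? is non-greedy, so each part stops at the nearest possible match;
    # the /s modifier lets . match newlines, so the pattern can span lines.
    if ($content =~ /#ifonly[^\n]*\b\Q$wanted\E\b.+?VALUE="(\w*)".+?#endifonly/s) {
        return $1;
    }
    return;
}

# Sample input from the question, standing in for the slurped file.
my $content = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

my $chosen = value_in_string($content, 'SPINACH');
print "$chosen\n";    # prints SALAD
```

One caveat: if a matching block contained no VALUE line at all, the non-greedy .+? would keep going and pick up the VALUE of a later block, so this approach relies on every block being well formed.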
Here's another answer based on the flip-flop operator.
This solution applies the second test to every line in the range. The trick @Hugmeir used to exclude the start and end lines isn't needed, because the "inner" regex, /^VALUE="(\w+)"/, can never match them anyway (I added the ^ anchor to all the regexes to make doubly sure of that).
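The code block for this answer is also missing, so what follows is a sketch under the same assumptions as before: the question's sample data in a string and SPINACH hard-coded. The inner regex is the one quoted in the answer; the two range regexes are guesses at what the anchored versions might look like.

```perl
use strict;
use warnings;

# Sample input from the question; a real script would open check.txt.
my $data = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $chosen;
while (<$fh>) {
    # The flip-flop is true from a SPINACH header through its #endifonly,
    # delimiter lines included.
    if (/^#ifonly\b.*\bSPINACH\b/ .. /^#endifonly\b/) {
        # Delimiter lines start with "#", so this anchored test can never
        # match them and no extra filtering is needed.
        $chosen = $1 if /^VALUE="(\w+)"/;
    }
}
close $fh;

print "$chosen\n";    # prints SALAD
```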
The two lines in an answer given two days ago that read the file with <> and append each line with .= are not very efficient. Perl will likely read the file in big chunks, break those chunks into lines of text for the <>, and then the .= will join those lines back together into one big string. It would be more efficient to slurp the file. The basic approach is to alter $/, the input record separator. The File::Slurp module (see perldoc File::Slurp) may be even better.
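As a rough illustration of the $/ approach, here is a small sketch. The slurp helper name is made up for this example, and the question's sample data in a string stands in for the real file. File::Slurp is not shown because it is a CPAN module rather than part of core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything left on a filehandle as one string.
sub slurp {
    my ($fh) = @_;
    # With $/ undefined, a single read returns the rest of the file.
    # "local" restores the previous separator when the sub returns.
    local $/;
    return scalar <$fh>;
}

# Sample input from the question; a real script would open check.txt.
my $data = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $content = slurp($fh);
close $fh;

print length($content), " characters read in one go\n";
```

Using local keeps the change to $/ confined to the helper, so any line-by-line reads elsewhere in the script keep working as usual.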