Is there a simpler way to extract this data?

Posted on 2024-07-25 08:02:50

my $changelog = "/etc/webmin/Pserver_Panel/changelog.cgi";
my $Milestone;
open (PREFS, $changelog);
    while (<PREFS>)
    {
        if ($_ =~ m/^<h1>(.*)[ ]Milestone.*$/g) {
            $Milestone=$1;
            last;
        }
    }
close(PREFS);

Here is an example of the data it is extracting from:

<h1>1.77 Milestone</h1>
    <h3>    6/26/2009       </h3><ul style="margin-top:0px">
        <li type=circle>    Standard code house cleaning and added better compatbility for apache conversion.
    </ul>
    <h3>    6/21/2009       </h3><ul style="margin-top:0px">
        <li type=square>    Fixed Autofix so that it extracts to the right directory.
    </ul>
    <h3>    6/11/2009       </h3><ul style="margin-top:0px">
        <li type=circle>    Updated FTP link on index page to go to net2ftp, an online ftp file manager.
    </ul>
<h1>1.76 Milestone</h1>
    <h3>    4/14/2009       </h3><ul style="margin-top:0px">
        <li type=square>    Corrected a broken hyperlink to regular expressions in "View Chat Log"
        <li type=circle>    Changed the default number of lines back from 25 to 10 on both Chat and Pserver Logs.
        <li type=circle>    Noted in "View Pserver Log" search is case-sensitive and regular expression supported.
    </ul>
    <h3>    4/13/2009       </h3><ul style="margin-top:0px">
        <li type=disc>      Added AutoFix to the panel which will automatically fix prop errors.
        <li type=circle>    Updated error display to allow more detailed errors.
    </ul>
    <h3>    4/12/2009       </h3><ul style="margin-top:0px">
        <li type=circle>    Fixed start/stop/restart to be more reliable.
    </ul>

Comments (3)

哆兒滾 2024-08-01 08:02:50

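Next, you are going to have to parse the items between milestones. Do yourself a favor: stop worrying about lines of code and use an HTML parser, such as HTML::TokeParser: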

#!/usr/bin/perl

use strict;
use warnings;

use HTML::TokeParser;

my $parser = HTML::TokeParser->new( \*DATA );

while ( my $token = $parser->get_token ) {
    if ( $token->[0] eq 'S' ) {
        if ( $token->[1] eq 'h1') {
            my ($milestone) = split ' ', $parser->get_text('/h1');
            print "Milestone is '$milestone'\n";
        }
    }
}

__DATA__
<h1>1.77 Milestone</h1>
...

C:\Temp> vbn
Milestone is '1.77'
Milestone is '1.76'
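
To go one step further and collect the dated items under each milestone, something along these lines might work. It is only a sketch: the sub name `parse_changelog`, the hash keys, and the shortened sample in `$html` are made up for illustration, and it assumes the changelog keeps the `<h1>` / `<h3>` / `<li>` layout shown in the question.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use HTML::TokeParser;

# Shortened sample in the same layout as the question's changelog.
my $html = <<'HTML';
<h1>1.77 Milestone</h1>
    <h3>    6/21/2009       </h3><ul style="margin-top:0px">
        <li type=square>    Fixed Autofix so that it extracts to the right directory.
    </ul>
<h1>1.76 Milestone</h1>
    <h3>    4/13/2009       </h3><ul style="margin-top:0px">
        <li type=disc>      Added AutoFix to the panel which will automatically fix prop errors.
        <li type=circle>    Updated error display to allow more detailed errors.
    </ul>
HTML

# Hypothetical helper: returns an array ref of
# { version => ..., entries => [ { date => ..., items => [...] }, ... ] }.
sub parse_changelog {
    my ($source) = @_;
    my $parser = HTML::TokeParser->new( \$source );
    my ( @milestones, $milestone, $entry );

    # Only the start tags we care about; everything else is skipped.
    while ( my $tag = $parser->get_tag( 'h1', 'h3', 'li' ) ) {
        my $name = $tag->[0];
        if ( $name eq 'h1' ) {
            # First word of the heading is the version number.
            my ($version) = split ' ', $parser->get_trimmed_text('/h1');
            $milestone = { version => $version, entries => [] };
            $entry     = undef;    # dates never carry over between milestones
            push @milestones, $milestone;
        }
        elsif ( $name eq 'h3' and $milestone ) {
            $entry = { date => $parser->get_trimmed_text('/h3'), items => [] };
            push @{ $milestone->{entries} }, $entry;
        }
        elsif ( $name eq 'li' and $entry ) {
            # The <li> tags are never closed, so stop at the next <li> or </ul>.
            push @{ $entry->{items} }, $parser->get_trimmed_text( 'li', '/ul' );
        }
    }
    return \@milestones;
}

for my $m ( @{ parse_changelog($html) } ) {
    print "Milestone $m->{version}\n";
    for my $e ( @{ $m->{entries} } ) {
        print "  $e->{date}\n";
        print "    - $_\n" for @{ $e->{items} };
    }
}
```

To read the real file instead of the sample, passing the path to `HTML::TokeParser->new` in place of the string reference should work, since the constructor also accepts a file name.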

深海夜未眠 2024-08-01 08:02:50

If you want a one-liner, how about this one?

perl -nle 'if (/^<h1>(.*)[ ]Milestone.*$/g){ print $1; last }' /etc/webmin/Pserver_Panel/changelog.cgi

遗失的美好 2024-08-01 08:02:50

Another one-liner to grab it from the command line:

perl -ne'print $1 and exit if /<h1>(.*?)\s+Milestone/' /etc/webmin/Pserver_Panel/changelog.cgi
