Using a string mask in Perl

Posted 2024-10-14 16:58:46

I have a program that allows a user to specify a mask such as MM-DD-YYYY, and compare it to a string. In the string, the MM will be assumed to be a month, DD will be the day of the month, and YYYY will be the year. Everything else must match exactly:

  • String: 12/31/2010 Mask MM-DD-YYYY: Fail: Must use slashes and not dashes
  • String: 12/31/2010 Mask DD/MM/YYYY: Fail: Month must be second and there's no month 31.
  • String: 12/31-11 Mask: MM/DD-YY: Pass: String matches mask.

Right now, I use index and substr to pull out the month, day, and year, then I use xor to generate a mask for everything else. It seems a bit inelegant, and I was wondering if there's a better way of doing this:

my $self = shift;
my $date = shift;

my $format = $self->Format();

my $month;
my $year;
my $day;

my $monthIndex;
my $yearIndex;
my $dayIndex;

#
# Pull out Month, Day, and Year
#
if (($monthIndex = index($format, "MM")) != -1) {
    $month = substr($date, $monthIndex, 2);
}

if (($dayIndex = index($format, "DD")) != -1) {
    $day = substr($date, $dayIndex, 2);
}

if (($yearIndex = index($format, "YYYY")) != -1) {
    $year = substr($date, $yearIndex, 4);
}
elsif (($yearIndex = index($format, "YY")) != -1) {
    $year = substr($date, $yearIndex, 2);
    if ($year < 50) {
        $year += 2000;
    }
    else {
        $year += 1900;
    }
}

#
# Validate the Rest of Format
#

(my $restOfFormat = $format) =~ s/[MDY]/./g;    #Month Day and Year can be anything
if ($date !~ /^$restOfFormat$/) {
    return; #Does not match format
}
[...More Stuff before I return a true value...]

I'm doing this for dates, times (using HH, MM, SS, and A/AA), and IP addresses in my code.


BTW, I tried using regular expressions to pull the date from the string, but it's even messier:

#-----------------------------------------------------------------------
# FIND MONTH
#
    my $mask = "M" x length($format);  #All M's the length of format string

    my $monthMask = ($format ^ $mask);      #Bytes w/ "M" will be "NULL"
    $monthMask =~ s/\x00/\xFF/g;    #Change Null bytes to "FF"
    $monthMask =~ s/[^\xFF]/\x00/g; #Null out other bytes

    #
    #   ####Mask created! Apply mask to Date String
    #

    $month = ($monthMask & $date);  #Nulls or Month Value
    $month =~ s/\x00//g;            #Remove Null bytes from string
#
#-----------------------------------------------------------------------

It's a neat programming trick, but it is hard to see exactly what it does, which would make it hard for someone else to maintain.
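To make the trick above easier to follow, here is a small self-contained trace of it. The mask and date values are taken from the third example in the question; the `printf` of the mask bytes is added purely for illustration. Note that `^` and `&` only act byte-by-byte like this when both operands are strings, which is assumed here.

```perl
use strict;
use warnings;

my $format = 'MM/DD-YY';
my $date   = '12/31-11';

# XOR each byte of the format with "M": only the "M" positions become NUL.
my $monthMask = $format ^ ('M' x length $format);
$monthMask =~ s/\x00/\xFF/g;       # "M" positions -> all bits set
$monthMask =~ s/[^\xFF]/\x00/g;    # every other position -> all bits clear

# AND keeps the date bytes under the \xFF positions and zeroes the rest.
(my $month = $monthMask & $date) =~ s/\x00//g;

printf "mask bytes: %s\n",
    join ' ', map { sprintf('%02X', ord($_)) } split //, $monthMask;
print "month: $month\n";
# prints:
#   mask bytes: FF FF 00 00 00 00 00 00
#   month: 12
```

Seeing the intermediate bytes makes the intent clear, but it also shows why this is harder to maintain than a plain `substr` or a capturing regex.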

ㄖ落Θ余辉 2024-10-21 16:58:46

Another option could be to rewrite your pattern into a strftime/strptime pattern and test with those functions. I am using the versions included in the core Time::Piece module.

use Time::Piece;

test('12/31/2010' => 'MM-DD-YYYY');
test('12/31/2010' => 'DD/MM/YYYY');
test('12/31-11'   => 'MM/DD-YY');

sub test {
    my ($time, $mask) = @_;
    my $t = eval { Time::Piece->strptime($time, make_format_from($mask)) };
    print "String: $time  Mask: $mask  "
      . (defined $t ? "Pass: ".$t->ymd : "Fail"), "\n";
}

sub make_format_from {
    my $mask = shift;
    for($mask) {
        s/YYYY/%Y/;
        s/YY/%y/;
        s/MM/%m/;
        s/DD/%d/;
    }
    return $mask;
}

This code yields

String: 12/31/2010  Mask: MM-DD-YYYY  Fail
String: 12/31/2010  Mask: DD/MM/YYYY  Fail
String: 12/31-11  Mask: MM/DD-YY  Pass: 2011-12-31
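Since the question also needs the month, day, and year themselves, a possible follow-up is to read them back from the parsed object. This is only a sketch: `parse_with_format` is a hypothetical helper name, while `year`, `mon`, and `mday` are Time::Piece accessors. The century chosen for `%y` is decided by `strptime` and may not match the 50-year pivot in the question's code.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical wrapper: returns (year, month, day) on success,
# or an empty list if strptime rejects the string.
sub parse_with_format {
    my ($string, $strptime_format) = @_;
    my $t = eval { Time::Piece->strptime($string, $strptime_format) };
    return unless defined $t;
    return ($t->year, $t->mon, $t->mday);
}

my ($year, $month, $day) = parse_with_format('12/31-11', '%m/%d-%y');
print "year=$year month=$month day=$day\n" if defined $year;
```

It is worth checking on your own Perl version whether out-of-range days such as 02/31 are rejected or silently normalized before relying on this for validation.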

萧瑟寒风 2024-10-21 16:58:46

You are already using some regular expressions in this code. Why not convert the user's mask into a pattern and use a regular expression to validate the input directly? Say,

$mask =~ s/YYYY/\\d{4}/;
# or:  $mask =~ s/YYYY/[12][0-9]{3}/
$mask =~ s/MM/(0[1-9]|1[0-2])/;               # MM => 01 - 12
$mask =~ s/DD/(0[1-9]|[12][0-9]|3[01])/;      # DD => 01 - 31
$mask =~ s/YY/\\d{2}/;                        # YY => 00 - 99
$mask = '^' . $mask . '$';                    # anchor the pattern at both ends

So for example, this would compile the user mask MM-DD-YY into the pattern
^(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-\d{2}$, which you could test with:

if ($input =~ qr/$mask/) {
    print "Input is valid\n";
} else {
    print "Input is invalid\n";
}
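One caveat with substituting directly into the mask is that literal characters with a regex meaning (for example the dots in an IP-address mask) are not escaped. A possible variation, offered as a sketch rather than the answer's own method, splits the mask into tokens, escapes everything else with `quotemeta`, and uses named captures so the same match also extracts the values. The names `%token_pattern` and `mask_to_regex` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical token table: mask token => regex fragment with a named capture.
my %token_pattern = (
    YYYY => '(?<year>\d{4})',
    YY   => '(?<year>\d{2})',
    MM   => '(?<month>0[1-9]|1[0-2])',
    DD   => '(?<day>0[1-9]|[12][0-9]|3[01])',
);

# Hypothetical helper: compile a user mask such as "MM/DD-YY" into a regex.
sub mask_to_regex {
    my ($mask) = @_;
    # split keeps the tokens because of the capture group;
    # YYYY is listed before YY so the longer token wins.
    my $pattern = join '', map {
        exists $token_pattern{$_} ? $token_pattern{$_} : quotemeta($_)
    } split /(YYYY|YY|MM|DD)/, $mask;
    return qr/\A$pattern\z/;
}

my $re = mask_to_regex('MM/DD-YY');
if ('12/31-11' =~ $re) {
    print "month=$+{month} day=$+{day} year=$+{year}\n";
}
# prints: month=12 day=31 year=11
```

The two-digit year still has to be expanded to a full year afterwards, and a day such as 31 is accepted for every month, so a separate range check is still needed.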

耀眼的星火 2024-10-21 16:58:46

You can simplify by using regular expressions:

For MM/DD-YY:

die "Wrong format" unless $date =~ /([01][0-9])\/([0-3][0-9])-([0-9][0-9])/;

If it matches, the parentheses capture the different parts, which can be referred to as $1, $2, etc. Then use those variables for further testing, e.g. whether the month is in the range [1,12].

By the way, this pattern is not Y2K-compatible, since it only accepts a two-digit year.
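The "further testing" step could look something like the following sketch. `check_mm_dd_yy` is a hypothetical helper; unlike the one-liner above it anchors the pattern at both ends, and it borrows the 50-year pivot from the question's code to expand the two-digit year, which is an assumption rather than part of this answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (year, month, day) for a valid MM/DD-YY
# string, or an empty list if the format or the ranges are wrong.
sub check_mm_dd_yy {
    my ($date) = @_;
    my ($month, $day, $yy) = $date =~ m{\A(\d\d)/(\d\d)-(\d\d)\z}
        or return;

    # Two-digit year pivot copied from the question's code (assumption).
    my $year = $yy < 50 ? 2000 + $yy : 1900 + $yy;

    return if $month < 1 || $month > 12;

    my @days_in_month = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    $days_in_month[1] = 29
        if ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    return if $day < 1 || $day > $days_in_month[$month - 1];

    return ($year, $month + 0, $day + 0);
}

my @parts = check_mm_dd_yy('12/31-11');
print @parts ? "Valid: @parts\n" : "Invalid\n";
# prints: Valid: 2011 12 31
```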
