What is the regular expression to extract a date mask in Perl?

Posted 2024-12-20 13:28:55

I have a string in Perl that contains a directory specification. If the string contains any individual or combination of substrings that comprise a date mask, I want to extract that substring. For example, the directory spec may be:

/mydir/data/YYYYMMDD

I want to be able to extract the "YYYYMMDD" string. However, that portion of the path could be any one of the following strings, or any combination of them:

YY
YYYY
MM
DD

So the directory spec string could read:

   /mydir/data/DD/data2

and I want the "DD" returned as a result of the regex comparison. How do I capture the string when it must contain one or more of those date mask strings and that string must be between two "/" characters or exist at the end of the string?

Comments (4)

一刻暧昧 2024-12-27 13:28:55

I'm making the assumption that YYYY and YY shall not both appear in the same pattern, because otherwise it does not make sense.

use Data::Munge qw(list2re);
use List::MoreUtils qw(uniq);
use Algorithm::Combinatorics qw(variations);
use Perl6::Take qw(gather take);

list2re
uniq
gather {
    for my $n ([qw(YYYY MM DD)], [qw(YY MM DD)]) {
        for my $k (1..scalar @$n) {
            take map { join q(), @$_ } variations($n, $k)
        }
    }
}

The expression returns the regex (?^:DDMMYYYY|DDYYYYMM|MMDDYYYY|MMYYYYDD|YYYYDDMM|YYYYMMDD|DDMMYY|DDYYMM|DDYYYY|MMDDYY|MMYYDD|MMYYYY|YYDDMM|YYMMDD|YYYYDD|YYYYMM|DDMM|DDYY|MMDD|MMYY|YYDD|YYMM|YYYY|DD|MM|YY). (Semi-)Functional programming for the win!
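To apply that alternation to a path, it still has to be anchored between a "/" and either the next "/" or the end of the string. A minimal sketch, assuming the alternation is pasted in by hand rather than generated, and using a hypothetical helper name `extract_mask`:

```perl
use strict;
use warnings;

# Alternation copied from the answer above. Longer alternatives come first,
# so YYYYMMDD is tried before YYYY, and YYYY before YY.
my $mask_re = qr/
      DDMMYYYY | DDYYYYMM | MMDDYYYY | MMYYYYDD | YYYYDDMM | YYYYMMDD
    | DDMMYY   | DDYYMM   | DDYYYY   | MMDDYY   | MMYYDD   | MMYYYY
    | YYDDMM   | YYMMDD   | YYYYDD   | YYYYMM
    | DDMM | DDYY | MMDD | MMYY | YYDD | YYMM | YYYY
    | DD | MM | YY
/x;

# Hypothetical helper: return the first path component that consists
# entirely of a date mask, or undef if there is none.
sub extract_mask {
    my ($path) = @_;
    # The lookahead requires the mask to be followed by "/" or end of string,
    # so a component such as YYDDYY is rejected instead of partially matched.
    my ($found) = $path =~ m{/($mask_re)(?=/|\z)};
    return $found;
}

for my $path (qw( /mydir/data/YYYYMMDD /mydir/data/DD/data2 /mydir/data )) {
    my $mask = extract_mask($path);
    printf "%s: %s\n", $path, defined $mask ? $mask : '(no mask)';
}
```

The lookahead is what forces the whole component to be a mask; without it, the leading YYDD of a component like YYDDYY would be accepted.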

别念他 2024-12-27 13:28:55

I assume that there is only one "date" component, or if not, that you want the first one:

#!/usr/bin/perl
use warnings;
use strict;

my @paths = qw(
    /mydir/data/YYYYMMDD
    /mydir/data/YY/data2
    /mydir/data/YYMM/data2
    /mydir/data/DD/data2
);

foreach my $path (@paths) {
    my($date) = grep /^(([YMD])\2)+$/, split '/', $path;
    print "$path: $date\n";
}

魄砕の薆 2024-12-27 13:28:55

Assuming the mask fields are always in the order Y - M - D, this will do what you need:

my ($mask) = $path =~ m{ / ( (?:YY){0,2} (?:MM)? (?:DD)? ) (?:/|$) }x;
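One edge case worth knowing: every part of that pattern is optional, so it also matches an empty component. For a path with a trailing slash such as /mydir/data/, or one containing "//", the capture is the empty string rather than undef. A sketch of one way to guard against that, assuming an empty result is not wanted; the helper name `extract_ymd_mask` is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the answer above: same Y-M-D pattern,
# but the lookahead (?=[YMD]) rejects an empty path component.
sub extract_ymd_mask {
    my ($path) = @_;
    my ($mask) = $path =~ m{ / ( (?=[YMD]) (?:YY){0,2} (?:MM)? (?:DD)? ) (?:/|\z) }x;
    return $mask;
}

for my $path (qw( /mydir/data/YYYYMMDD /mydir/data/YYMM/data2 /mydir/data/ )) {
    my $mask = extract_ymd_mask($path);
    printf "%s: %s\n", $path, defined $mask ? $mask : '(no mask)';
}
```

Using \z instead of $ also stops the match from succeeding just before a trailing newline.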

咽泪装欢 2024-12-27 13:28:55

I'd use

my ($date) = m{/([0-9]{2,8})(?:/|$)};

and check whether

not(length($date) % 2)   # $date has even length

and maybe some checks for valid combinations.

Update: OK, to just get the mask, not the numbers, you can change this to

my ($date) = m{/([YMD]{2,8})(?:/|$)};
my $check = $date;
$check =~ s/YYYY/y/;
$check =~ s/MM//;
$check =~ s/DD//;
print "Matches $date\n" if grep $_ eq $check, (q{}, 'y', 'YY');

This should exclude all invalid combinations like YYDDYY or YYYYMMYY and so on.
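A few odd inputs still get through the substitution check, because s/MM// and s/DD// remove a pair from anywhere in the string: YMMY reduces to YY, and DMMD reduces to the empty string, so both are reported as matches. A stricter sketch that tokenises the mask instead, assuming the rule is "at most one year token (YYYY or YY), at most one MM and at most one DD, in any order"; the name `is_date_mask` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $mask is made only of the tokens
# YYYY, YY, MM and DD, with no kind of token (year, month, day) repeated.
sub is_date_mask {
    my ($mask) = @_;

    # The whole string must split into the allowed tokens.
    return 0 unless defined $mask && $mask =~ /\A(?:YYYY|YY|MM|DD)+\z/;

    # Walk the tokens left to right (YYYY is preferred over YY) and
    # reject the mask as soon as a year, month or day shows up twice.
    my %seen;
    for my $token ($mask =~ /(YYYY|YY|MM|DD)/g) {
        return 0 if $seen{ substr $token, 0, 1 }++;
    }
    return 1;
}

for my $mask (qw( YYYYMMDD MMYYDD DD YMMY DMMD YYDDYY YYYYMMYY )) {
    printf "%-8s %s\n", $mask, is_date_mask($mask) ? 'valid' : 'invalid';
}
```

This can replace the $check substitutions after the m{/([YMD]{2,8})(?:/|$)} capture; the capture itself stays the same.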
