Regular expression to match multiple date formats
What regex would match a date in any of the following formats?
26FEB2009
31DEC2009
27 Mar 2008
30 Jul 2009
26-Feb-2009
27-Aug-2009
29/05/2008
07.11.2008
Jan 11 2008
May 26 2008
What should the regular expression for that be?
I have a regex that matches 26-Feb-2009 and 26 FEB 2009, but not 26FEB2009. If anyone knows how to fix it, please suggest an update.
(?:^|[^\d\w:])(?'day'\d{1,2})(?:-?st\s+|-?th\s+|-?rd\s+|-?nd\s+|-|\s+)(?'month'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[uarychilestmbro]*(?:\s*,?\s*|-)(?:'?(?'year'\d{2})|(?'year'\d{4}))(?=$|[^\d\w])
The date 26FEB2009 is a substring of a longer string such as FUTIDX 26FEB2009 NIFTY 0, which is parsed from an HTML page, so I cannot rely on whitespace or a delimiter inside the date.
Comments (2)
I would advise against using a regex to parse dates, and even more strongly against using a regex to parse HTML. For parsing dates, take a look at the TryParseExact method; for parsing HTML, use a DOM parser such as Html Agility Pack.
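As a rough illustration of the "try a list of exact formats" idea, here is a minimal Python sketch. It is an analogue of TryParseExact, not the .NET API itself: the function name try_parse_exact and the FORMATS list are hypothetical, the formats are inferred from the examples in the question, the numeric dates are assumed to be day-first, and month-name parsing via %b assumes an English locale.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical format list, inferred from the examples in the question.
FORMATS = [
    "%d%b%Y",    # 26FEB2009
    "%d %b %Y",  # 27 Mar 2008
    "%d-%b-%Y",  # 26-Feb-2009
    "%d/%m/%Y",  # 29/05/2008 (assumed day-first)
    "%d.%m.%Y",  # 07.11.2008 (assumed day-first)
    "%b %d %Y",  # Jan 11 2008
]


def try_parse_exact(text: str) -> Optional[date]:
    """Return the parsed date if text matches one of FORMATS, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            # This format did not fit; try the next one.
            continue
    return None
```

The function expects the candidate date on its own, so the token (for example 26FEB2009 out of FUTIDX 26FEB2009 NIFTY 0) still has to be isolated first, for instance by splitting the text on whitespace.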
If it matches 26 FEB 2009 but not 26FEB2009, it sounds like you need to make the whitespace and delimiter characters ("-" and "/") between the date segments optional.
The + metacharacter means one or more; consider using * (zero or more) for the whitespace.
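The difference between a required and an optional separator can be seen in a reduced example. This is a sketch in Python's re syntax, not the .NET regex from the question: the patterns are deliberately simplified (day, three-letter month, four-digit year only), and the names MONTH, required and optional are illustrative.

```python
import re

MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"

# Separator required: a hyphen or at least one whitespace character (\s+).
required = re.compile(
    r"\d{1,2}(?:-|\s+)" + MONTH + r"(?:-|\s+)\d{4}", re.IGNORECASE
)

# Separator optional: zero or more hyphens or whitespace characters.
optional = re.compile(
    r"\d{1,2}[-\s]*" + MONTH + r"[-\s]*\d{4}", re.IGNORECASE
)

print(required.search("26 FEB 2009") is not None)  # True
print(required.search("26FEB2009") is not None)    # False
print(optional.search("FUTIDX 26FEB2009 NIFTY 0").group())  # 26FEB2009
```

The separator group in the question's regex has the same shape as the required pattern here: every alternative consumes at least one character, which is why 26FEB2009 is rejected.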
EDIT
What I meant was: if your regular expression matches dates that contain the whitespace/delimiter character but not dates without one, i.e. 26FEB2009, then it sounds like your pattern makes the whitespace/delimiter compulsory for a match.
Here's something I quickly knocked together:
You might want to check that it's not missing certain features that you want, but it matches all of your examples.
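For reference, one pattern that covers the ten examples from the question could look like the following. This is not the pattern the answer refers to, only a sketch under assumptions: it uses Python's re syntax rather than .NET, accepts only three-letter month abbreviations and four-digit years, does not validate day or month ranges, and the names DATE_RE and find_dates are hypothetical.

```python
import re
from typing import List

MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"

DATE_RE = re.compile(
    r"(?<![0-9])(?:"
    # day, month name, year with optional separators:
    # 26FEB2009, 27 Mar 2008, 26-Feb-2009
    r"\d{1,2}[-\s]*" + MONTH + r"[-\s]*\d{4}"
    # numeric day, month, year: 29/05/2008, 07.11.2008
    r"|\d{1,2}[/.]\d{1,2}[/.]\d{4}"
    # month name, day, year: Jan 11 2008
    r"|" + MONTH + r"\s+\d{1,2},?\s+\d{4}"
    r")(?![0-9])",
    re.IGNORECASE,
)


def find_dates(text: str) -> List[str]:
    """Return every substring of text that looks like one of the formats."""
    return [m.group(0) for m in DATE_RE.finditer(text)]


print(find_dates("FUTIDX 26FEB2009 NIFTY 0"))  # ['26FEB2009']
```

The lookarounds only prevent a match from starting or ending in the middle of a longer run of digits; they do not require whitespace around the date, so a date embedded in a longer string is still found.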