RFC 2822 date regex
What's the best regex to match an RFC 2822 date?
Basically, I would like to match
Date: Sun, 19 Feb 2012 16:25:02 +0000
which appears in some emails I receive, but ideally in a language-independent way.
I found the regex below online, but I'm not sure how to make the month language-independent while still matching the rest. I believe the spec requires the month to be three characters, but I'm not totally sure.
/^(?:(Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+)?(0[1-9]|[1-2]?[0-9]|3[01])\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(19[0-9]{2}|[2-9][0-9]{3})\s+(2[0-3]|[0-1][0-9]):([0-5][0-9])(?::(60|[0-5][0-9]))?\s+([-\+][0-9]{2}[0-5][0-9]|(?:UT|GMT|(?:E|C|M|P)(?:ST|DT)|[A-IK-Z]))(\s+|\(([^\(\)]+|\\\(|\\\))*\))*$/
Comments (1)
As @tripleee pointed out, an RFC 2822 date will always be in English. But if you are parsing dates from a source that does not strictly follow RFC 2822 and might use a different language, you will have to identify the set of languages that might be used and build a single regex that matches any month or day-of-week name from any of those languages. Afterwards, you can use a hash to convert the captured month and day-of-week names to the internal representation you want to use.
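A minimal Python sketch of that approach might look like the following. The name tables (English plus a few German abbreviations), the simplified pattern (numeric zone only, no comments or obsolete zone names), and the parse_date helper are all assumptions made for illustration; they are not taken from the regex in the question.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical name tables: English plus German abbreviations, for illustration only.
# Keys are lowercase; extend them with whatever languages your source may use.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
    "mär": 3, "mai": 5, "okt": 10, "dez": 12,  # German variants
}
# Values follow datetime.weekday() (Monday = 0). This sketch only matches the
# weekday name; it does not cross-check it against the date.
WEEKDAYS = {
    "mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6,
    "mo": 0, "di": 1, "mi": 2, "do": 3, "fr": 4, "sa": 5, "so": 6,  # German
}


def _alternation(names):
    # Longest names first, so a short name never shadows a longer one.
    return "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))


# Simplified pattern: optional weekday, day, month name, year, time, numeric zone.
DATE_RE = re.compile(
    rf"^(?:(?P<wday>{_alternation(WEEKDAYS)}),\s+)?"
    rf"(?P<day>\d{{1,2}})\s+"
    rf"(?P<month>{_alternation(MONTHS)})\s+"
    rf"(?P<year>\d{{4}})\s+"
    rf"(?P<hour>\d{{2}}):(?P<minute>\d{{2}})(?::(?P<second>\d{{2}}))?\s+"
    rf"(?P<zone>[-+]\d{{4}})$",
    re.IGNORECASE,
)


def parse_date(text):
    """Return an aware datetime, or None if the text does not match."""
    m = DATE_RE.match(text.strip())
    if m is None:
        return None
    zone = m["zone"]
    sign = -1 if zone[0] == "-" else 1
    try:
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
        return datetime(
            int(m["year"]),
            MONTHS[m["month"].lower()],  # the hash lookup described above
            int(m["day"]),
            int(m["hour"]),
            int(m["minute"]),
            int(m["second"] or 0),
            tzinfo=timezone(offset),
        )
    except ValueError:
        # Out-of-range fields (e.g. day 32, hour 25) pass the regex but fail here.
        return None
```

Calling parse_date("Sun, 19 Feb 2012 16:25:02 +0000") and parse_date("So, 19 Mär 2012 16:25:02 +0100") should both return a datetime, while a month name missing from the table yields None. Range checks are left to datetime rather than encoded in the regex, which keeps the pattern short at the cost of accepting a leap second of 60 only as a failure.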