Regex for extracting POP3 headers
I'm trying to work out how to extract POP3 headers using this regex:
^(?[a-zA-Z-]+)(?(?=:).+)$
Delivered-To: [email protected]
The value group returns the ':' character as well, which I want to avoid. I've been racking my brain trying to work this out, but can't.
Need collective wisdom :-)
Comments (3)
Just so you are aware, this will not handle wrapped (folded) headers. In fact, that regex will take a wrapped header line and prepend it to a real header, especially if the wrapped lines that follow don't contain a ":".
Building upon Sergej Andrejev's Regex, this one will handle not capturing the wrapped lines:
However, the best thing to do is to actually read the headers line by line and parse accordingly. It's a pain (I've had to do it for production code), but it's the most accurate.
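A minimal sketch of the line-by-line approach, in Python rather than whatever language the original poster was using. The function name `parse_headers`, the group names `name` and `value`, and the sample addresses are illustrative assumptions, and the header-name character class is simply carried over from the question. Continuation lines are detected by leading whitespace and folded into the previous header.

```python
import re

# The ':' is matched between the two groups, so it is captured by neither.
# Group names are illustrative assumptions, not taken from the thread.
HEADER_RE = re.compile(r"^(?P<name>[A-Za-z-]+):[ \t]*(?P<value>.*)$")


def parse_headers(raw):
    """Return a list of (name, value) pairs from a raw header block."""
    headers = []
    for line in raw.splitlines():
        if line == "":
            break  # a blank line ends the header block
        if line[0] in " \t" and headers:
            # Wrapped line: append it to the previous header's value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
            continue
        match = HEADER_RE.match(line)
        if match:
            headers.append((match.group("name"), match.group("value")))
    return headers


sample = (
    "Delivered-To: user@example.com\r\n"
    "Received: by 10.0.0.1\r\n"
    "\twith SMTP id abc\r\n"
    "Subject: Hi: there\r\n"
    "\r\n"
    "body: text\r\n"
)
for name, value in parse_headers(sample):
    print(name, "=>", value)
```

Because only the first ':' on a line is treated as the separator, a value such as `Hi: there` keeps its own colon, and a wrapped line is never mistaken for a new header.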
I would go with something like
Then you would have
Sorry, copied the wrong code:
^(\S+):\s((\s\S)*)
It works with multiple lines.
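To illustrate the ':' problem from the question: a lookahead such as `(?=:)` only asserts that a colon comes next without consuming it, so the `.+` that follows starts at the colon. Matching the colon between the two groups keeps it out of both. The sketch below uses Python syntax (`(?P<name>...)` for named groups); the group names and the sample address are assumptions for illustration, not taken from the thread.

```python
import re

line = "Delivered-To: user@example.com"  # placeholder address

# Lookahead version: (?=:) checks for ':' but leaves it for .+ to capture.
lookahead = re.match(r"^(?P<name>[a-zA-Z-]+)(?P<value>(?=:).+)$", line)

# Consuming version: ':' and any following whitespace sit between the groups.
consuming = re.match(r"^(?P<name>[a-zA-Z-]+):\s*(?P<value>.+)$", line)

print(repr(lookahead.group("value")))  # ': user@example.com'
print(repr(consuming.group("value")))  # 'user@example.com'
```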