C# RegEx on a StreamReader will not return matches
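I'm writing myself a simple screen-scraping application to play around with the HTMLAgilityPack library. After getting it to work on several different types of HtmlNode, I figured I'd get fancy and throw in a regex for email addresses as well. The only problem is that the application never finds any matches, or perhaps it finds them but does not return them properly. This happens even on sites known to contain email addresses. Can anyone spot what I'm doing wrong here?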
string url = String.Format("http://{0}", mainForm.Target);
string reg = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";
try
{
    WebClient wClient = new WebClient();
    Stream data = wClient.OpenRead(url);
    StreamReader read = new StreamReader(data);
    MatchCollection matches = Regex.Matches(read.ReadToEnd(), reg, RegexOptions.IgnoreCase|RegexOptions.Multiline);
    foreach (Match match in matches)
    {
        textBox1.AppendText(match.ToString() + Environment.NewLine);
    }
Comments (2)
Use a raw (verbatim) string: prefix the pattern literal with @. Without that, \b in a regular C# string literal becomes a backspace character instead of reaching the regex engine as a word boundary. Also, your last period should be \. so that it only matches a literal period.
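A minimal sketch of the difference, assuming a console program and a made-up sample string (the class name, the variable names, and the address someone@example.com are illustrative and not from the original post). It compares the pattern from the question with the same pattern written as a verbatim string with the period escaped:

```csharp
using System;
using System.Text.RegularExpressions;

class VerbatimPatternDemo
{
    static void Main()
    {
        // Regular string literal: the C# compiler turns \b into a backspace (U+0008).
        string broken = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";

        // Verbatim string literal: the backslashes reach the regex engine unchanged,
        // and \. matches only a literal period.
        string fixedPattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

        // Hypothetical sample input, not taken from any real page.
        string html = "<p>Contact: someone@example.com</p>";

        // The first character of the broken pattern is a backspace, not a backslash.
        Console.WriteLine(broken[0] == '\u0008');

        // The broken pattern requires literal backspace characters around the address,
        // so it finds nothing in ordinary text.
        Console.WriteLine(Regex.Matches(html, broken, RegexOptions.IgnoreCase).Count);

        // The verbatim pattern finds the address.
        foreach (Match m in Regex.Matches(html, fixedPattern, RegexOptions.IgnoreCase))
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

RegexOptions.Multiline is left out here because it only changes how ^ and $ behave, and this pattern uses neither.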
Check the string that is returned by read.ReadToEnd() and see if you can find email addresses in this string with your regex. I guess that your problem doesn't have anything to do with StreamReader.
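One way to run that check without touching the network is to feed known content through a StreamReader and apply the regex to the result. This is only a sketch: the MemoryStream stands in for the stream returned by WebClient.OpenRead, the page content and the address info@example.org are invented, and the pattern is written as a verbatim string so that \b is a word boundary.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

class StreamCheckDemo
{
    static void Main()
    {
        // Hypothetical page content standing in for the downloaded HTML.
        byte[] bytes = Encoding.UTF8.GetBytes("<a href=\"mailto:info@example.org\">mail</a>");
        string pattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

        using (var reader = new StreamReader(new MemoryStream(bytes)))
        {
            string text = reader.ReadToEnd();

            // Confirm that something was read, and inspect it for addresses by eye.
            Console.WriteLine("Characters read: " + text.Length);
            Console.WriteLine(text);

            // If the text contains an address but this count is zero,
            // the pattern is at fault rather than the reader.
            MatchCollection matches = Regex.Matches(text, pattern, RegexOptions.IgnoreCase);
            Console.WriteLine("Matches found: " + matches.Count);
        }
    }
}
```

In the original program, the same two checks (printing the text and the match count) can be applied to the string returned by read.ReadToEnd() before the foreach loop.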