Getting the SRC of an image with a regular expression in C#
I'm looking for a regular expression to isolate the src value of an img. (I know this is not the best way to do it, but it is what I have to do in this case.)
I have a string which contains simple HTML code, some text and an image. I need to get the value of the src attribute from that string. So far I have only managed to isolate the whole tag.
string matchString = Regex.Match(original_text, @"(<img([^>]+)>)").Value;
Comments (8)
I know you say you have to use regex, but if possible I would really give this open source project a chance:
HtmlAgilityPack
It is really easy to use. I just discovered it and it helped me out a lot, since I was doing some heavier HTML parsing. It basically lets you use XPath expressions to get your elements.
Their example page is a little outdated, but the API is really easy to understand, and if you are a little bit familiar with XPath you will get your head around it in no time.
The code for your query would look something like this: (uncompiled code)
I tried what Francisco Noriega suggested, but it looks like the HtmlAgilityPack API has changed. Here is how I solved it:
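A minimal sketch of the HtmlAgilityPack approach, assuming the HtmlAgilityPack NuGet package is referenced. It is not the code from either answer; the class name HapImgSrc and the method name FindAll are made up for illustration.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is installed

public static class HapImgSrc
{
    // Returns the src value of every <img> element that has a src attribute.
    public static List<string> FindAll(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//img[@src]");
        if (nodes == null)
        {
            return result;
        }

        foreach (HtmlNode img in nodes)
        {
            // The second argument is the fallback used when the attribute is missing.
            result.Add(img.GetAttributeValue("src", string.Empty));
        }
        return result;
    }
}
```

Calling HapImgSrc.FindAll(original_text) would give one entry per image, so the first src is the first element of the list when the list is not empty.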
This should capture all img tags and just the src part, no matter where it is located (before or after class, etc.), and it supports both HTML and XHTML :D
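One way such a pattern could look, as a sketch rather than the pattern from this answer. It assumes the src value is quoted and that no attribute value contains a ">" character; the names ImgSrcFinder and FindAll are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImgSrcFinder
{
    // Assumed pattern: src may sit anywhere inside the tag, the value is wrapped
    // in matching single or double quotes, and the tag may end in ">" or "/>".
    // "\ssrc" requires whitespace before src, so attributes like data-src are skipped.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\ssrc\s*=\s*(['""])(?<src>.*?)\1[^>]*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the src value of every img tag found in the input.
    public static List<string> FindAll(string html)
    {
        var result = new List<string>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            result.Add(m.Groups["src"].Value);
        }
        return result;
    }
}
```

The backreference \1 makes the closing quote match the opening one, so a value such as src="it's.png" is not cut off at the apostrophe.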
The regex you want should be along the lines of:
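A sketch of what a regex along those lines might be, assuming a quoted src value and reusing the original_text idea from the question as the parameter originalText. The class name FirstImgSrc is made up, and the pattern is an illustration, not necessarily the one this answer posted.

```csharp
using System.Text.RegularExpressions;

public static class FirstImgSrc
{
    // Returns the src of the first img tag, or an empty string if none is found.
    public static string Get(string originalText)
    {
        Match m = Regex.Match(
            originalText,
            "<img.+?src=[\"'](.+?)[\"'].*?>",
            RegexOptions.IgnoreCase);

        // Group 1 is the part between the quotes after src=.
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

The difference from the question's attempt is the capture group around the value: Groups[1] holds only the src, while Value would still hold the whole tag.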
Hope this helps.
You can also use a lookbehind to do it without needing to pull out a group:
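A possible lookbehind version, offered as a sketch and not as the pattern from this answer. It relies on the fact that the .NET regex engine allows variable-length lookbehind, and it assumes a quoted src value; the class name LookbehindImgSrc is hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class LookbehindImgSrc
{
    // The lookbehind asserts that the text before the match is an img tag up to
    // and including the opening quote of src, so the match itself is the value.
    // In a verbatim string (@"..."), a double quote is written as "".
    private const string Pattern =
        @"(?<=<img\b[^>]*?\ssrc\s*=\s*['""])[^'"">]+";

    // Returns the first src value, or an empty string when nothing matches.
    public static string Get(string html)
    {
        return Regex.Match(html, Pattern, RegexOptions.IgnoreCase).Value;
    }
}
```

Because the tag prefix is only asserted and not consumed, Match.Value is the src itself and no Groups lookup is needed.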
Remember to escape the quotes if needed.
This is what I use to get the tags out of strings:
Here is the one I use:
The good part is that it matches any of the below:
It can also match some unexpected scenarios like extra attributes, e.g.:
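As an illustration of a pattern with that kind of tolerance, here is a sketch that accepts double-quoted, single-quoted and unquoted src values, with or without extra attributes. The pattern, the class name TolerantImgSrc and the sample inputs in the usage note are assumptions, not the ones from this answer.

```csharp
using System.Text.RegularExpressions;

public static class TolerantImgSrc
{
    // Three alternatives for the value: "double quoted", 'single quoted', or unquoted.
    // .NET allows the same group name in several alternatives; only one of them captures.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\ssrc\s*=\s*(?:""(?<src>[^""]*)""|'(?<src>[^']*)'|(?<src>[^\s""'>]+))",
        RegexOptions.IgnoreCase);

    // Returns the first src value, or an empty string when nothing matches.
    public static string Get(string html)
    {
        Match m = ImgSrc.Match(html);
        return m.Success ? m.Groups["src"].Value : string.Empty;
    }
}
```

With this sketch, inputs such as <img src="a.png">, <img src='b.png'>, <img src=c.png> and <img width="10" src="d.png" alt="x"/> all yield just the file name part of the attribute.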