I want to remove all div elements, including the ones with attributes like class, from my string.
I already checked "RegEx match open tags except XHTML self-contained tags", so regex is apparently not the answer.
I currently have a regex that strips all tags from a string and preserves the content (note that I'm never parsing a full HTML document, if that matters): Regex.Replace(s, "<[^>]*(>|$)", String.Empty). However, I only want the div tags removed, with their content preserved.
So I have:
<div class="fade-content"><div><span>some content</span></div></div>
<div>some content</div>
Desired output:
<span>some content</span>
some content
I was still going down the regex path and tried something like <div>.*<\/div>, but that does not match divs that have attributes.
How can I remove only the div elements, using VB.NET?
There are several ways to do this. A short and simple one is the following:
Here is an example:
Output:
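The code that originally accompanied this answer is not shown here, so the following is only a minimal sketch of what a short regex approach could look like, not necessarily what the answer used. The module name DivStripper and the function StripDivTags are hypothetical. The pattern matches opening and closing div tags, with or without attributes, and leaves every other tag and all text content alone.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DivStripper
    ' Hypothetical helper (an assumption, not from the original answer):
    ' removes opening and closing div tags and keeps everything else.
    Function StripDivTags(input As String) As String
        ' </?    optional slash, so both <div ...> and </div> match
        ' div\b  the tag name must end here, so <divider> is left alone
        ' [^>]*  any attributes, up to the closing angle bracket
        Return Regex.Replace(input, "</?div\b[^>]*>", String.Empty, RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        ' The two inputs from the question.
        Console.WriteLine(StripDivTags("<div class=""fade-content""><div><span>some content</span></div></div>"))
        Console.WriteLine(StripDivTags("<div>some content</div>"))
        ' Prints:
        ' <span>some content</span>
        ' some content
    End Sub
End Module
```

Because only the tags themselves are removed and nothing has to be paired up, nesting is not a problem for this pattern. It does assume that no attribute value contains a literal > character; for input like that, an HTML parser is the safer choice.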
This can be achieved without regular expressions by using a WebBrowser control. Try the following:
ExtractDesiredData:
Usage:
Resources: