How to strip ALL HTML tags using MSHTML Parser in VB6?

Posted on 2024-11-02 16:44:31

How to strip ALL HTML tags using MSHTML Parser in VB6?

Comments (3)

长不大的小祸害 2024-11-09 16:44:31

This is adapted from code over at CodeGuru. Many thanks to the original author:
http://www.codeguru.com/vb/vb_internet/html/article.php/c4815

If you need to download your HTML from the web, check the original source. For example:

Set objDocument = objMSHTML.createDocumentFromUrl("http://google.com", vbNullString)

I didn't need to download the HTML from the web because my fragment was already in memory, so the original source didn't quite apply to me. My main goal was simply to have a real DOM parser strip the HTML from user-generated content for me. Some would say, "Why not just use a regex to strip the HTML?" Good luck with that!

Add a project reference to the Microsoft HTML Object Library (Project > References in the VB6 IDE).

This is the same HTML parser that Internet Explorer (IE) uses. Let the heckling begin. Well, heckle away...

Here's the code I used:

Dim objDocument As MSHTML.HTMLDocument
Set objDocument = New MSHTML.HTMLDocument

'NOTE: txtSource is an instance of a simple TextBox object
objDocument.body.innerHTML = "<p>Hello World!</p> <p>Hello Jason!</p> <br/>Hello Bob!"
txtSource.Text = objDocument.body.innerText

The resulting text in txtSource.Text is my user's content stripped of all HTML. Clean and maintainable. No Cthulhu way for me.
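
To reuse this outside a form, the same two calls can be wrapped in a function. This is a minimal sketch, not the original author's code: the name StripHtml is made up for illustration, and it assumes the project already references the Microsoft HTML Object Library.

```vb
' Hypothetical helper (name is an assumption, not from the original post).
' Requires a project reference to "Microsoft HTML Object Library".
Public Function StripHtml(ByVal html As String) As String
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Let the MSHTML parser build a DOM from the fragment...
    doc.body.innerHTML = html

    ' ...then read back only the text content.
    StripHtml = doc.body.innerText
End Function
```

With that in place, the form code reduces to txtSource.Text = StripHtml(someHtmlString).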

徒留西风 2024-11-09 16:44:31

One way:

Function strip(html As String) As String
    With CreateObject("htmlfile")
        .Open
        .write html
        .Close
        strip = .body.outerText
    End With
End Function

For example, in the Immediate window:

?strip("<strong>hello <i>wor<u>ld</u>!</strong><foo> 1234")
hello world! 1234

梦里人 2024-11-09 16:44:31

Public Function ParseHtml(ByVal str As String) As String
    ' Copies every character that is not between "<" and ">".
    Dim Ret As String, TagOpened As Boolean
    Dim n As Long, sChar As String
    For n = 1 To Len(str)
        sChar = Mid$(str, n, 1)
        Select Case sChar
            Case "<"
                TagOpened = True    ' entering a tag: stop copying
            Case ">"
                TagOpened = False   ' leaving a tag: resume copying
            Case Else
                If Not TagOpened Then
                    Ret = Ret & sChar
                End If
        End Select
    Next n
    ParseHtml = Ret
End Function

This is a simple function I made for my own use.
Try it in the Debug (Immediate) window:

?ParseHtml( "< div >test< /div >" )

test

I hope this helps if you want to avoid external libraries.
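
A character scan like this treats everything between "<" and ">" as a tag and leaves entities such as &amp; untouched. The sketch below is one possible variation, not part of the original answer: StripTagsSimple is a hypothetical name, and the handful of entities it decodes is an arbitrary choice rather than a complete list.

```vb
' Hypothetical variant of the character scan (name and entity list are assumptions).
Public Function StripTagsSimple(ByVal html As String) As String
    Dim result As String, inTag As Boolean
    Dim i As Long, ch As String

    ' Pass 1: drop everything between "<" and ">".
    For i = 1 To Len(html)
        ch = Mid$(html, i, 1)
        If ch = "<" Then
            inTag = True
        ElseIf ch = ">" Then
            inTag = False
        ElseIf Not inTag Then
            result = result & ch
        End If
    Next i

    ' Pass 2: decode a few common entities.
    ' "&amp;" goes last so "&amp;lt;" becomes "&lt;" and is not decoded twice.
    result = Replace(result, "&lt;", "<")
    result = Replace(result, "&gt;", ">")
    result = Replace(result, "&quot;", """")
    result = Replace(result, "&nbsp;", " ")
    result = Replace(result, "&amp;", "&")

    StripTagsSimple = result
End Function
```

In the Immediate window, ?StripTagsSimple("<b>Tom &amp; Jerry</b>") prints Tom & Jerry. Note that ?StripTagsSimple("1 < 2 and 3 > 2") prints 1  2, so text containing literal angle brackets still loses content with this approach.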
