Accessing the full page source in WatiN

Posted on 2024-08-04 23:27:23

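As part of my WatiN tests, I am looking for a way to pass my web pages through a DTD validator, but I have not found a clean way to access the raw HTML. Is there a built-in way to do this?

I think I could access the IE.InternetExplorer property, QueryInterface it for the IPersistStreamInit interface, and serialize the document to an IStream, but that seems like a lot of work for what I would guess is a fairly common task.

Am I missing something obvious in WatiN? Or can anyone think of a better solution than the one I outlined above? That solution is, after all, very IE-specific.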

半世蒼涼 2024-08-11 23:27:23

Here is how you access the source code:

    browser.ActiveElement.Parent.OuterHtml

哽咽笑 2024-08-11 23:27:23

    string html = browser.Body.Parent.OuterHtml;

你又不是我 2024-08-11 23:27:23

It seems there is no better way. I filed a feature request and submitted a patch on WatiN's SourceForge tracker instead.

森末i 2024-08-11 23:27:23

I thought I would drop a few lines to help anyone out there struggling to get the pristine HTML source of a web page via WatiN without patching WatiN itself, which is just a matter of taste.

So, building on Johan Levin's patch, I bolted together the following. Be safe, and I hope you find it useful.

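    // Namespaces assume WatiN 2.x and a reference to the mshtml interop assembly.
    using System;
    using System.IO;
    using System.Runtime.InteropServices;
    using System.Runtime.InteropServices.ComTypes;
    using System.Text;
    using mshtml;
    using WatiN.Core;
    using WatiN.Core.Native.InternetExplorer;

    // Place this method inside your test or helper class.
    // TextVariant is the poster's return type and is not defined here;
    // return the decoded string directly if you do not have it.
    private static TextVariant GetWebPageSource(IE browser)
    {
        IHTMLDocument2 htmlDocument = ((IEDocument)(browser.DomContainer.NativeDocument)).HtmlDocument;
        Encoding encoding = Encoding.GetEncoding(htmlDocument.charset);
        // Ask the document for IPersistStreamInit and let it save itself into our stream.
        IPersistStreamInit persistStream = (IPersistStreamInit)htmlDocument;
        MinimalIStream stream = new MinimalIStream();
        persistStream.Save(stream, false);
        return new TextVariant(encoding.GetString(stream.ToArray()));
    }

    // Method order must match the COM vtable: IPersist first, then IPersistStreamInit.
    [ComImport]
    [Guid("7FD52380-4E07-101B-AE2D-08002B2EC713")]
    [InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IPersistStreamInit
    {
        void GetClassID(out Guid pClassID);
        // IsDirty reports its answer as S_OK or S_FALSE, so keep the raw HRESULT.
        [PreserveSig]
        int IsDirty();
        void Load(IStream pStm);
        void Save(IStream pStm, [MarshalAs(UnmanagedType.Bool)] bool fClearDirty);
        void GetSizeMax(out long pcbSize);
        void InitNew();
    }

    // http://stackoverflow.com/questions/6601355/passing-an-fstream-or-equivalent-from-c-to-c-through-cli
    // A MemoryStream that exposes just enough of IStream for Save to write into it.
    [ClassInterface(ClassInterfaceType.AutoDispatch)]
    public class MinimalIStream : MemoryStream, IStream
    {
        public MinimalIStream() { }

        public MinimalIStream(byte[] data) : base(data) { }

        #region IStream Members
        public void Write(byte[] pv, int cb, IntPtr pcbWritten)
        {
            base.Write(pv, 0, cb);
            // pcbWritten points to a 32-bit ULONG, so write exactly 4 bytes.
            if (pcbWritten != IntPtr.Zero)
                Marshal.WriteInt32(pcbWritten, cb);
        }

        public void Stat(out STATSTG pstatstg, int grfStatFlag)
        {
            pstatstg = new STATSTG();
            pstatstg.cbSize = base.Length;
        }

        public void Read(byte[] pv, int cb, IntPtr pcbRead)
        {
            int bytesRead = base.Read(pv, 0, cb);
            // pcbRead points to a 32-bit ULONG as well.
            if (pcbRead != IntPtr.Zero)
                Marshal.WriteInt32(pcbRead, bytesRead);
        }

        public void Seek(long dlibMove, int dwOrigin, IntPtr plibNewPosition)
        {
            long pos = base.Seek(dlibMove, (SeekOrigin)dwOrigin);
            // plibNewPosition points to a 64-bit ULARGE_INTEGER.
            if (plibNewPosition != IntPtr.Zero)
                Marshal.WriteInt64(plibNewPosition, pos);
        }

        // The remaining members are intentionally left as no-ops.
        public void Clone(out IStream ppstm)
        {
            ppstm = null;
        }

        public void Commit(int grfCommitFlags)
        {
        }

        public void CopyTo(IStream pstm, long cb, IntPtr pcbRead, IntPtr pcbWritten)
        {
        }

        public void LockRegion(long libOffset, long cb, int dwLockType)
        {
        }

        public void SetSize(long libNewSize)
        {
        }

        public void Revert()
        {
        }

        public void UnlockRegion(long libOffset, long cb, int dwLockType)
        {
        }
        #endregion
    }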

心意如水 2024-08-11 23:27:23

I found that

    browser.ActiveElement.Parent.OuterHtml

will not always get everything, as it depends on your 'ActiveElement'. Therefore

    browser.Body.Parent.OuterHtml

seems to work better (browser being your instance of IE).

That said, I believe Johan Levin is correct in saying that the DOM is serialised back to text format. So would it not be easier to just fetch the document by its URL (without using WatiN) and validate that?
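
To make the fetch-by-URL idea concrete, here is a minimal sketch, not a definitive implementation. The class and method names (`RawPageCheck`, `Fetch`, `ValidateAgainstDtd`) are hypothetical and appear nowhere in WatiN. It assumes the page is well-formed XHTML, since `XmlReader` cannot parse tag-soup HTML, and that the DTD named in the DOCTYPE can be resolved from the test machine.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical helper; none of these names come from WatiN.
public static class RawPageCheck
{
    // Downloads the document as the server sent it, bypassing the browser DOM.
    public static string Fetch(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    // Parses the markup as XML and validates it against the DTD in its DOCTYPE.
    // Returns the collected messages; an empty list means no problems were reported.
    public static List<string> ValidateAgainstDtd(string markup)
    {
        var problems = new List<string>();
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,   // allow the DOCTYPE to be read
            ValidationType = ValidationType.DTD    // validate against that DTD
        };
        settings.ValidationEventHandler += (sender, e) => problems.Add(e.Message);

        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup), settings))
            {
                // Reading to the end drives validation of every node.
                while (reader.Read()) { }
            }
        }
        catch (XmlException ex)
        {
            // The markup is not well-formed XML, so it cannot be DTD-validated this way.
            problems.Add(ex.Message);
        }
        return problems;
    }
}
```

A test could then call `RawPageCheck.ValidateAgainstDtd(RawPageCheck.Fetch(url))` and assert that the list is empty. Note that a separate request does not share cookies or session state with the IE instance WatiN drives, so pages behind a login or reached through a form post may come back different from what the browser shows.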
