Parsing dependent requests from an HTTP web response

Posted 2024-08-16 22:47:09

I want to simulate the behaviour of the WebTestRequest class (in Visual Studio's Test Tools framework), which can invoke dependent requests based on the resources referred to in the response to the original request.

For example, if I issue a web request and get the response like this:

using System.IO;
using System.Net;

string url = "http://www.mysite.com";
WebRequest request = WebRequest.Create(url);
string responseText;
using (WebResponse response = request.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    // Read the whole response body into memory
    responseText = reader.ReadToEnd();
}

I would like to be able to parse responseText and see whether it contains any requests for other resources (JS/CSS files, images, etc.).

Is there an easy way of doing this? I hesitate to do it manually, as some of the resource requests may be set up programmatically and may not be obvious on a straightforward text parse.


辞慾 2024-08-23 22:47:09

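Use an HTML/SGML parser library. I'm not familiar with Visual Studio, but there are frameworks for parsing HTML. Find one and look through its API for whatever relates to finding elements.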


赤濁 2024-08-23 22:47:09

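I'm fairly sure WebTestRequest itself only does a "straightforward text parse" to determine the dependent requests, since it has no understanding of JavaScript. So if you implement it that way, your code will simulate that behaviour accurately.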

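Here is a list of all the elements I could find on a cursory glance through the HTML 4 spec that can refer to other resources and would therefore need parsing:

<area href=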

Not sure if it's exhaustive.
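For illustration, the kind of straightforward text parse described above could look roughly like the sketch below. It is only a sketch under assumptions: the name FindDependentRequests, the sample markup, and the tag set (img, script, link, iframe, frame, embed) are hypothetical choices made for the example, not the list from the spec and not what WebTestRequest actually does. It uses a regular expression to pull src/href values out of those tags and resolves them against the page URL. It will not see anything that JavaScript builds at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample input standing in for the page URL and responseText.
var pageUri = new Uri("http://www.mysite.com/dir/index.html");
string responseText =
    "<link rel=\"stylesheet\" href=\"/css/site.css\">" +
    "<script src='js/app.js'></script>" +
    "<img alt=\"logo\" src=logo.png>";

foreach (Uri dependent in FindDependentRequests(pageUri, responseText))
{
    Console.WriteLine(dependent.AbsoluteUri);
}

// Scans raw HTML text for src/href attributes on a small, illustrative set
// of tags and returns the distinct http/https URLs they resolve to.
static List<Uri> FindDependentRequests(Uri baseUri, string html)
{
    // Accepts double-quoted, single-quoted, and unquoted attribute values.
    var resourceAttribute = new Regex(
        @"<(?:img|script|link|iframe|frame|embed)\b[^>]*?\s(?:src|href)\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^\s>]+))",
        RegexOptions.IgnoreCase);

    var results = new List<Uri>();
    var seen = new HashSet<string>();

    foreach (Match match in resourceAttribute.Matches(html))
    {
        string raw = match.Groups["url"].Value.Trim();
        if (raw.Length == 0)
        {
            continue; // empty attribute value, nothing to request
        }

        // Resolve relative references against the URL of the page itself.
        if (!Uri.TryCreate(baseUri, raw, out Uri resolved))
        {
            continue;
        }

        // Keep only URLs that could be fetched with an HTTP request.
        bool isHttp = resolved.Scheme == Uri.UriSchemeHttp
                   || resolved.Scheme == Uri.UriSchemeHttps;
        if (isHttp && seen.Add(resolved.AbsoluteUri))
        {
            results.Add(resolved);
        }
    }

    return results;
}
```

Each returned Uri could then be passed to WebRequest.Create in the same way as the original request. Note that the pattern also picks up link elements that are not stylesheets, so a real implementation would probably want to check the rel attribute as well.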

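By the way, I'm curious what you ended up doing.

Edit:

"some of the resource requests may be set up programmatically and may not be obvious on a straightforward text parse"

Indeed, at some point it becomes impossible to determine the dependent requests by parsing the HTML response. I'll give you an example: anything developed with Google Web Toolkit. In a GWT application I tested recently, there was basically no parseable HTML; everything ran from JavaScript. Extracting the obvious pathnames (where available) wasn't even useful, because conditional logic actually selected some dependencies and not others.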