如何将 WebResponse 中的 FORM 解析为 WebRequest 的 POST 正文

发布于 2024-11-28 04:03:56 字数 4543 浏览 1 评论 0原文

我对此很陌生,这是我的处女航,手头的任务是在 C# 中创建一个事务,该事务将通过 WebRequest/WebResponse 浏览 Web 应用程序的页面流。我让请求/响应机制、cookies 等都正常工作(我可以使用 POST URL 和 POST 正文的硬编码值成功执行事务),困难在于从 WebRequest 的值对生成 WebRequest 的动态 POST 正文和 POST URL 。 本质上,一旦流程从第一个 WebRequest(始终具有相同的静态 URL 和“硬编码”主体)开始,接下来的每个请求都是根据前一个响应的 FORM 值对构建的,例如:响应中的 FORM 的一部分(我已将 HTML 左括号和右括号替换为方括号,不知道如何将 HTML 直接粘贴到此处):

    <form id="expressform" method="post" action="">
<div>
    <input type="hidden" name="ScreenData.widgets.modified" value=""/><input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.widgets.modified"/><input type="hidden" name="ScreenData.marshalled" value="true"/><input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.marshalled"/><input type="hidden" name="isCreateAccountWizard" value="true"/><input type="hidden" name="ScreenData.header.hidden.name" value="isCreateAccountWizard"/>
    <input type="hidden" name="versionPoint" value="77777"/>

然后表单中的一些文本区域用于提交值,如下所示:

<tr>
    <td class="dataOut" style="padding-left:30px">
        <textarea name="ScreenData.sicInfo.natureOfBusiness" rows="5"  cols="60" class="dataOut" onmouseup="textAreaCounter(this,250);;" onkeypress="textAreaCounter(this,250);;" onkeyup="textAreaCounter(this,250);;" onchange="markDataDirty(this);;"></textarea> 
    </td>
</tr>

然后在“提交”上有 URL:

 <a class="detailBtnOn" href="javascript:submitForm('express?displayAction=CreateAccountWizard&amp;saveAction=SaveCreateSICCode&amp;flow=forward&amp;saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&amp;flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3')">Submit</a>

然后下一个 WebResponse应该在其 POST 主体中包含此内容:

ScreenData.widgets.modified=&ScreenData.header.hidden.name=ScreenData.widgets.modified&ScreenData.marshalled=true&ScreenData.header.hidden.name=ScreenData.marshalled&isCreateAccountWizard=true&ScreenData.header.hidden.name=isCreateAccountWizard&versionPoint=77777&ScreenData.commonHeaderInfo.accountName=SomeAccountName&ScreenData.commonHeaderInfo.effectiveDate=08%2F01%2F2011&ScreenData.sicInfo.natureOfBusiness=business&ScreenData.sicInfo.sic=7777&ScreenData.widgets.modified=ScreenData.sicInfo.natureOfBusiness&ScreenData.widgets.modified=ScreenData.sicInfo.sic

并将其作为 URL:

express?displayAction=CreateAccountWizard&saveAction=SaveCreateSICCode&flow=forward&saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3 

但我不仅不知道如何构建此解析引擎,甚至无法从 FORM 中获取值对。我正在尝试使用 AgilityPack,这里至少应该打印出 FORMs“重要”内容:

var page = new HtmlDocument();
page.OptionReadEncoding = false;
var stream = HttpWResponse.GetResponseStream(); 
page.Load(stream);
foreach (var f in page.DocumentNode.Descendants("form"))
{
    foreach (var d in page.DocumentNode.Descendants("div"))
    {
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info((f.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")) + ": ");
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("method", "<no method>") + ' ');
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("action", "<no action>"));

        foreach(var i in f.Descendants("input"))//{

        {
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info('\t' + (i.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(" (");
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(i.GetAttributeValue("type", "<no type>"));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info("): " + i.GetAttributeValue("value", "<no value>"));
        }
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info("");
    }
}

但它只打印出这个:(

INFO  EventsLogger - 
INFO  EventsLogger - expressform: 
INFO  EventsLogger - 
INFO  EventsLogger - post 

如果我去掉“div”位 - foreach(page.DocumentNode 中的 var d) .Descendants("div")), - 没有任何变化)


任何有关 FORM 打印输出解析器发生的情况以及如何构建解析引擎来构建来自响应的请求的解析引擎的帮助或建议都将非常有用赞赏。

I'm new to this, this is my virgin voyage, the task at hand is to create a transaction in C# that will navigate through a page flow of a web app via WebRequest/WebResponse. I got the Request/Response mechanism working, cookies and all (I can successfully execute a transaction with hardcoded values for POST URLs and POST bodies), the difficulty is with generating dynamic POST body and POST URL for the WebRequest from the value pairs of WebRequest.
Essentially, once the flow is started with first WebRequest, which has always the same static URL and "hardcoded" body, each following Request is built from the FORM value pairs of the previous Response, for example: part of the FORM that's in the Response (I've replaced HTML opening and closing brackets with square ones, not sure how to paste HTML straight into here):

    <form id="expressform" method="post" action="">
<div>
    <input type="hidden" name="ScreenData.widgets.modified" value=""/><input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.widgets.modified"/><input type="hidden" name="ScreenData.marshalled" value="true"/><input type="hidden" name="ScreenData.header.hidden.name" value="ScreenData.marshalled"/><input type="hidden" name="isCreateAccountWizard" value="true"/><input type="hidden" name="ScreenData.header.hidden.name" value="isCreateAccountWizard"/>
    <input type="hidden" name="versionPoint" value="77777"/>

and then some text areas in the form to submit values, like this:

<tr>
    <td class="dataOut" style="padding-left:30px">
        <textarea name="ScreenData.sicInfo.natureOfBusiness" rows="5"  cols="60" class="dataOut" onmouseup="textAreaCounter(this,250);;" onkeypress="textAreaCounter(this,250);;" onkeyup="textAreaCounter(this,250);;" onchange="markDataDirty(this);;"></textarea> 
    </td>
</tr>

and then on Submit there's the URL:

 <a class="detailBtnOn" href="javascript:submitForm('express?displayAction=CreateAccountWizard&saveAction=SaveCreateSICCode&flow=forward&saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3')">Submit</a>

And then the next WebResponse should have this in its POST body:

ScreenData.widgets.modified=&ScreenData.header.hidden.name=ScreenData.widgets.modified&ScreenData.marshalled=true&ScreenData.header.hidden.name=ScreenData.marshalled&isCreateAccountWizard=true&ScreenData.header.hidden.name=isCreateAccountWizard&versionPoint=77777&ScreenData.commonHeaderInfo.accountName=SomeAccountName&ScreenData.commonHeaderInfo.effectiveDate=08%2F01%2F2011&ScreenData.sicInfo.natureOfBusiness=business&ScreenData.sicInfo.sic=7777&ScreenData.widgets.modified=ScreenData.sicInfo.natureOfBusiness&ScreenData.widgets.modified=ScreenData.sicInfo.sic

and this as a URL:

express?displayAction=CreateAccountWizard&saveAction=SaveCreateSICCode&flow=forward&saveActionToken=84454A7D-50FE-5856-CE17-916B70EDFE1A&flowToken=CF3827F4-1DE7-54B1-D87B-D72F01C454C3 

But not only I can't figure out how to build this parsing engine, I can't even grab value pairs from the FORM. I'm trying to use AgilityPack, here's a bit that should at least print out FORMs "important" content:

var page = new HtmlDocument();
page.OptionReadEncoding = false;
var stream = HttpWResponse.GetResponseStream(); 
page.Load(stream);
foreach (var f in page.DocumentNode.Descendants("form"))
{
    foreach (var d in page.DocumentNode.Descendants("div"))
    {
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info((f.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")) + ": ");
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("method", "<no method>") + ' ');
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info(f.GetAttributeValue("action", "<no action>"));

        foreach(var i in f.Descendants("input"))//{

        {
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info('\t' + (i.GetAttributeValue("name", null) ?? f.GetAttributeValue("id", "<no name>")));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(" (");
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info(i.GetAttributeValue("type", "<no type>"));
            Loggers.EventsLogger.Info("");
            Loggers.EventsLogger.Info("): " + i.GetAttributeValue("value", "<no value>"));
        }
        Loggers.EventsLogger.Info("");
        Loggers.EventsLogger.Info("");
    }
}

but it only prints out this:

INFO  EventsLogger - 
INFO  EventsLogger - expressform: 
INFO  EventsLogger - 
INFO  EventsLogger - post 

(if i get rid of the "div" bit - foreach (var d in page.DocumentNode.Descendants("div")), - nothing changes)


Any help or suggestions on what's going on with the FORM print out parser and how to build a parsing engine for building Requests from Responses would be greatly appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

忘东忘西忘不掉你 2024-12-05 04:03:56

看看这个 使用 HtmlAgilityPack 解析 HTML 页面 和这个 http://refactoringaspnet.blogspot.com/2010/ 04/using-htmlagilitypack-to-get-and-post_19.htmlhttp://htmlagilitypack.codeplex.com/discussions/247206如何使用 HtmlAgility Pack 从特定表单获取输入? Lang:C#.net

编辑 - 更多信息:

您通过 foreach 遍历 HTML 文档中的表单,但您在下一个 foreach 中的 DIV 后面查找而不引用当前表单...在内部 foreach 循环中( s)你需要类似的东西

foreach (var d in f.SelectNodes(".//div"))

foreach (var i in d.SelectNodes(".//input"))

check this out Parsing HTML page with HtmlAgilityPack and this http://refactoringaspnet.blogspot.com/2010/04/using-htmlagilitypack-to-get-and-post_19.html and http://htmlagilitypack.codeplex.com/discussions/247206 and How would I get the inputs from a certain form with HtmlAgility Pack? Lang: C#.net

EDIT - some more info:

you loop via foreach over the forms in the HTML document but you go after the DIVs in the next foreach without referencing the current form... in the inner foreach loop(s) you need something similar to

foreach (var d in f.SelectNodes(".//div"))

and

foreach (var i in d.SelectNodes(".//input"))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文