MSHTML COM: problem clicking the submit button
I'm having a problem screen-scraping some data from this website using the MSHTML COM component. I have a WebBrowser control on my WPF form. The code that retrieves the HTML elements is in the WebBrowser LoadCompleted event handler. After I set the value of the HTMLInputElement and call the click method on the HTMLInputButtonElement, it refuses to submit the request and display the next page.
I analysed the HTML for the button's onclick attribute: it calls a JavaScript function that processes my request. That makes me wonder whether calling the JavaScript function is causing the problem. Oddly, when I take my code out of the LoadCompleted method and put it inside a button click event, it does take me to the next page, which the LoadCompleted method did not. But requiring a manual click defeats the point of scraping the page automatically.
Another thought: with the code inside the LoadCompleted method, perhaps the HTMLInputButtonElement is not fully rendered on the page, so the click event does not fire. Yet when I inspect the object at run time, it does hold the submit button element and its state says it is complete, which baffles me even more.
Here is the code I used inside the LoadCompleted method and the button's click method:
private void browser_LoadCompleted(object sender, NavigationEventArgs e)
{
    HTMLDocument dom = (HTMLDocument)browser.Document;

    // Find the parcel-number text box by its name attribute.
    IHTMLElementCollection elementCollection = dom.getElementsByName("PCL_NO_FROM.PARCEL_RANGE.XTRACKING.1-1-1.");
    HTMLInputElement inputBox = null;
    if (elementCollection.length > 0)
    {
        foreach (HTMLInputElement element in elementCollection)
        {
            if (element.name.Equals("PCL_NO_FROM.PARCEL_RANGE.XTRACKING.1-1-1."))
            {
                inputBox = element;
            }
        }
    }
    // Guard against a missing element to avoid a NullReferenceException.
    if (inputBox == null) return;
    inputBox.value = "Test";

    // Find the submit button by its name attribute.
    elementCollection = dom.getElementsByName("SUBMIT.DUM_CONTROLS.XTRACKING.1-1.");
    HTMLInputButtonElement submitButton = null;
    if (elementCollection.length > 0)
    {
        foreach (HTMLInputButtonElement element in elementCollection)
        {
            if (element.name.Equals("SUBMIT.DUM_CONTROLS.XTRACKING.1-1."))
            {
                submitButton = element;
            }
        }
    }
    if (submitButton == null) return;

    // This is the call that fails to submit when run from LoadCompleted.
    submitButton.click();
}
FYI: This is the URL of the web page I'm trying to access using MSHTML,
http://track.dhl.co.uk/tracking/wrd/run/wt_xtrack_pw.entrypoint.
Comments (2)
There are many possibilities:
You could try putting your code in another event, such as navigation completed or download completed.
You may need to explicitly evaluate the OnClick event after calling the click() function.
Using the MS WebBrowser control is easier than using MSHTML COM.
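As a rough illustration of the first two suggestions, here is a minimal sketch rather than a confirmed fix. It assumes that queuing the work on the WPF dispatcher, so that it runs after the LoadCompleted handler has returned, behaves like the manual button click that the question reports as working. The class name ScraperWindow and the helpers FillAndSubmit and SubmitViaScript are hypothetical. The script name jsSubmit is taken from the onclick handler quoted elsewhere in this thread, and calling it without arguments is an assumption.

```csharp
using System;
using System.Windows;
using System.Windows.Navigation;
using System.Windows.Threading;
using mshtml; // requires a reference to the Microsoft.mshtml interop assembly

// Hypothetical window class; names are illustrative, not from the question.
public class ScraperWindow : Window
{
    private const string InputName = "PCL_NO_FROM.PARCEL_RANGE.XTRACKING.1-1-1.";
    private const string SubmitName = "SUBMIT.DUM_CONTROLS.XTRACKING.1-1.";

    private readonly System.Windows.Controls.WebBrowser browser =
        new System.Windows.Controls.WebBrowser();
    private bool submitted; // prevents re-submitting when the result page loads

    public ScraperWindow()
    {
        Content = browser;
        browser.LoadCompleted += Browser_LoadCompleted;
        browser.Navigate(new Uri("http://track.dhl.co.uk/tracking/wrd/run/wt_xtrack_pw.entrypoint"));
    }

    private void Browser_LoadCompleted(object sender, NavigationEventArgs e)
    {
        if (submitted) return;
        // Assumption: deferring the work until the dispatcher is idle lets the
        // control finish its own load processing before the click is issued.
        Dispatcher.BeginInvoke(DispatcherPriority.ApplicationIdle, new Action(FillAndSubmit));
    }

    private void FillAndSubmit()
    {
        var dom = browser.Document as IHTMLDocument3;
        if (dom == null) return;

        // Fill every input that carries the expected name.
        foreach (object item in dom.getElementsByName(InputName))
        {
            var input = item as IHTMLInputElement;
            if (input != null) input.value = "Test";
        }

        // Click the first element that carries the submit button's name.
        foreach (object item in dom.getElementsByName(SubmitName))
        {
            var button = item as IHTMLElement;
            if (button == null) continue;
            submitted = true;
            button.click();
            return;
        }
    }

    // Alternative for the second suggestion: call the page's script function
    // directly instead of clicking. Throws if the page defines no such function.
    private void SubmitViaScript()
    {
        submitted = true;
        browser.InvokeScript("jsSubmit");
    }
}

public static class Program
{
    [STAThread]
    public static void Main()
    {
        new Application().Run(new ScraperWindow());
    }
}
```

If the deferred click still does not navigate, swapping the button.click() call for SubmitViaScript() is one way to test whether the onclick script itself is the sticking point.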
A delay in OnBeforeNavigate can cause click actions to fail.
We have noticed that with some submit actions OnBeforeNavigate is called twice, especially where onClick is used. The first call comes before the onClick action is performed, the second after it completes.
Turn off your BHO, put a breakpoint on onClick, step over the submit action (return jsSubmit()), and then wait a bit; you should be able to reproduce the same issue without your automation. Any delay of more than 150 ms on the second call to OnBeforeNavigate causes some failure in page load or navigation to the result.
Edit:
Having tried our own automation of this DHL page, we don't currently have an issue with the timing described above.