Unable to get the document object of the main page for certain sites in Opera
We are building an Opera extension in which we use the document object property document.body.innerHTML to obtain the source of a site's main page. In most cases it gives us the correct page source, but for certain sites (ones that have multiple document layers) it does not return the top-most document.
For instance, for the site https://www.pcisecuritystandards.org/ we would like the source of the main page that is displayed, but once the entire page has loaded, the document property returns the source of another layer (https://s7.addthis.com/static/r07/sh29.html#cb=0&ab=-&dh=www.pcisecuritystandards.org&dr=&du=https%3A%2F%2Fwww.pcisecuritystandards.org%2F&dt=Official%20PCI%20Security%20Standards%20Council%20Site%20-%20Verify%20PCI%20Compliance%2C%20Download%20Data%20Security%20and%20Credit%20Card%20Security%20Standards&inst=1&lng=en&pc=men&pub=&ssl=1&sid=4d2ee1f94278e71b&srd=1&srf=0.02&srp=0.2&srx=0&ver=250&xck=0&rev=86981&xd=1).
This perhaps has to do with how Opera loads the document layers in a page. We did not face this issue with any other browser.
How can we obtain the source of the main page (https://www.pcisecuritystandards.org/) using the document object in Opera?
Comments (3)
It looks like you may be inside an <iframe>. Injected scripts are injected into the top document as well as into any iframe on the page, so check that you are in the top-most window before grabbing innerHTML, with something like if (window.self == window.top).
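The check suggested above could be sketched roughly as follows. This is a minimal illustration, not the asker's actual extension code: the helper names isTopWindow and getMainPageSource are hypothetical, and the functions take the window as a parameter only so they can be tried outside a browser.

```javascript
// Hypothetical helper: true when `win` is the top-most window,
// false when it is a nested frame (for example an <iframe>).
function isTopWindow(win) {
  return win.self === win.top;
}

// Hypothetical helper: returns the page markup only when running in the
// top-most document, and null inside a frame so that injection is ignored.
function getMainPageSource(win) {
  if (!isTopWindow(win)) {
    return null;
  }
  return win.document.body.innerHTML;
}

// In an injected script, pass the real global window. The guard lets the
// snippet load outside a browser as well.
if (typeof window !== 'undefined') {
  var source = getMainPageSource(window);
  if (source !== null) {
    console.log(source.length + ' characters of main-page markup');
  }
}
```

Returning null from frames, rather than throwing, keeps the copies of the script injected into widgets such as the AddThis frame silent.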
Did you try Dragonfly? It is useful for understanding what is happening in Opera. Right now I'm using Opera 11, and when I go to the page you mentioned, I get a few things in innerHTML.
I can see the news section, so I do get the home page back. Note that some markup has been injected through scripts. Could you paste the piece of code you are using?
We checked and found that in Opera we cannot obtain a reference to the top-most document object, no matter which document or window property we use. In fact, we tried window.top and window.parent, but neither provided a handle on the window object; both properties returned null.
Regarding "Injected scripts will get injected into the top document as well as any iframe on the page": we got an alert (which we included in the injected script) only from the iframe, not from the top-most window. We found what you said to be true only in Google Chrome and Safari.
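To narrow down which document each injection lands in, a small diagnostic along these lines might help. It is only a sketch under assumptions: describeContext is a hypothetical helper, console.log stands in for the alert mentioned above, and the null handling reflects the reported behaviour of window.top rather than anything documented for Opera.

```javascript
// Hypothetical diagnostic: summarises which document a script is running in.
// `win` is any window-like object; nothing here is specific to Opera's API.
function describeContext(win) {
  // The reply above reports window.top coming back as null, so treat a
  // missing value as "cannot tell" instead of comparing against it.
  var hasTop = win.top !== null && win.top !== undefined;
  return {
    href: String(win.location.href),
    hasTop: hasTop,
    // Only meaningful when hasTop is true; null means "could not tell".
    isTop: hasTop ? win.self === win.top : null
  };
}

// In an injected script, log one line per document the script runs in.
if (typeof window !== 'undefined') {
  console.log(JSON.stringify(describeContext(window)));
}
```

Comparing the logged href values would show whether the script ever runs in https://www.pcisecuritystandards.org/ itself or only in the AddThis frame.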