How can I get the original innerHTML source without the content generated by JavaScript?

Posted on 2024-10-06 19:07:43

Is it possible to get the original HTML source in some way, without the changes made by JavaScript that has already run? For example, if I have:

<div id="test">
    <script type="text/javascript">document.write("hello");</script>
</div>

and then run:

alert(document.getElementById('test').innerHTML);

it shows:

<script type="text/javascript">document.write("hello");</script>hello

In simple terms, I would like the alert to show only:

<script type="text/javascript">document.write("hello");</script>

without the final hello (the result of the processed script).

Comments (9)

一口甜 2024-10-13 19:07:43

I don't think there's a simple solution to just "grab original source" as it'll have to be something that's supplied by the browser. But, if you are only interested in doing this for a section of the page, then I have a workaround for you.

You can wrap the section of interest inside a "frozen" script:

<script id="frozen" type="text/x-frozen-html">

The type attribute is one I just made up; because the browser does not recognize it as a script type, it neither executes nor renders anything inside the element. You then add another script tag (proper JavaScript this time) immediately after this one: the "thawing" script. This thawing script gets the frozen script by ID, grabs the text inside it, and does a document.write to add the actual contents to the page. Whenever you need the original source, it is still available as text inside the frozen script.

And there you have it. The downside is that I wouldn't use this for the whole page... (SEO, syntax highlighting, performance...) but it's quite acceptable if you have a special requirement on part of a page.


Edit: Here is some sample code. Also, as @FlashXSFX correctly pointed out, any script tags within the frozen script need to be escaped, because a closing script tag inside it would end the frozen block early. So in this simple example, I'll make up an <x-script> tag for this purpose.

<script id="frozen" type="text/x-frozen-html">
   <div id="test">
      <x-script type="text/javascript">document.write("hello");</x-script>
   </div>
</script>
<script type="text/javascript">
   // Grab contents of frozen script and replace `x-script` with `script`
   function getSource() {
      return document.getElementById("frozen")
         .innerHTML.replace(/x-script/gi, "script");
   }
   // Write it to the document so it actually executes
   document.write(getSource());
</script>

Now whenever you need the source:

alert(getSource());

See the demo: http://jsbin.com/uyica3/edit

倦话 2024-10-13 19:07:43

A simple way is to fetch it from the server again. It will most probably be in the cache. Here is my solution using jQuery.get(). It takes the original URI of the page and loads the data with an Ajax call:

$.get(document.location.href, function(data,status,jq) {console.log(data);})

This prints the original source, without any changes made by JavaScript. Note that it does no error handling!

If you don't want to use jQuery to fetch the source, see the answers to this question: How to make an ajax call without jquery?
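
For reference, here is a minimal sketch of the same idea without jQuery, using the browser's built-in fetch and adding the error handling that the one-liner above skips. The name fetchOriginalSource and its optional fetchImpl parameter are made up for this illustration and are not part of any library.

```javascript
// Hypothetical helper (not from the original answer): re-request a URL and
// resolve with the markup exactly as the server sent it.
// `fetchImpl` falls back to the global fetch; it is a parameter only so the
// function can be exercised with a stub outside the browser.
async function fetchOriginalSource(url, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  // "force-cache" asks the browser to reuse a cached copy when it has one.
  const response = await doFetch(url, { cache: "force-cache" });
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  return response.text();
}

// Possible use in a page:
//   fetchOriginalSource(document.location.href)
//     .then(function (html) { console.log(html); })
//     .catch(function (err) { console.error(err); });
```

As with the jQuery version, this only returns the same markup if the server sends the same response for a repeated request.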

追风人 2024-10-13 19:07:43

Could you send an Ajax request to the same page you're currently on and use the result as your original HTML? This is foolproof given the right conditions, since you are literally getting the original HTML document. However, this won't work if the page changes on every request (with dynamic content), or if, for whatever reason, you cannot make a request to that specific page.

梦里梦着梦中梦 2024-10-13 19:07:43

Brute force approach

var orig = document.getElementById("test").innerHTML;
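// Drop everything after the first closing script tag; [\s\S] matches any character, including line breaks
alert(orig.replace(/<\/script>[\s\S]*/i, "<\/script>"));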

EDIT:

This could be better

var orig = document.getElementById("test").innerHTML + "<<>>";
// "<<>>" is a sentinel appended above; remove everything between the closing tag and the sentinel
alert(orig.replace(/<\/script>[\s\S]*?<<>>/i, "<\/script>"));

后eg是否自 2024-10-13 19:07:43

If you override document.write to add some identifiers at the beginning and end of everything written to the document by the script, you will be able to remove those writes with a regular expression.

Here's what I came up with:

    <script type="text/javascript" language="javascript">
        var docWrite = document.write;
        document.write = myDocWrite;

        function myDocWrite(wrt) {
            docWrite.apply(document, ['<!--docwrite-->' + wrt + '<!--/docwrite-->']);
        }
    </script>

Added your example somewhere in the page after the initial script:

    <div id="test">
        <script type="text/javascript">     document.write("hello");</script>
    </div>

Then I used this to alert what was inside:

    // [\s\S] also matches line breaks, so multi-line writes are removed too
    var regEx = /<!--docwrite-->[\s\S]*?<!--\/docwrite-->/g;
    alert(document.getElementById('test').innerHTML.replace(regEx, ''));
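
One gap in the override above: document.write accepts several string arguments, and document.writeln is a second entry point, so a wrapper that forwards only its first parameter can miss output. A hedged sketch of a variadic version follows. The names markWrites and stripMarkedWrites are invented here, and the helper takes the document object as a parameter so it can be tried against a stand-in object.

```javascript
var OPEN = "<!--docwrite-->";
var CLOSE = "<!--/docwrite-->";

// Hypothetical helper: replace write and writeln on `doc` so that everything
// they emit is bracketed by marker comments.
function markWrites(doc) {
  var write = doc.write;
  function emit(text) {
    return write.call(doc, OPEN + text + CLOSE);
  }
  doc.write = function () {
    // Join all arguments into one string, as the native method does.
    return emit(Array.prototype.join.call(arguments, ""));
  };
  doc.writeln = function () {
    // Keep the trailing newline inside the markers so it is stripped too.
    return emit(Array.prototype.join.call(arguments, "") + "\n");
  };
}

// Remove every marked region; [\s\S] also matches line breaks.
function stripMarkedWrites(html) {
  return html.replace(/<!--docwrite-->[\s\S]*?<!--\/docwrite-->/g, "");
}
```

In a page, markWrites(document) would run in the first script, and stripMarkedWrites(document.getElementById('test').innerHTML) would be called later.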

浴红衣 2024-10-13 19:07:43

If you want the pristine document, you'll need to fetch it again. There's no way around that. If it weren't for the document.write() (or similar code that would run during the load process) you could load the original document's innerHTML into memory on load/domready, before you modify it.

眼波传意 2024-10-13 19:07:43

I can't think of a solution that would work the way you're asking. The only code that Javascript has access to is via the DOM, which only contains the result after the page has been processed.

The closest I can think of to achieve what you want is to use Ajax to download a fresh copy of the raw HTML for your page into a Javascript string, at which point since it's a string you can do whatever you like with it, including displaying it in an alert box.
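
Once the raw HTML is in a string, one way to pull out just the part of interest is to parse it with DOMParser, which builds a separate document and does not run the scripts it contains. The sketch below assumes a browser environment. extractOriginalInnerHTML is a made-up name, and the parser is passed in as an argument. The result is re-serialized by the browser, so it may differ from the file in details such as attribute quoting or tag case.

```javascript
// Hypothetical helper: return the innerHTML of element `id` as it appears in
// the markup string `html`, before any script has had a chance to run.
// `parser` is expected to behave like a browser DOMParser instance.
function extractOriginalInnerHTML(html, id, parser) {
  var doc = parser.parseFromString(html, "text/html");
  var element = doc.getElementById(id);
  return element ? element.innerHTML : null;
}

// In a browser, with `html` being the text fetched via Ajax:
//   alert(extractOriginalInnerHTML(html, "test", new DOMParser()));
```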

羁〃客ぐ 2024-10-13 19:07:43

A trickier way is to use a <style> tag as the template container, so that you no longer need to rename the tag to x-script.

<style id="test" type="text/html+template">
    <script type="text/javascript">document.write("hello");</script>
</style>
<script type="text/javascript">
    console.log(document.getElementById('test').innerHTML);
</script>

But I do not like this ugly solution.

吃颗糖壮壮胆 2024-10-13 19:07:43

I think you want to traverse the DOM nodes:

function getScriptSource(id) {
    var childNodes = document.getElementById(id).childNodes, i, output = [];

    // Keep only <script> children; innerHTML is the script body (use outerHTML to include the tags)
    for (i = 0; i < childNodes.length; i++)
        if (childNodes[i].nodeName == "SCRIPT")
            output.push(childNodes[i].innerHTML);

    return output.join('');
}