Grabbing HTML from a rectangle

Posted 2024-12-14 18:13:05

What I would like to do is allow the user to draw a rectangle on top of a website and grab all the HTML they see in that rectangle.

I know this can't be done perfectly but I was wondering how well it could be done.

I was thinking of doing something like this:

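function getTagsInArea(p1, p2) {
    // A Set keeps each element once. Using elements as object keys would
    // stringify them (e.g. "[object HTMLDivElement]") and merge distinct nodes.
    var ret = new Set();
    // Sample the rectangle every 10px; coordinates are viewport-relative.
    for (var x = p1.x; x < p2.x; x += 10) {
        for (var y = p1.y; y < p2.y; y += 10) {
            var el = document.elementFromPoint(x, y);
            // elementFromPoint returns null for points outside the viewport.
            if (el !== null && !ret.has(el)) {
                ret.add(el);
            }
            else { console.log('not appending ' + el); }
        }
    }
    return ret;
}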

This gives me more or less the tags in that area. I wonder if there is a generic way to build a tree from these tags and output HTML.

I am looking for something like a DocumentFragment, such as selectionContents in this snippet:

var range = window.getSelection().getRangeAt(0);
var selectionContents = range.extractContents();

Is there an obvious way to do this?
One of the problems so far is that some of the tags I get from the function above are wrappers such as 'body' and 'div id="page"', which merely contain what I am looking for. Any solution would need to figure out how to take only those parts of the surrounding tags that are needed.
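
One possible way to drop the wrapper elements, as a minimal sketch only: keep just the hits that do not contain any other hit. The name innermostOnly is hypothetical, and the sketch assumes the hits are available as an array or Set of elements. It does not cut a partially covered element down to its visible part.

```javascript
// Hypothetical helper: keeps only the hit elements that do not contain
// another hit element, which drops wrappers such as body or div#page.
function innermostOnly(elements) {
    var list = Array.from(elements);
    return list.filter(function (el) {
        return !list.some(function (other) {
            // Node.contains(x) is also true when x is the node itself,
            // so the element is never compared against itself.
            return other !== el && el.contains(other);
        });
    });
}
```

The filter compares every pair of hits, which should be acceptable for the few dozen elements a 10px sampling grid typically returns.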

For instance, if I have a long paragraph and draw the rectangle over half of it, I want only the text inside my selection to be returned.

Hope this question makes sense.

世界如花海般美丽 2024-12-21 18:13:17

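If you know the area of the rectangle, you could wrap every character of the text in a span tag (for example, each letter of "this") and then check whether those span tags fall within your rectangle. To make that easier, I would assign a class to those span tags. Then build the text by looping over the span tags with $.each(), checking whether each one falls within the rectangle and, if it does, appending it to a string variable. Also, if you use document.elementFromPoint(), it will only return the topmost element at that position, the one with the highest z-index (if you have layered elements).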
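
The check described above could be sketched roughly as follows. This uses a plain loop instead of $.each(), and the names isInside and textInRect, as well as the class name "ch", are assumptions made for illustration. Rectangles use the { left, top, right, bottom } shape that getBoundingClientRect() returns.

```javascript
// Hypothetical helper: true when box lies entirely inside rect.
// Both arguments use viewport coordinates, as getBoundingClientRect() does.
function isInside(box, rect) {
    return box.left >= rect.left && box.right <= rect.right &&
           box.top >= rect.top && box.bottom <= rect.bottom;
}

// Hypothetical helper: concatenates the text of every span inside rect.
// `spans` is any iterable of elements, for example the result of
// document.querySelectorAll('span.ch'), where "ch" is an assumed class name.
function textInRect(spans, rect) {
    var text = '';
    for (var span of spans) {
        if (isInside(span.getBoundingClientRect(), rect)) {
            text += span.textContent;
        }
    }
    return text;
}
```

With the corner points p1 and p2 from the question, a call might look like textInRect(document.querySelectorAll('span.ch'), { left: p1.x, top: p1.y, right: p2.x, bottom: p2.y }). Characters that only partly overlap the rectangle are left out, and relaxing isInside to an overlap test would include them.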
