JavaScript: problem finding HTML elements in the source code of another website
I am having trouble finding individual HTML elements in the downloaded source code of a selected page. When I call $(data).find('p').length it returns 2, which is correct, but $(data).find('img').length returns 0 when it should be 1.
async function getErrors() {
    await $.ajax({
        url: 'http://example.com',
        method: 'get'
    })
    .done(async (siteText) => {
        var data = $.parseHTML(siteText);
        console.log(data);
        console.log($(data).find('p').length);
        console.log($(data).find('img').length);
        await axios.get('http://anothersite.com')
            .then((response) => {
                //do something...
            });
    });
}
Live example:
var siteText = `<!DOCTYPE html>
<html lang="pl">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Test Site</title>
<style>
.black{
background-color: black;
color: #333131;
}
</style>
</head>
<body>
<h1>Strona Testowa</h1>
<div>
<h2>Lorem Ipsum</h2>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Convallis aenean et tortor at risus. Pellentesque habitant morbi tristique senectus. Nisi est sit amet facilisis. Vel elit scelerisque mauris pellentesque pulvinar. Quisque egestas diam in arcu. Elit at imperdiet dui accumsan sit amet nulla. Urna porttitor rhoncus dolor purus non enim praesent elementum. Velit dignissim sodales ut eu sem integer vitae justo eget. Lacus suspendisse faucibus interdum posuere lorem. Et ultrices neque ornare aenean euismod. Porttitor eget dolor morbi non. Sit amet consectetur adipiscing elit. Amet nisl suscipit adipiscing bibendum est. Eu non diam phasellus vestibulum. Neque convallis a cras semper auctor. Risus at ultrices mi tempus imperdiet nulla malesuada pellentesque elit. Et molestie ac feugiat sed lectus vestibulum. Adipiscing diam donec adipiscing tristique risus nec. Imperdiet proin fermentum leo vel. Nibh mauris cursus mattis molestie a iaculis at erat pellentesque. Elementum integer enim neque volutpat ac tincidunt vitae semper. Nam libero justo laoreet sit. Nibh tortor id aliquet lectus proin nibh nisl condimentum id. Et sollicitudin ac orci phasellus egestas tellus. Nunc sed augue lacus viverra vitae congue eu. Dui vivamus arcu felis bibendum ut. Mattis nunc sed blandit libero volutpat sed. Commodo sed egestas egestas fringilla phasellus faucibus scelerisque eleifend. Velit aliquet sagittis id consectetur purus ut faucibus pulvinar elementum. Quam vulputate dignissim suspendisse in est ante in nibh. Accumsan sit amet nulla facilisi morbi. Ac ut consequat semper viverra. Viverra tellus in hac habitasse platea dictumst. Donec ultrices tincidunt arcu non sodales neque. In est ante in nibh mauris. Mattis enim ut tellus elementum sagittis. Consectetur adipiscing elit pellentesque habitant morbi tristique senectus et netus. Sed id semper risus in. 
Vestibulum lectus mauris ultrices eros in cursus turpis massa. Vitae tempus quam pellentesque nec nam aliquam sem et tortor. In arcu cursus euismod quis viverra nibh cras. Sit amet consectetur adipiscing elit duis tristique. Augue ut lectus arcu bibendum at varius vel pharetra vel. Pharetra magna ac placerat vestibulum lectus mauris ultrices eros in. Libero nunc consequat interdum varius sit amet mattis vulputate. Netus et malesuada fames ac. In pellentesque massa placerat duis ultricies lacus sed turpis tincidunt. Tellus in hac habitasse platea dictumst vestibulum rhoncus est pellentesque. Duis convallis convallis tellus id interdum velit laoreet. Et tortor consequat id porta nibh venenatis cras. Laoreet sit amet cursus sit amet dictum sit amet justo.</p>
</div>
<img src="https://png.pngtree.com/png-clipart/20190108/ourmid/pngtree-tree-green-plant-photography-png-png-image_305004.jpg" >
<iframe width="560" height="315" src="https://www.youtube.com/embed/gK8s4LUJ7NE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<div class="black">
<p class="black">Lorem Ipsum</p>
</div>
</body>
</html>`;
var data = $.parseHTML(siteText);
console.log(data);
console.log($(data).find('p').length);
console.log($(data).find('img').length);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
As an alternative, you can call html() on a newly created element to parse your HTML. find() then works, because it searches the descendants of that new element.

Detailed explanation:

HTML parsed by parseHTML() or html() ignores the <html>, <head> and <body> tags. parseHTML() therefore returns a flat array of the nodes that were inside <head> and <body>. When you wrap that array directly in a jQuery object, find() searches only the descendants of each element in the array, never the elements themselves. That's why find() can't find the direct children of <body>, such as your <img>, while it does find the <p> elements, which are nested inside <div> elements. filter() works because it tests the elements of the array themselves.

By wrapping the result in a new element, find() works correctly on the full <body> content, since those nodes are now all children of the new element.
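To make the difference concrete, here is a minimal sketch of the two approaches described above. The helper names countInMarkup and countTopLevel and the short test markup are made up for illustration; they assume jQuery 3.x is loaded in a browser and are not part of jQuery or of the original question.

```javascript
// Sketch only: countInMarkup and countTopLevel are hypothetical helper names.
// `$` is passed in explicitly so the helpers do not rely on a global jQuery.

// Parse the markup inside a detached <div>. Every parsed node becomes a
// descendant of that <div>, so find() can reach all of them.
function countInMarkup($, markup, selector) {
  return $('<div>').html(markup).find(selector).length;
}

// Count matches among the top-level nodes returned by $.parseHTML().
// filter() tests the nodes in the set themselves, not their descendants.
function countTopLevel($, markup, selector) {
  return $($.parseHTML(markup)).filter(selector).length;
}

// Browser-only demo; skipped when jQuery is not available.
if (typeof window !== 'undefined' && window.jQuery) {
  const $ = window.jQuery;
  // Assumed test markup: one nested <p>, one top-level <img>, one top-level <p>.
  const markup =
    '<html><body><div><p>one</p></div><img alt="demo"><p>two</p></body></html>';

  // find() on the raw parseHTML() result misses top-level nodes.
  console.log($($.parseHTML(markup)).find('img').length); // 0

  // The wrapper <div> makes every node a descendant.
  console.log(countInMarkup($, markup, 'img')); // 1
  console.log(countInMarkup($, markup, 'p'));   // 2

  // filter() sees only top-level nodes.
  console.log(countTopLevel($, markup, 'img')); // 1
  console.log(countTopLevel($, markup, 'p'));   // 1
}
```

One point to keep in mind with markup from another site: when the string contains script tags, html() inserts it through jQuery's DOM-manipulation path, which can execute those scripts, whereas $.parseHTML() drops scripts by default. If that matters, $('<div>').append($.parseHTML(markup)) is a possible variation on the same idea.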
I tried your code with another site and it works fine. I modified your JS to temporarily get rid of async/await: