Extracting data from a web page using the Cheerio library

Posted on 2025-02-08 10:28:18

I am trying to scrape one small piece of information from a webpage using Cheerio and Google Apps Script. I want to get the performance score shown on the PageSpeed Insights report page whose URL appears in the code below.

Here is the code snippet I am using:

function LinkResult() {
  var url = 'https://pagespeed.web.dev/report?url=http%3A%2F%2Fwww.juicecoldpressed.com%2F';

  var result = UrlFetchApp.fetch(url);
  var content = Cheerio.load(result.getContentText());
  var item = content(".lh-gauge__percentage").text();

  Logger.log(item);
}

When I run this code, the variable item is empty and nothing useful is logged. I am surely missing something; can you please guide me? Thank you.


宫墨修音 2025-02-15 10:28:18

Issue and workaround:

In this case, I'm afraid your goal cannot be achieved directly using the URL https://pagespeed.web.dev/report?url=http%3A%2F%2Fwww.juicecoldpressed.com%2F and Cheerio. The HTML returned by UrlFetchApp.fetch(url) is different from what the browser shows, because UrlFetchApp does not execute the page's JavaScript, and it seems that the value is calculated by a script after the page loads.
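
One way to confirm this kind of problem is to check whether the class you are selecting appears anywhere in the raw HTML before handing it to Cheerio. The following is a minimal sketch in plain JavaScript; the helper name htmlHasClass and both HTML strings are made up for illustration and are not the real page source.

```javascript
// Returns true if the raw HTML text contains an element whose class
// attribute includes the given class name.
function htmlHasClass(html, className) {
  const classAttr = /class\s*=\s*(["'])(.*?)\1/g;
  let match;
  while ((match = classAttr.exec(html)) !== null) {
    if (match[2].split(/\s+/).includes(className)) {
      return true;
    }
  }
  return false;
}

// Hypothetical markup, for illustration only (not the real page source).
// staticShell stands in for what a plain HTTP fetch might return;
// renderedDom stands in for what the browser shows after scripts have run.
const staticShell = '<html><body><div id="app"></div><script src="main.js"></script></body></html>';
const renderedDom = '<div class="lh-gauge__wrapper"><div class="lh-gauge__percentage">57</div></div>';

console.log(htmlHasClass(staticShell, "lh-gauge__percentage")); // false
console.log(htmlHasClass(renderedDom, "lh-gauge__percentage")); // true
```

In Apps Script, the same check could be run on result.getContentText(). If it returns false, no selector passed to Cheerio will find the element, because the element is simply not in the fetched markup.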

Fortunately, in your situation, I think the values can be retrieved using the PageSpeed Insights API, so in this answer I propose achieving your goal with that API.

Usage:

1. Get Started with the PageSpeed Insights API.

Please check the official documentation for the PageSpeed Insights API. In this case, an API key is required. Also, please enable the PageSpeed Insights API in the API console.

2. Sample script.

function myFunction() {
  const apiKey = "###"; // Please set your API key.
  const url = "http://www.juicecoldpressed.com/"; // Please set URL.

  // Base request URL; only the performance category is requested.
  const apiEndpoint = `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?key=${apiKey}&url=${encodeURIComponent(url)}&category=performance`;
  const strategy = ["desktop", "mobile"];

  // Send one request per strategy. With muteHttpExceptions, an error
  // response is returned instead of throwing an exception.
  const res = UrlFetchApp.fetchAll(strategy.map(e => ({ url: `${apiEndpoint}&strategy=${e}`, muteHttpExceptions: true })));

  // Map each strategy to its score in percent, or null if the request failed.
  const values = res.reduce((o, r, i) => {
    if (r.getResponseCode() === 200) {
      const obj = JSON.parse(r.getContentText());
      // The API reports the score on a 0 to 1 scale.
      o[strategy[i]] = obj.lighthouseResult.categories.performance.score * 100;
    } else {
      o[strategy[i]] = null;
    }
    return o;
  }, {});

  console.log(values);
}
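
If you want to inspect or reuse the request URL on its own, the string building in the script above can be pulled out into a small helper. This is only a sketch: buildPsiUrl and its parameter names are hypothetical, and "YOUR_API_KEY" is a placeholder. Unlike the script above, it also URL-encodes the key, which is an extra precaution rather than something the API is known to require.

```javascript
// Builds a runPagespeed request URL. The function and parameter names are
// illustrative and are not part of any Google library.
function buildPsiUrl(apiKey, pageUrl, strategy) {
  const base = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed";
  const params = {
    key: apiKey,
    url: pageUrl,
    category: "performance",
    strategy: strategy,
  };
  // Encode every name and value so that characters such as ":" and "/"
  // in the target URL do not break the query string.
  const query = Object.keys(params)
    .map((name) => `${encodeURIComponent(name)}=${encodeURIComponent(params[name])}`)
    .join("&");
  return `${base}?${query}`;
}

// "YOUR_API_KEY" is a placeholder, not a working key.
const requestUrl = buildPsiUrl("YOUR_API_KEY", "http://www.juicecoldpressed.com/", "mobile");
console.log(requestUrl);
```

The resulting string can be passed to UrlFetchApp.fetch or used as the url property of a fetchAll request in the same way as in the sample script.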

3. Testing.

When this script is run, the log shows the returned value as { desktop: ##, mobile: ## }. The desktop and mobile values are the performance scores, in percent, for the desktop and mobile strategies, respectively.
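
Because the score comes back on a 0 to 1 scale, multiplying by 100 can leave floating-point noise (in JavaScript, 0.57 * 100 evaluates to 56.99999999999999). A minimal sketch of a more defensive conversion follows; performancePercent is a hypothetical helper, and the sample body is a heavily trimmed stand-in for a real response, not actual API output.

```javascript
// Converts the text of a runPagespeed response into a whole-number
// percentage, or returns null when the score is not present.
function performancePercent(bodyText) {
  let obj;
  try {
    obj = JSON.parse(bodyText);
  } catch (e) {
    return null; // The body was not valid JSON.
  }
  const categories = obj && obj.lighthouseResult && obj.lighthouseResult.categories;
  const score = categories && categories.performance && categories.performance.score;
  if (typeof score !== "number") {
    return null; // For example, an error response without lighthouseResult.
  }
  return Math.round(score * 100);
}

// Hypothetical, heavily trimmed response body, for illustration only.
const sampleBody = JSON.stringify({
  lighthouseResult: { categories: { performance: { score: 0.57 } } },
});

console.log(performancePercent(sampleBody)); // 57
console.log(performancePercent('{"error":{"code":400}}')); // null
```

In the sample script, this could replace the JSON.parse and score lines by calling performancePercent(r.getContentText()).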
