Apify: how to stop the enqueueLinks function from opening a new page in the browser

Posted on 2025-02-07 08:47:21

I am trying to collect links from a page and then navigate to the next page by clicking a button. I first need to add all the links on the first page to the queue. However, the enqueueLinks function seems to open a new page, which causes a "Node not found" error when I try to click an element.
Any advice would be helpful!

exports.handleList = async ({ request, page }, requestQueue) => {
    await Apify.utils.enqueueLinks({
        page: page,
        requestQueue: requestQueue,
        selector: "#traziPoduzeca > tbody > tr > td > span > a",
        baseUrl: "https://www.fininfo.hr/Poduzece/[.*]",
        transformRequestFunction: (request) => {
            request.userData = {
                label: "DETAIL",
            };
            return request;
        },
    });

    log.info('about to scrap urls')

    
    await page.waitForTimeout(120);
    
    let btn_selector =
        "//div[@class='contentNav'][1]//div[@class='pagination'][1]//span[@class='current']/following-sibling::a";
    let buttonEle = await page.$x(btn_selector);

    log.info(`length of pagination is ${buttonEle.length}`);

    for (let index = 0; index < buttonEle.length; index++) {
        await page.waitForTimeout(120);

        await buttonEle[index].click();

        log.info("clicked");

        await page.waitForTimeout(200);

        await Apify.utils.enqueueLinks({
            page: page,
            requestQueue: requestQueue,
            selector: "#traziPoduzeca > tbody > tr > td > span > a",
            baseUrl: "https://www.fininfo.hr/Poduzece/[.*]",
            transformRequestFunction: (request) => {
                request.userData = {
                    label: "DETAIL",
                };
                return request;
            },
        });
    }
};

Comments (1)

疧_╮線 2025-02-14 08:47:21

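It's not the enqueueLinks function that is causing the error; it is the navigation triggered by the click. Element handles obtained before a navigation no longer point at anything once the new page has loaded, so clicking them fails. You need to await the navigation if you want to stay on the same page, and look the button up again each time, something like this: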

const btn_selector =
    "//div[@class='contentNav'][1]//div[@class='pagination'][1]//span[@class='current']/following-sibling::a";

for (;;) {
    // The button handle changes every time a navigation happens,
    // so it has to be looked up again on each iteration.
    const [buttonEle] = await page.$x(btn_selector);

    if (!buttonEle) {
        log.info("no pagination element found");
        break;
    }

    await page.waitForTimeout(120);

    // Start waiting for the navigation before clicking, so the navigation
    // caused by the click is awaited and the same page object keeps being used.
    await Promise.all([
        page.waitForNavigation(),
        buttonEle.click(),
    ]);

    log.info("clicked");

    await page.waitForTimeout(200);

    // enqueue links...
}
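
The loop above can also be pulled into a small helper, which makes the pattern easier to reuse and to try without a browser. This is only a sketch under stated assumptions: `paginate` and `createFakePage` are hypothetical names that do not come from Apify or Puppeteer, and the fake page is a stand-in that mimics just `page.$x()` and `page.waitForNavigation()` so the snippet runs on its own.

```javascript
// Hypothetical helper: visits every results page by clicking the "next" link.
// `page` only needs Puppeteer-style page.$x() and page.waitForNavigation().
// `onPage` is supplied by the caller, e.g. a function that enqueues links.
async function paginate(page, nextXPath, onPage) {
    let pagesSeen = 0;
    for (;;) {
        // Process whatever page is currently loaded (including the first one).
        await onPage(page);
        pagesSeen += 1;

        // Look the link up again: handles from before a navigation are stale.
        const [nextLink] = await page.$x(nextXPath);
        if (!nextLink) break; // no following link, so this was the last page

        // Start waiting before clicking so the navigation cannot be missed.
        await Promise.all([
            page.waitForNavigation(),
            nextLink.click(),
        ]);
    }
    return pagesSeen;
}

// Stand-in for a browser page, present only so the sketch runs without Puppeteer.
function createFakePage(totalPages) {
    let current = 1;
    let pendingNavigation = null;
    return {
        currentPage: () => current,
        // Returns one clickable "next" handle until the last page is reached.
        $x: async () => (current < totalPages
            ? [{
                click: async () => {
                    current += 1;
                    if (pendingNavigation) pendingNavigation();
                },
            }]
            : []),
        // Resolves when the fake click above "navigates".
        waitForNavigation: () => new Promise((resolve) => {
            pendingNavigation = resolve;
        }),
    };
}
```

In the real handler, `onPage` would presumably wrap the `Apify.utils.enqueueLinks` call from the question, and `nextXPath` would be the `btn_selector` string. Because the callback runs at the top of each iteration, the first page is handled by the same code path as the later ones.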
