剧作家的零工贝壳

发布于 2025-02-05 16:24:01 字数 386 浏览 1 评论 0原文

是否可以在砂壳中援引剧作家？

我想使用壳测试我的XPATH，我打算将其放在结合剧作家的蜘蛛中。

我的零工设置文件具有通常的剧作家设置：

# Scrapy Playwright Setup
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

原文

Is it possible to invoke Playwright in a Scrapy shell?

I would like to use a shell to test my xpaths, which I intend to place in a spider that incorporates Scrapy Playwright.

My scrapy settings file has the usual Playwright setup:

# Scrapy Playwright Setup
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

难得心□动 2025-02-12 16:24:03

是的，有可能。实际上，您所要做的就是只是在包含废纸项目的文件夹中运行废壳。它将自动从settings.py加载所有默认设置。运行零工外壳时，您可以在日志上看到它。

另外，您可以使用-s参数覆盖设置。

scrapy shell -s DOWNLOAD_HANDLERS='<<your custom handlers>>'

快乐刮擦:)

Yes, It is possible. In fact, all you have to do is just running scrapy shell inside a folder that contains a scrapy project. It will automatically load all the default settings from settings.py. You can see it on the logs when running scrapy shell.

Also, You can override settings using the -s parameters.

scrapy shell -s DOWNLOAD_HANDLERS='<<your custom handlers>>'

Happy Scraping :)

回复收藏 0 原文

反目相谮 2025-02-12 16:24:03

我也有同样的问题。除了您在设置中使用的剧作家配置，并从该纸巾项目中运行外壳外，我还必须通过kwarg来启动外壳后进行获取，如下：

scrapy shell
fetch('<url-of-request>', meta={"playwright": True})

然后，您可以按照平常的方式运行命令scrapy壳，例如：

view(response)

I had the same issue. In addition to the Playwright configuration you have in your settings.py, and running your shell from within that scrapy project, I had to pass a kwarg to fetch after starting the shell, like this:

scrapy shell
fetch('<url-of-request>', meta={"playwright": True})

You can then run commands as you normally would in scrapy shell, such as:

view(response)

回复收藏 0 原文

没︽人懂的悲伤 2025-02-12 16:24:03

我相信Shell Command可能与剧作家无法使用。在这里，我使用Python3作为演示：

此文档链接应进一步帮助您：
https://playwright.dev/python/docs/intro#interactive-mode- REPL

我相信，您只需要在交互式模式下需要Python3或Python3而不是外壳。这样，您可以完成scrapy shell从未做到的自动完成。

这是一个名为spider_interactive.py的文件中的同步示例：

from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
browser = playwright.firefox.launch()
page = browser.new_page()
page.goto("http://whatsmyuseragent.org/")

#Remember to run these manually when your done to prevent left over garbage on the machine.
#browser.close()
#playwright.stop()

运行：

python3 -i spider_interactive.py

，然后您可以输入以下命令：

page.locator("p.intro-text").all_inner_texts()

returns： returns

['mozilla/5.0（x11; linux x86_64; rv：rv：100.0）gecko/gecko/gecko/gecko/gecko/gecko/ 20100101 Firefox/100.0'，'我的IP地址：your_ip_address_here]

I believe the shell command might not be possible to do with scrapy playwright. Here i am using python3 as demonstration:

This documentation link should help you further:
https://playwright.dev/python/docs/intro#interactive-mode-repl

I believe instead of shell you just need python3 or python3 in interactive mode. This way you have auto complete which the scrapy shell never did.

Here is the synchronous example in a file called spider_interactive.py:

from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
browser = playwright.firefox.launch()
page = browser.new_page()
page.goto("http://whatsmyuseragent.org/")

#Remember to run these manually when your done to prevent left over garbage on the machine.
#browser.close()
#playwright.stop()

Run with:

python3 -i spider_interactive.py

Then you can enter for example the following command:

page.locator("p.intro-text").all_inner_texts()

returns

['Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0', 'My IP Address: your_ip_address_here]

回复收藏 0 原文

~没有更多了~

关于作者

挽手叙旧

暂无简介

文章

26 人气

关注发私信

友情链接

文江博客

剧作家的零工贝壳

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

十二

飞烟轻若梦

OPleyuhuo

wxb0109

旧城空念

-小熊_

友情链接

剧作家的零工贝壳

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

十二

飞烟轻若梦

OPleyuhuo

wxb0109

旧城空念

-小熊_

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。