Is it possible to invoke Playwright in a Scrapy shell?
I would like to use a shell to test my xpaths, which I intend to place in a spider that incorporates Scrapy Playwright.
My scrapy settings file has the usual Playwright setup:
# Scrapy Playwright Setup
DOWNLOAD_HANDLERS = {
"http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
"https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
3 Answers
Yes, it is possible. In fact, all you have to do is run scrapy shell inside a folder that contains a Scrapy project. It will automatically load all the default settings from settings.py; you can see this in the logs when the shell starts.
You can also override settings using the -s parameter. Happy scraping :)
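For instance, a typical session might look like this (the project directory, URL, and overridden setting are illustrative, not from the original answer):

```shell
# Run from inside the project directory so settings.py (including the
# scrapy-playwright handlers and reactor) is picked up automatically:
cd my_scrapy_project
scrapy shell "https://example.org"

# Individual settings can be overridden on the command line with -s:
scrapy shell "https://example.org" -s ROBOTSTXT_OBEY=False
```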
I had the same issue. In addition to the Playwright configuration you have in your settings.py, and running your shell from within that Scrapy project, I had to pass a kwarg to fetch after starting the shell, like this:
You can then run commands as you normally would in the scrapy shell, such as:
I believe the shell command might not be possible to use with scrapy-playwright. Here I am using python3 as a demonstration:
This documentation link should help you further:
https://playwright.dev/python/docs/intro#interactive-mode-repl
I believe that instead of the shell you just need python3 in interactive mode (python3 -i). This way you get autocompletion, which the scrapy shell never had.
Here is the synchronous example in a file called spider_interactive.py:
Run with:
python3 -i spider_interactive.py
Then you can enter for example the following command:
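The exact REPL expression from the original answer was lost. Given the output shown below, it was presumably of this shape; the `session_summary` helper name and the `text_content` selector are mine, not from the answer:

```python
# Hypothetical reconstruction of the lost REPL expression. At the
# `python3 -i` prompt, `page` comes from spider_interactive.py and you
# would type something like: session_summary(page)
def session_summary(page):
    return [
        page.evaluate("navigator.userAgent"),
        "My IP Address: " + page.text_content("body").strip(),
    ]
```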
returns
['Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0', 'My IP Address: your_ip_address_here']