Python yield vs. return statements, and yielding Requests in Scrapy

Posted 2024-12-25 20:21:50

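What is the difference between yield and return? Please explain with an example. And what actually happens when a generator yields a value or a request?

I am not calling my generator from any other function or program.

My loop is: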

for index in range(3):
    yield Request(url, callback=parse)

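This makes requests to the given URL and calls the callback function after each request. What exactly is this code doing?

And in what order does the code run?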



往日情怀 2025-01-01 20:21:50

I guess you ran into this puzzle in a start_requests() method that contains yield.

For example:

def start_requests(self):
    urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]
    for url in urls:
        yield scrapy.Request(url=url, callback=self.parse)

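The Scrapy spider documentation says that start_requests() must return an iterable of Requests. A function that contains yield is a generator function, and calling it returns a generator, which is an iterable. If you change yield to return inside the loop, the method returns a single Request object on the first pass and the loop never reaches the remaining URLs. A single Request is not an iterable, so Scrapy cannot loop over the result, and things can get messy.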
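The difference between the two keywords can be seen without Scrapy at all. The following is a minimal sketch, not Scrapy's own code: `FakeRequest`, `with_return` and `with_yield` are hypothetical names invented here so the example runs with only the standard library.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request (assumption, not the real class)."""
    url: str


URLS = [
    "http://quotes.toscrape.com/page/1/",
    "http://quotes.toscrape.com/page/2/",
]


def with_return():
    for url in URLS:
        # return ends the function on the first pass through the loop,
        # so only the first URL is ever used.
        return FakeRequest(url)


def with_yield():
    for url in URLS:
        # yield hands back one request and pauses here until the caller
        # asks for the next one.
        yield FakeRequest(url)


single = with_return()
print(isinstance(single, Iterable))        # False: one object, not an iterable
print([req.url for req in with_yield()])   # both URLs, produced one at a time
```

With return, the caller gets one object for the first URL only. With yield, the caller gets a generator it can loop over, and every URL is produced.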

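It is natural for the spider to hand over its requests to these destinations one at a time, so a generator is a good fit. You never call start_requests() yourself: Scrapy calls it when the spider starts and iterates over the result. Each time Scrapy asks for the next item, the method runs until it reaches yield, hands back a scrapy.Request, and pauses there. When Scrapy asks again, execution resumes right after the yield and moves on to the next URL in the list. Once the response for a request has been downloaded, Scrapy calls that request's callback (self.parse here) with the response.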
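The pause-and-resume order can be traced in plain Python. This is a simplified sketch under stated assumptions: the consumer code at the bottom plays the role that Scrapy plays for a real spider, the strings stand in for Request objects, and the `events` list is a hypothetical helper used only to record the order of execution.

```python
events = []  # hypothetical trace of what runs, and in what order


def start_requests():
    events.append("body started")
    for index in range(3):
        # Pause here and hand the value to whoever is iterating.
        yield f"request-{index}"
        events.append(f"resumed after request-{index}")
    events.append("body finished")


gen = start_requests()   # only creates a generator object
print(events)            # still empty: none of the body has run yet

first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # keeps asking for the next item until the generator ends

print(first)
print(rest)
for line in events:
    print(line)
```

Calling the function does not run its body. Each request for the next item runs the body up to the following yield, and the generator ends once the loop is exhausted.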

