Problem downloading images
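I have built a simple scraper to download images from a website. Unfortunately, nothing gets downloaded. I have searched online for similar issues and tried the suggested fixes, but none of them work for me. I have had this working in the past, so I cannot understand why it does not work now.
My scraper: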
import scrapy
from scrapy_exercises.items import ScrapyExercisesItem


class TestSpider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://www.meadowhall.co.uk/eatdrinkshop?page=1']

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse
            )

    def parse(self, response):
        content_page = response.xpath("//div[@class='view-content']//div")
        for cnt in content_page:
            link = cnt.xpath('.//a/@href').get()
            image_url = cnt.xpath(".//img//@src").get()
            if link != None:
                items = ScrapyExercisesItem()
                items['images'] = [image_url.split('?')[0]]
                yield items
pipelines.py
from scrapy.pipelines.images import ImagesPipeline


class DownfilesPipeline(ImagesPipeline):
    def file_path(self, request, response=None, info=None):
        image_name: str = request.url.split("/")[-1]
        return image_name
settings.py
ITEM_PIPELINES = {
    'scrapy_exercises.pipelines.DownfilesPipeline': 55
}
IMAGES_STORE = '.'
items.py:
class ScrapyExercisesItem(scrapy.Item):
    images = scrapy.Field()
Comments (1)
I think all you need to do is add a few settings and include a results field in your item class.
In your items.py file add this:
then in your settings.py file add this:
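The settings snippet is missing as well. By default, Scrapy's `ImagesPipeline` reads URLs from an item field called `image_urls` and writes its results to a field called `images`, so an item that only fills `images` gives the pipeline nothing to download. The `IMAGES_URLS_FIELD` and `IMAGES_RESULT_FIELD` settings override those defaults. One plausible version, assuming the item stores URLs in `images` and has a `results` field for the output (both names are assumptions based on the answer text):

```python
# settings.py (sketch; field names are assumptions based on the answer text)

# Read image URLs from the item's `images` field
# instead of the default `image_urls`
IMAGES_URLS_FIELD = 'images'

# Write download results (url, path, checksum, ...) to `results`
# instead of the default `images`, which here already holds the URLs
IMAGES_RESULT_FIELD = 'results'
```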
Then try it again.