Scraping image URLs with Beautiful Soup
Trying to learn something today and doing a bit of scraping.
I am trying to list product names and corresponding image URLs into a spreadsheet.
I managed to store the names but the images don't seem to work. Hopefully you can help!
Here is the code I use for extracting the text:
results[0].find('p', {'class': 'product-card__name'}).get_text()
Here is what I thought would extract the image:
results[0].find('img', {'class':'product-card__image'}).get_src()
This is obviously not working. It fails with "'NoneType' object is not callable".
Any pointers?
For reference, below is the source I am trying to scrape.
<li class="product-grid__item"><a href="/p/63818/bumbu-the-original-rum-glass-pack" class="product-card" title=" Bumbu The Original Rum Glass Pack" onclick="_gaq.push(['_trackEvent', 'Products-GridView', 'click', '63818 : Bumbu The Original Rum / Glass Pack'])"><div class="product-card__image-container"><img src="https://img.thewhiskyexchange.com/480/rum_bum4.jpg" alt="Bumbu The Original Rum Glass Pack" class="product-card__image" loading="lazy" width="3" height="4"></div><div class="product-card__content"><p class="product-card__name"> Bumbu The Original Rum<span class="product-card__name-secondary">Glass Pack</span></p><p class="product-card__meta"> 70cl / 40% </p></div><div class="product-card__data"><p class="product-card__price"> £39.95 </p><p class="product-card__unit-price"> (£57.07 per litre) </p></div></a></li>
Comments (1)
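To grab the image URL, you have to call .get('src') instead of .get_src(). A Beautiful Soup tag has no get_src method: an unknown attribute name is treated as a search for a child tag with that name, so .get_src evaluates to None, and calling None is what raises "'NoneType' object is not callable".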
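A minimal sketch of the fix, run against a trimmed copy of the markup from the question. The `soup`, `rows`, `name_tag` and `img_tag` names are only illustrative, and the way `results` is built here (a `find_all` on the `product-grid__item` list items) is an assumption, since the question does not show how it was created.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the product card posted in the question.
html = """
<li class="product-grid__item"><a href="/p/63818/bumbu-the-original-rum-glass-pack" class="product-card">
<div class="product-card__image-container"><img src="https://img.thewhiskyexchange.com/480/rum_bum4.jpg" alt="Bumbu The Original Rum Glass Pack" class="product-card__image" loading="lazy" width="3" height="4"></div>
<div class="product-card__content"><p class="product-card__name"> Bumbu The Original Rum<span class="product-card__name-secondary">Glass Pack</span></p></div>
</a></li>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumption: `results` holds one element per product card.
results = soup.find_all("li", {"class": "product-grid__item"})

rows = []
for item in results:
    name_tag = item.find("p", {"class": "product-card__name"})
    img_tag = item.find("img", {"class": "product-card__image"})

    # get_text(" ", strip=True) strips each text fragment and joins them with a space.
    name = name_tag.get_text(" ", strip=True) if name_tag else None
    # .get("src") reads the attribute and returns None if it is missing.
    image_url = img_tag.get("src") if img_tag else None

    rows.append((name, image_url))

print(rows)
# [('Bumbu The Original Rum Glass Pack', 'https://img.thewhiskyexchange.com/480/rum_bum4.jpg')]
```

Dictionary-style access, img_tag['src'], also works, but it raises a KeyError when the attribute is absent, so .get('src') is the more forgiving choice when some cards may lack an image.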