Using Python Scrapy to extract items to a CSV file - issue with how the output is laid out in the CSV file
I want to write my output to a CSV file, but the playerMins values do not start directly below the field name in row 2. Instead, they are placed in the next rows in sequence, after the playerName values. Can someone please tell me where my code is going wrong? Here it is:
class EspnSpider3(BaseSpider):
    name = "espn3.org"
    allowed_domains = ["espn3.org"]
    start_urls = [
        "http://scores.espn.go.com/nba/boxscore?gameId=310502004"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        item = EspnItem()
        rows = []
        playerName = []
        playerMins = []
        # player names
        p_names = hxs.select('(//table[@class="mod-data"][1]/tbody/tr)//a/text()').extract()
        for p_name in p_names:
            print p_name
            yield EspnItem(playerName=p_name)
        # minutes
        p_minutes = hxs.select('(//table[@class="mod-data"][1]/tbody/tr)/td[2]').extract()
        for p_minute in p_minutes:
            print p_minute
            yield EspnItem(playerMins=p_minute)
Comments (1)
I was able to solve my issue after much googling and RTFM; see "Trying to Use an ItemExporter in Scrapy".
Here is my working code:
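The poster's code is not reproduced here. As a rough sketch of the underlying idea, and not the poster's actual ItemExporter solution, the offset rows can be reproduced and avoided with the standard library alone. Each yielded item becomes one CSV row, so yielding name-only items followed by minutes-only items staggers the two columns, while pairing the two lists gives one row per player. The sketch below is Python 3 and swaps Scrapy's exporter for `csv.DictWriter`; the helper names `pair_columns` and `rows_to_csv` and the sample values are hypothetical, and it assumes both XPath results come back in the same order and with the same length.

```python
import csv
import io


def pair_columns(names, minutes):
    """Combine two parallel columns into one dict per player (hypothetical helper)."""
    return [
        {"playerName": name, "playerMins": mins}
        for name, mins in zip(names, minutes)
    ]


def rows_to_csv(rows, fieldnames=("playerName", "playerMins")):
    """Render rows as CSV text, header first; missing fields are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample values standing in for the two XPath results.
names = ["Player A", "Player B"]
minutes = ["38", "41"]

# One item per field, as in the question: minutes land below the names.
staggered = [{"playerName": n} for n in names] + [{"playerMins": m} for m in minutes]

# One item per player: both fields share a row, starting right under the header.
paired = pair_columns(names, minutes)

print(rows_to_csv(staggered))
print(rows_to_csv(paired))
```

In the spider itself, the same pairing would mean looping over `zip(p_names, p_minutes)` and yielding a single `EspnItem(playerName=..., playerMins=...)` per player, provided the two selections really line up row for row.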