Using Python Scrapy to extract items to a CSV file - problem with how the output is laid out in the CSV file

Posted on 2024-11-19 07:53:50

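I want to write my scraped output to a CSV file, but the values do not line up under their field names. When the playerMins item is populated, its values do not start at row 2, directly below the field name. Instead they are placed in the next rows in sequence, after the values of the previous field. Can someone please tell me where my code is going wrong? Here it is: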

class EspnSpider3(BaseSpider):
    name = "espn3.org"
    allowed_domains = ["espn3.org"]
    start_urls = [
        "http://scores.espn.go.com/nba/boxscore?gameId=310502004"

    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        item = EspnItem()
        rows = []
        playerName = []
        playerMins = []

        # player names 
        p_names = hxs.select('(//table[@class="mod-data"][1]/tbody/tr)//a/text()').extract()
        for p_name in p_names:
            print p_name
            yield EspnItem(playerName=p_name)

        # minutes
        p_minutes = hxs.select('(//table[@class="mod-data"][1]/tbody/tr)/td[2]').extract()
        for p_minute in p_minutes:
            print p_minute
            yield EspnItem(playerMins=p_minute)
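
To illustrate the symptom described above, here is a minimal sketch using only the standard library, not Scrapy itself. It assumes the CSV export writes one row per yielded item and leaves unset fields blank. The helper `export_rows` and the sample names and minutes are hypothetical.

```python
import csv
import io

FIELDS = ["playerName", "playerMins"]


def export_rows(items):
    """Write one CSV row per item, leaving missing fields empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, restval="", lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buf.getvalue()


# Hypothetical sample data standing in for the scraped values.
names = ["Player A", "Player B"]
minutes = ["35", "28"]

# Mirrors the spider above: one item per name, then one item per minutes value.
separate_items = [{"playerName": n} for n in names] + [{"playerMins": m} for m in minutes]
staggered_csv = export_rows(separate_items)
print(staggered_csv)
```

Under these assumptions, the name rows come first with an empty playerMins column, and the minutes rows follow with an empty playerName column, which matches the staggered layout in the question.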

Comments (1)

一身软味 2024-11-26 07:53:50

After much googling and RTFM, I was able to solve my issue. Reference: Trying to Use an ItemExporter in Scrapy

Here is my working code:

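# Legacy Scrapy (0.x) import paths, matching the API used in this answer.
from scrapy.selector import HtmlXPathSelector
from scrapy.contrib.loader import XPathItemLoader

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    # Select each table row once, then read both fields relative to that row.
    player_rows = hxs.select('(//table[@class="mod-data"][1]/tbody/tr)')
    for row in player_rows:
        loader = XPathItemLoader(item=EspnItem(), selector=row)
        loader.add_xpath('playerName', 'td[1]/a/text()')
        loader.add_xpath('playerMins', 'td[2]')
        # One item per row, so the name and minutes share a single CSV line.
        yield loader.load_item()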
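
The row-by-row idea in the code above can be tried without Scrapy. The following is a rough sketch using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath. The `SAMPLE` markup and the helper `rows_to_items` are hypothetical stand-ins for the real box score page, not the page's actual structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the box score table.
SAMPLE = """
<table class="mod-data">
  <tbody>
    <tr><td><a href="#">Player A</a></td><td>35</td></tr>
    <tr><td><a href="#">Player B</a></td><td>28</td></tr>
  </tbody>
</table>
"""


def rows_to_items(markup):
    """Build one dict per <tr>, reading both fields relative to that row."""
    table = ET.fromstring(markup)
    items = []
    for row in table.findall("./tbody/tr"):
        name = row.find("td[1]/a")
        mins = row.find("td[2]")
        if name is None or mins is None:
            continue  # skip rows that lack either cell, such as headers
        items.append({"playerName": name.text, "playerMins": mins.text})
    return items


items = rows_to_items(SAMPLE)
print(items)
```

Because both fields are read from the same row before an item is produced, each item carries a name together with its minutes. Note that this sketch reads the cell text, whereas the XPath `td[2]` in the answer selects the whole cell element; `td[2]/text()` would likely be the closer equivalent if only the text is wanted.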
