What is the fastest way to loop over a list and build a single string?

Posted on 2024-10-06 01:08:35

For example:

list = [{"title_url": "joe_white", "id": 1, "title": "Joe White"},
        {"title_url": "peter_black", "id": 2, "title": "Peter Black"}]

How can I efficiently loop through this to create:

Joe White, Peter Black
<a href="/u/joe_white">Joe White</a>,<a href="/u/peter_black">Peter Black</a>

Thank you.

Comments (3)

舂唻埖巳落 2024-10-13 01:08:35

The first is pretty simple:

', '.join(item['title'] for item in list)

The second requires something more complicated, but is essentially the same:

','.join('<a href="/u/%(title_url)s">%(title)s</a>' % item for item in list)

Both use generator expressions, which work like list comprehensions but don't build an intermediate list.
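
For anyone who wants to run this as-is on Python 3, here is a minimal sketch of the same two expressions. The names `people`, `join_titles` and `join_links` are made up for illustration (the question's `list` shadows the built-in), and the `html.escape` call is my own assumption; it only matters if a title could contain characters like `<` or `&`.

```python
from html import escape

# Sample data from the question, under a name that doesn't shadow the built-in list.
people = [
    {"title_url": "joe_white", "id": 1, "title": "Joe White"},
    {"title_url": "peter_black", "id": 2, "title": "Peter Black"},
]


def join_titles(items):
    """Comma-and-space separated titles, built with a generator expression."""
    return ", ".join(item["title"] for item in items)


def join_links(items):
    """Comma separated anchor tags; escape() is an added safety assumption."""
    return ",".join(
        '<a href="/u/%s">%s</a>' % (escape(item["title_url"]), escape(item["title"]))
        for item in items
    )


print(join_titles(people))
print(join_links(people))
```

With the sample data this prints the two lines asked for in the question.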

小情绪 2024-10-13 01:08:35

Here are some speed comparisons of the two methods you've been given.

First, we create a list of 100000 entries. It's boring, and the short strings may make it unrepresentative, but I'm not worried about that now. The session below is Python 2 (note xrange); on Python 3, use range instead.

>>> items = [{"title_url": "abc", "id": i, "title": "def"} for i in xrange(100000)]

First, Michael Mrozek's answer:

>>> def michael():
...     ', '.join(item['title'] for item in items)
...     ','.join('<a href="/u/%(title_url)s">%(title)s</a>' % item for item in items)
... 

Nice and simple. Then systempuntoout's answer (note that at this stage I'm just comparing the iteration performance, and so I've switched the %s and tuple formatting for %()s dict formatting; I'll time the other method later):

>>> def systempuntoout():
...     titles = []
...     urls = []
...     for item in items:
...             titles.append(item['title'])
...             urls.append('<a href="/u/%(title_url)s">%(title)s</a>' % item)
...     ', '.join(titles)
...     ','.join(urls)
... 

Very well. Now to time them:

>>> import timeit
>>> timeit.timeit(michael, number=100)
9.6959049701690674
>>> timeit.timeit(systempuntoout, number=100)
11.306489944458008

Summary: don't worry about going over the list twice. With generator expressions, two passes still cost less than the overhead of list.append; Michael's solution is about 15% faster on 100000 entries.
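
To rerun this comparison on Python 3, something like the following sketch should do. It assumes a much smaller list (1000 entries) so it finishes quickly, and the names `two_passes`, `one_pass_append` and `LINK` are made up here. Timings depend on the interpreter and machine, so none are shown; don't expect the Python 2 figures above to carry over.

```python
import timeit

# Smaller than the 100000 entries used above, purely to keep the run short.
items = [{"title_url": "abc", "id": i, "title": "def"} for i in range(1000)]
LINK = '<a href="/u/%(title_url)s">%(title)s</a>'


def two_passes():
    """Two generator expressions, each iterating over the list once."""
    titles = ", ".join(item["title"] for item in items)
    urls = ",".join(LINK % item for item in items)
    return titles, urls


def one_pass_append():
    """A single loop that appends to two lists, then joins them."""
    titles, urls = [], []
    for item in items:
        titles.append(item["title"])
        urls.append(LINK % item)
    return ", ".join(titles), ",".join(urls)


# Sanity check: both approaches must build exactly the same strings.
assert two_passes() == one_pass_append()

for func in (two_passes, one_pass_append):
    seconds = timeit.timeit(func, number=100)
    print("%-16s %.4f s" % (func.__name__, seconds))
```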

Secondly, there's whether you should use '%(...)s' % dict() or '%s' % tuple(). Taking Michael's answer as the faster and simpler of the two, here's michael2:

>>> def michael2():
...     ', '.join(item['title'] for item in items)
...     ','.join('<a href="/u/%s">%s</a>' % (item['title_url'], item['title']) for item in items)
... 
>>> timeit.timeit(michael2, number=100)
7.8054699897766113

So the conclusion here is clear: string formatting is faster with a tuple than with a dict, by almost 25%. If performance is an issue and you're dealing with large quantities of data, use michael2.
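
If you'd rather measure the formatting step on its own, here is a rough Python 3 sketch that times just the dict-based and tuple-based `%` formatting of one item. The function names are hypothetical, and the `lambda` adds the same small call overhead to both measurements. No timings are shown because they vary by machine and Python version.

```python
import timeit

item = {"title_url": "joe_white", "id": 1, "title": "Joe White"}


def format_with_dict(entry):
    """%-formatting with named fields looked up in the dict."""
    return '<a href="/u/%(title_url)s">%(title)s</a>' % entry


def format_with_tuple(entry):
    """%-formatting with positional fields taken from a tuple."""
    return '<a href="/u/%s">%s</a>' % (entry["title_url"], entry["title"])


# Both spellings must produce the same markup before comparing their speed.
assert format_with_dict(item) == format_with_tuple(item)

for func in (format_with_dict, format_with_tuple):
    seconds = timeit.timeit(lambda: func(item), number=100000)
    print("%-18s %.4f s" % (func.__name__, seconds))
```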

And if you want to see something really scary, take systempuntoout's original answer with class intact:

>>> def systempuntoout0():
...     class node():
...             titles = []
...             urls = []
...             def add_name(self, a_title):
...                     self.titles.append(a_title)
...             def add_link(self, a_title_url, a_title):
...                     self.urls.append('<a href="/u/%s">%s</a>' % (a_title_url, a_title))
...     node = node()
...     for entry in items:
...             node.add_name(entry["title"])
...             node.add_link(entry["title_url"], entry["title"])
...     ', '.join(node.titles)
...     ','.join(node.urls)
... 
>>> timeit.timeit(systempuntoout0, number=100)
15.253098011016846

A shade under twice as slow as michael2.


One final addition: a benchmark of str.format, introduced in Python 2.6 and billed as "the future of string formatting" (though I still don't understand why; I like my %, thank you very much, especially as it's faster).

>>> def michael_format():
...     ', '.join(item['title'] for item in items)
...     ','.join('<a href="/u/{title_url}">{title}</a>'.format(**item) for item in items)
... 
>>> timeit.timeit(michael_format, number=100)
11.809207916259766
>>> def michael2_format():
...     ', '.join(item['title'] for item in items)
...     ','.join('<a href="/u/{0}">{1}</a>'.format(item['title_url'], item['title']) for item in items)
... 
>>> timeit.timeit(michael2_format, number=100)
9.8876869678497314

11.81 instead of 9.70, 9.89 instead of 7.81: it's 20-25% slower (and bear in mind that only the second expression in each function uses it).
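
As a quick cross-check that the four spellings timed in this answer really are interchangeable, here is a small sketch that runs on Python 3. The `variants` dict and its keys are just labels I made up. Extra keys such as `id` are ignored by both `%` with a dict and `.format(**item)`.

```python
item = {"title_url": "joe_white", "id": 1, "title": "Joe White"}

# One entry per formatting style benchmarked above.
variants = {
    "percent_dict": '<a href="/u/%(title_url)s">%(title)s</a>' % item,
    "percent_tuple": '<a href="/u/%s">%s</a>' % (item["title_url"], item["title"]),
    "format_keywords": '<a href="/u/{title_url}">{title}</a>'.format(**item),
    "format_positional": '<a href="/u/{0}">{1}</a>'.format(item["title_url"], item["title"]),
}

expected = '<a href="/u/joe_white">Joe White</a>'
for name, result in variants.items():
    # Every style should yield the same anchor tag.
    assert result == expected, name

print("all %d variants match" % len(variants))
```

So the choice between them comes down to speed and readability, not output.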

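冰之心 2024-10-13 01:08:35

class Node(object):
    def __init__(self):
        # Per-instance lists; class-level lists would be shared by every instance.
        self.titles = []
        self.urls = []

    def add_name(self, a_title):
        self.titles.append(a_title)

    def add_url(self, a_title_url, a_title):
        self.urls.append('<a href="/u/%s">%s</a>' % (a_title_url, a_title))

# `list` is the list of dicts from the question.
node = Node()
for entry in list:
    node.add_name(entry["title"])
    node.add_url(entry["title_url"], entry["title"])

# print() with a single argument works on both Python 2 and Python 3.
print(', '.join(node.titles))
print(','.join(node.urls))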