What is the fastest way to parse this string?
I have a string in the following format:
[Season] [Year] [Vendor] [Geography]
So an example might be: Spring 2009 Nielsen MSA
I need to parse out Season and Year in the fastest way possible. I don't care about prettiness or cleverness, just raw speed. The language is C# using VS2008, but the assembly is being built for .NET 2.0.
Class Parser:
Use:
I'd go with Spidey's suggestion, which should give decent enough performance with simple, easy-to-follow, easy-to-maintain code.
But if you really need to push the perf envelope (and C# is the only tool available), then a couple of loops in series that search for the spaces, followed by Substring calls to pull the strings out, would probably outdo it marginally.
You could do the same with IndexOf instead of the loops; rolling your own may be slightly faster, but you'd have to profile that.
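As a rough sketch of the hand-rolled version described above: two loops in series find the first two spaces, then Substring pulls out the fields. The class and method names (LoopParser, Parse) are made up for illustration, and the code assumes the fields are separated by single spaces and that at least two spaces are present; it will throw on malformed input.

```csharp
using System;

public static class LoopParser
{
    // Hypothetical helper: extracts the first two space-separated fields.
    // Assumes single-space separators and at least two spaces in the input.
    public static void Parse(string text, out string season, out string year)
    {
        int i = 0;
        while (text[i] != ' ') i++;      // scan to the first space
        int firstSpace = i;

        i++;
        while (text[i] != ' ') i++;      // scan to the second space

        season = text.Substring(0, firstSpace);
        year = text.Substring(firstSpace + 1, i - firstSpace - 1);
    }
}
```

Whether this actually beats IndexOf depends on the runtime, so it is worth profiling both on real data.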
If you only need the season and year, then:
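One way this could look, as a sketch only: two IndexOf calls locate the spaces and two Substring calls extract the fields, so no array is allocated for the vendor and geography parts. The names IndexOfParser and Parse are hypothetical, and single-space separators are assumed.

```csharp
using System;

public static class IndexOfParser
{
    // Hypothetical helper: returns the season and year as strings.
    // Assumes single-space separators and at least two spaces in the input.
    public static void Parse(string text, out string season, out string year)
    {
        int firstSpace = text.IndexOf(' ');
        int secondSpace = text.IndexOf(' ', firstSpace + 1);

        season = text.Substring(0, firstSpace);
        year = text.Substring(firstSpace + 1, secondSpace - firstSpace - 1);
    }
}
```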
If you can assume the year is always four digits, this is even faster:
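A sketch of the four-digit variant, under the stated assumption that the year is always exactly four characters: the second space never needs to be located, so one IndexOf call is dropped. FixedYearParser is an illustrative name, not something from the thread.

```csharp
using System;

public static class FixedYearParser
{
    // Hypothetical helper: assumes the year is always exactly four characters
    // and immediately follows the first space.
    public static void Parse(string text, out string season, out string year)
    {
        int firstSpace = text.IndexOf(' ');

        season = text.Substring(0, firstSpace);
        year = text.Substring(firstSpace + 1, 4);
    }
}
```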
If additionally you know that all years are in the 21st century, it can get stupidly optimal:
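A sketch of what the 21st-century shortcut might look like: if every year is 20xx, only the last two digits carry information, so the year can be computed as an int directly from two characters with no Substring and no int.Parse. The name CenturyParser is hypothetical, and the code assumes years 2000 to 2099 written with ASCII digits.

```csharp
using System;

public static class CenturyParser
{
    // Hypothetical helper: assumes a four-digit year in the range 2000-2099.
    public static void Parse(string text, out string season, out int year)
    {
        int firstSpace = text.IndexOf(' ');
        season = text.Substring(0, firstSpace);

        // The year occupies firstSpace+1 .. firstSpace+4; only the last two
        // digits vary, so convert them from characters to their numeric values.
        year = 2000
             + (text[firstSpace + 3] - '0') * 10
             + (text[firstSpace + 4] - '0');
    }
}
```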
which becomes even less readable but possibly faster (depending on what the JIT does) as:
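One plausible form of that less readable version, offered as a sketch: the two subtractions of '0' (character code 48) can be folded into the constant, since 2000 - 48 * 10 - 48 = 1472. The name FoldedCenturyParser is made up, and the same 2000 to 2099 assumption applies.

```csharp
using System;

public static class FoldedCenturyParser
{
    // Hypothetical helper: same assumptions as the 21st-century version,
    // with the '0' offsets pre-subtracted from the constant.
    public static void Parse(string text, out string season, out int year)
    {
        int firstSpace = text.IndexOf(' ');
        season = text.Substring(0, firstSpace);

        // 1472 = 2000 - ('0' * 10) - '0', where '0' has character code 48.
        year = 1472 + text[firstSpace + 3] * 10 + text[firstSpace + 4];
    }
}
```

Whether this saves anything measurable over the explicit subtractions depends on what the JIT generates.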
Personally I think that's at least one step too far though :)
EDIT: Okay, taking this to extremes... you're only going to have a few seasons, right? Suppose they're "Spring", "Summer", "Fall", "Winter" then you can do:
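A sketch of how that might be written, assuming the input always starts with exactly one of those four season names and that the year is 2000 to 2099: the first one or two characters identify the season, a string literal is returned instead of a new Substring, and the season's length gives the position of the year. KnownSeasonParser is an illustrative name.

```csharp
using System;

public static class KnownSeasonParser
{
    // Hypothetical helper: assumes the text starts with "Spring", "Summer",
    // "Fall" or "Winter", followed by one space and a year in 2000-2099.
    public static void Parse(string text, out string season, out int year)
    {
        int yearStart;
        if (text[0] == 'S')
        {
            // "Spring" and "Summer" are both six letters; the second
            // character tells them apart.
            season = text[1] == 'p' ? "Spring" : "Summer";
            yearStart = 7;
        }
        else if (text[0] == 'F')
        {
            season = "Fall";
            yearStart = 5;
        }
        else
        {
            season = "Winter";
            yearStart = 7;
        }

        // 1472 = 2000 - ('0' * 10) - '0'; uses the last two digits of the year.
        year = 1472 + text[yearStart + 2] * 10 + text[yearStart + 3];
    }
}
```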
This has the advantage that it will reuse the same string objects. Of course, it assumes that there's never anything wrong with the data...
Using Split as shown in Spidey's answer is certainly simpler than any of this, but I suspect it'll be slightly slower. To be honest, I'd at least try that first... have you measured the simplest code and found that it's too slow? The difference is likely to be very slight, certainly compared with whatever network or disk access you've got reading in the data in the first place.
To add to the other answers, if you are expecting them to be in this format:
then an even faster way would be:
That is rather nasty, though. :)
I wouldn't even consider coding like this.
Try this.