I wanted to bring this challenge to the attention of the Stack Overflow community. The original problem and answers are here. BTW, if you have not followed it before, you should try reading Eric's blog; it is pure wisdom.
Summary:
Write a function that takes a non-null IEnumerable and returns a string with the following characteristics:
- If the sequence is empty the resulting string is "{}".
- If the sequence is a single item "ABC" then the resulting string is "{ABC}".
- If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
- If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)
As you can see, even our very own Jon Skeet (yes, it is well known that he can be in two places at the same time) has posted a solution, but his is (IMHO) not the most elegant, although you probably cannot beat its performance.
What do you think? There are pretty good options there. I really like one of the solutions that involves the Select and Aggregate methods (from Fernando Nicolet). LINQ is very powerful, and dedicating some time to challenges like this makes you learn a lot. I tweaked it a bit so it is slightly more performant and clearer (by using Count and avoiding Reverse):
public static string CommaQuibbling(IEnumerable<string> items)
{
    // Note: Count() enumerates the sequence once here; Select/Aggregate enumerate it again below.
    int last = items.Count() - 1;
    // No separator before the first item, " and " before the last one, ", " otherwise.
    Func<int, string> getSeparator = (i) => i == 0 ? string.Empty : (i == last ? " and " : ", ");
    string answer = string.Empty;
    return "{" + items.Select((s, i) => new { Index = i, Value = s })
                      .Aggregate(answer, (s, a) => s + getSeparator(a.Index) + a.Value) + "}";
}
I have tried using foreach. Please let me know your opinions.
Here are a couple of solutions and testing code written in Perl based on the replies at http://blogs.perl.org/users/brian_d_foy/2013/10/comma-quibbling-in-perl.html.
It hasn't quite been a decade since the last post so here's my variation:
In one statement:
In .NET Core we can leverage SkipLast and TakeLast.
https://dotnetfiddle.net/X58qvZ
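A minimal sketch of how SkipLast and TakeLast could be combined for this problem. It is not the code behind the fiddle link; the class and method names are hypothetical, and it assumes a runtime where Enumerable.SkipLast and Enumerable.TakeLast are available (.NET Core 2.0 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommaQuibbleSkipLast   // hypothetical name
{
    public static string Quibble(IEnumerable<string> items)
    {
        // Materialize once so the input sequence is enumerated a single time.
        List<string> list = items.ToList();
        // Everything except the final item is joined with ", ".
        string head = string.Join(", ", list.SkipLast(1));
        // TakeLast(1) yields zero or one element; Join turns that into "" or the item itself.
        string tail = string.Join("", list.TakeLast(1));
        // " and " is only needed when there are at least two items.
        string joiner = list.Count > 1 ? " and " : "";
        return "{" + head + joiner + tail + "}";
    }
}
```

The list is materialized up front because SkipLast and TakeLast would otherwise each walk the source separately.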
Inefficient, but I think clear.
If I was maintaining the code, I'd prefer this to more clever versions.
How about this approach? Purely cumulative - no back-tracking, and only iterates once. For raw performance, I'm not sure you'll do better with LINQ etc, regardless of how "pretty" a LINQ answer might be.
If I was doing a lot with streams which required first/last information, I'd have this extension:
Then the simplest (not the quickest, that would need a couple more handy extension methods) solution will be:
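A sketch of what an extension exposing first/last information, and a solution built on it, might look like. This is an illustration rather than the poster's code: the names WithPosition and CommaQuibble are hypothetical, and it assumes C# 7 tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PositionExtensions   // hypothetical name
{
    // Yields each item together with flags saying whether it is the first and/or last one.
    // One item of lookahead is enough, so the source is enumerated exactly once.
    public static IEnumerable<(T Item, bool IsFirst, bool IsLast)> WithPosition<T>(
        this IEnumerable<T> source)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;
            T current = e.Current;
            bool isFirst = true;
            while (e.MoveNext())
            {
                // Another item exists, so 'current' cannot be the last one.
                yield return (current, isFirst, false);
                current = e.Current;
                isFirst = false;
            }
            yield return (current, isFirst, true);
        }
    }

    public static string CommaQuibble(IEnumerable<string> items)
    {
        var sb = new StringBuilder("{");
        foreach (var (item, isFirst, isLast) in items.WithPosition())
        {
            // No separator before the first item; " and " before the last; ", " otherwise.
            if (!isFirst) sb.Append(isLast ? " and " : ", ");
            sb.Append(item);
        }
        return sb.Append("}").ToString();
    }
}
```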
Here as a Python one liner
This version might be easier to understand
Here's a simple F# solution, that only does one forward iteration:
(EDIT: Turns out this is very similar to Skeet's.)
The test code:
I'm a fan of the serial comma: I eat, shoot, and leave.
I continually need a solution to this problem and have solved it in 3 languages (though not C#). I would adapt the following solution (in Lua, does not wrap answer in curly braces) by writing a concat method that works on any IEnumerable:
This isn't brilliantly readable, but it scales well up to tens of millions of strings. I'm developing on an old Pentium 4 workstation and it does 1,000,000 strings of average length 8 in about 350ms.
Another variant - separating punctuation and iteration logic for the sake of code clarity. And still thinking about performance.
Works as requested with a plain IEnumerable<string>, and strings in the list cannot be null.
F# surely looks much better:
Disclaimer: I used this as an excuse to play around with new technologies, so my solutions don't really live up to Eric's original demands for clarity and maintainability.
Naive Enumerator Solution
(I concede that the foreach variant of this is superior, as it doesn't require manually messing about with the enumerator.)
Solution using LINQ
Solution with TPL
This solution uses a producer-consumer queue to feed the input sequence to the processor, whilst keeping at least two elements buffered in the queue. Once the producer has reached the end of the input sequence, the last two elements can be processed with special treatment.
In hindsight there is no reason to have the consumer operate asynchronously, which would eliminate the need for a concurrent queue, but as I said previously, I was just using this as an excuse to play around with new technologies :-)
Unit tests elided for brevity.
Late entry:
It is implemented as,
Here's mine, but I realize it's pretty much like Marc's, some minor differences in the order of things, and I added unit-tests as well.
How about skipping complicated aggregation code and just cleaning up the string after you build it?
UPDATED: This won't work with strings with commas, as pointed out in the comments. I tried some other variations, but without definite rules about what the strings can contain, I'm going to have real problems matching any possible last item with a regular expression, which makes this a nice lesson for me on their limitations.
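To make the limitation concrete, here is a small sketch of the build-then-clean-up idea. The regular expression is an assumption chosen for illustration, not the poster's original pattern, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CommaQuibbleRegex   // hypothetical name
{
    public static string Quibble(IEnumerable<string> items)
    {
        // Step 1: join everything with ", " and ignore the special last separator.
        string joined = string.Join(", ", items);
        // Step 2: replace the final ", " with " and ". The negative lookahead
        // (?!.*, ) only lets the match succeed when no later ", " exists.
        string cleaned = Regex.Replace(joined, @", (?!.*, )", " and ", RegexOptions.Singleline);
        return "{" + cleaned + "}";
    }
}
```

For "ABC", "DEF", "G", "H" this gives "{ABC, DEF, G and H}". For the single item "B, C" it gives "{B and C}" instead of "{B, C}", because once the strings are joined the regex cannot tell a separator from a comma that was part of an item.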
I quite liked Jon's answer, but that's because it's much like how I approached the problem. Rather than specifically coding in the two variables, I implemented them inside of a FIFO queue.
It's strange because I just assumed that there would be 15 posts that all did exactly the same thing, but it looks like we were the only two to do it that way. Oh, looking at these answers, Marc Gravell's answer is quite close to the approach we used as well, but he's using two 'loops', rather than holding on to values.
But all those answers with LINQ and regex and joining arrays just seem like crazy-talk! :-)
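A minimal sketch of the FIFO-queue idea, written as an illustration rather than a reproduction of the original answer (the class and method names are hypothetical). The queue buffers the two most recent items, so the final two can be joined with " and " once the input ends.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CommaQuibbleQueue   // hypothetical name
{
    public static string Quibble(IEnumerable<string> items)
    {
        var sb = new StringBuilder("{");
        var pending = new Queue<string>();   // never holds more than two unwritten items after each step
        foreach (string item in items)
        {
            pending.Enqueue(item);
            // With three items buffered, the oldest is certainly not one of the last two,
            // so it can be written out followed by a plain comma.
            if (pending.Count > 2)
                sb.Append(pending.Dequeue()).Append(", ");
        }
        // Zero, one or two items remain; joining with " and " covers all three cases.
        sb.Append(string.Join(" and ", pending));
        return sb.Append("}").ToString();
    }
}
```

The input is enumerated once and nothing is counted in advance, so this also works on sequences that can only be read a single time.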
I don't think that using a good old array is a restriction. Here is my version using an array and an extension method:
I am using an array because of the string.Join method and because of the possibility of accessing the last element via an index. The extension method is here because of DRY.
I think that the performance penalties come from the list.ToArray() and string.Join calls, but all in all I hope that piece of code is pleasant to read and maintain.
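A sketch of the array-plus-string.Join approach, assuming nothing about the original code beyond what the text says. The InBraces extension and the class name are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommaQuibbleArray   // hypothetical name
{
    // Hypothetical extension method: the brace wrapping is written once (DRY).
    private static string InBraces(this string text) => "{" + text + "}";

    public static string Quibble(IEnumerable<string> items)
    {
        string[] a = items.ToArray();
        // Zero or one item: no separators are needed at all.
        if (a.Length < 2)
            return string.Join("", a).InBraces();
        // Join all but the last item with ", ", then reach the last one by index.
        string head = string.Join(", ", a, 0, a.Length - 1);
        return (head + " and " + a[a.Length - 1]).InBraces();
    }
}
```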
I think LINQ provides fairly readable code. This version handles a million "ABC" in 0.89 seconds:
You can use a foreach, without LINQ, delegates, closures, lists or arrays, and still have understandable code. Use a bool and a string, like so:
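One way the bool-and-string idea might look, as a sketch under stated assumptions rather than the poster's code: the names are hypothetical, and the items themselves are assumed to be non-null, because null is used here to mean "nothing held yet".

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CommaQuibbleForeach   // hypothetical name
{
    public static string Quibble(IEnumerable<string> items)
    {
        var sb = new StringBuilder("{");
        string pending = null;   // most recently read item, not yet written (assumes non-null items)
        bool first = true;       // true until the first item has been written
        foreach (string item in items)
        {
            if (pending != null)
            {
                // 'pending' is followed by another item, so it is not the last one.
                if (!first) sb.Append(", ");
                sb.Append(pending);
                first = false;
            }
            pending = item;
        }
        if (pending != null)
        {
            // The held item is the last one; it gets " and " unless it is also the first.
            if (!first) sb.Append(" and ");
            sb.Append(pending);
        }
        return sb.Append("}").ToString();
    }
}
```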
Here's my submission. Modified the signature a bit to make it more generic. Using .NET 4 features (String.Join() using IEnumerable<T>), otherwise works with .NET 3.5. Goal was to use LINQ with drastically simplified logic.
There's a couple non-C# answers, and the original post did ask for answers in any language, so I thought I'd show another way to do it that none of the C# programmers seems to have touched upon: a DSL!
The astute will note that Common Lisp doesn't really have an IEnumerable<T> built-in, and hence FORMAT here will only work on a proper list. But if you made an IEnumerable, you certainly could extend FORMAT to work on that, as well. (Does Clojure have this?)
Also, anyone reading this who has taste (including Lisp programmers!) will probably be offended by the literal "~{~#[~;~a~;~a and ~a~:;~@{~a~#[~; and ~:;, ~]~}~]~}" there. I won't claim that FORMAT implements a good DSL, but I do believe that it is tremendously useful to have some powerful DSL for putting strings together. Regex is a powerful DSL for tearing strings apart, and string.Format is a DSL (kind of) for putting strings together but it's stupidly weak.
I think everybody writes these kind of things all the time. Why the heck isn't there some built-in universal tasteful DSL for this yet? I think the closest we have is "Perl", maybe.
Just for fun, using the new Zip extension method from .NET 4.0 (released alongside C# 4.0):
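A sketch of how Zip could be applied here, offered as one possible reading of the idea and not the original code (the names are hypothetical). Each item is paired with the separator that should follow it: ", " for all but the last two items, " and " after the second-to-last, and an empty string after the last.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommaQuibbleZip   // hypothetical name
{
    public static string Quibble(IEnumerable<string> items)
    {
        string[] list = items.ToArray();
        // Build the sequence of separators that follow each item.
        IEnumerable<string> separators =
            Enumerable.Repeat(", ", Math.Max(list.Length - 2, 0))
                      .Concat(Enumerable.Repeat(" and ", list.Length > 1 ? 1 : 0))
                      .Concat(Enumerable.Repeat("", 1));
        // Zip stops at the shorter sequence, so an empty input produces no pairs.
        return "{" + string.Concat(list.Zip(separators, (item, sep) => item + sep)) + "}";
    }
}
```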