Eric Lippert's "Comma Quibbling" challenge: best answer?

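I wanted to bring this challenge to the attention of the Stack Overflow community. The original problem and answers are here. By the way, if you haven't followed it before, you should try reading Eric's blog; it is pure wisdom.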

Summary:

Write a function that takes a non-null IEnumerable<string> and returns a string with the following characteristics:

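  1. If the sequence is empty, the resulting string is "{}".
  2. If the sequence is a single item "ABC", the resulting string is "{ABC}".
  3. If the sequence is the two-item sequence "ABC", "DEF", the resulting string is "{ABC and DEF}".
  4. If the sequence has more than two items, say "ABC", "DEF", "G", "H", the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)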
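As a baseline for comparing the answers below, the four rules can be sketched in a few lines of Python (a language some answers in this thread also use). This is only an illustrative sketch, not one of the posted solutions; the name `comma_quibble` is made up here, and the input is copied into a list first, which the original challenge does not require.

```python
def comma_quibble(items):
    """Format items as {A, B and C}, with no Oxford comma."""
    words = list(items)  # materialize so any iterable (even a generator) works
    if len(words) <= 1:
        # Rules 1 and 2: empty sequence or a single item
        body = "".join(words)
    else:
        # Rules 3 and 4: commas between all but the last item, then " and "
        body = ", ".join(words[:-1]) + " and " + words[-1]
    return "{" + body + "}"


for case in ([], ["ABC"], ["ABC", "DEF"], ["ABC", "DEF", "G", "H"]):
    print(comma_quibble(case))
```

The four printed lines correspond to the four rules, in order.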

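You can even see that our very own Jon Skeet (yes, it's well known that he can be in two places at once) has posted a solution, but his (IMHO) is not the most elegant, although you probably can't beat its performance.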

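What do you think? There are pretty good options there. I really like one of the solutions that involves the Select and Aggregate methods (from Fernando Nicolet). LINQ is very powerful, and spending some time on challenges like this teaches you a lot. I tweaked it a bit so that it is more performant and clearer (by using Count and avoiding Reverse):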

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        int last = items.Count() - 1;
        Func<int, string> getSeparator = (i) => i == 0 ? string.Empty : (i == last ? " and " : ", ");
        string answer = string.Empty;

        return "{" + items.Select((s, i) => new { Index = i, Value = s })
                          .Aggregate(answer, (s, a) => s + getSeparator(a.Index) + a.Value) + "}";
    }

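How about this approach? Purely cumulative: no backtracking, and it iterates only once. For raw performance, I'm not sure you'll do better with LINQ etc., regardless of how "pretty" a LINQ answer might be.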

    using System;
    using System.Collections.Generic;
    using System.Text;

    static class Program
    {
        public static string CommaQuibbling(IEnumerable<string> items)
        {
            // note: a string, not the char '{'; a char argument would be
            // converted to int and treated as the initial capacity
            StringBuilder sb = new StringBuilder("{");
            using (var iter = items.GetEnumerator())
            {
                if (iter.MoveNext())
                { // first item can be appended directly
                    sb.Append(iter.Current);
                    if (iter.MoveNext())
                    { // more than one; only add each
                      // term when we know there is another
                        string lastItem = iter.Current;
                        while (iter.MoveNext())
                        { // middle term; use ", "
                            sb.Append(", ").Append(lastItem);
                            lastItem = iter.Current;
                        }
                        // add the final term; since we are on at least the
                        // second term, always use " and "
                        sb.Append(" and ").Append(lastItem);
                    }
                }
            }
            return sb.Append('}').ToString();
        }

        static void Main()
        {
            Console.WriteLine(CommaQuibbling(new string[] { }));
            Console.WriteLine(CommaQuibbling(new string[] { "ABC" }));
            Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF" }));
            Console.WriteLine(CommaQuibbling(new string[] { "ABC", "DEF", "G", "H" }));
        }
    }
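For readers who prefer Python, the same hold-one-item-back idea can be sketched as follows. This is an illustrative translation rather than the poster's code; `comma_quibble_single_pass` and the `_MISSING` sentinel are names invented for this sketch.

```python
_MISSING = object()  # sentinel meaning "the iterator is exhausted"


def comma_quibble_single_pass(items):
    """Single forward pass: an item is written only once we know what follows it."""
    parts = ["{"]
    iterator = iter(items)
    first = next(iterator, _MISSING)
    if first is not _MISSING:
        parts.append(first)  # the first item needs no separator
        pending = next(iterator, _MISSING)
        if pending is not _MISSING:
            for item in iterator:
                # another item exists, so `pending` is a middle term
                parts.append(", ")
                parts.append(pending)
                pending = item
            # `pending` is now the final term, and we are past the first item
            parts.append(" and ")
            parts.append(pending)
    parts.append("}")
    return "".join(parts)
```

Because it never asks for a count or an index, the sketch also works on a generator that can be consumed only once.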

Not efficient, but I think it is clear.

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        List<string> list = new List<string>(items);
        if (list.Count == 0) { return "{}"; }
        if (list.Count == 1) { return "{" + list[0] + "}"; }

        String[] initial = list.GetRange(0, list.Count - 1).ToArray();
        return "{" + String.Join(", ", initial) + " and " + list[list.Count - 1] + "}";
    }

If I were maintaining the code, I would prefer this to the cleverer versions.

If I were doing a lot of things with streams that needed first/last information, I would have this extension method:

    [Flags]
    public enum StreamPosition
    {
        First = 1, Last = 2
    }

    public static IEnumerable<R> MapWithPositions<T, R>(this IEnumerable<T> stream,
        Func<StreamPosition, T, R> map)
    {
        using (var enumerator = stream.GetEnumerator())
        {
            if (!enumerator.MoveNext()) yield break;

            var cur = enumerator.Current;
            var flags = StreamPosition.First;
            while (true)
            {
                // no next element means the current one is the last
                if (!enumerator.MoveNext()) flags |= StreamPosition.Last;
                yield return map(flags, cur);
                if ((flags & StreamPosition.Last) != 0) yield break;
                cur = enumerator.Current;
                flags = 0;
            }
        }
    }

Then the simplest solution (not the quickest; that would need a couple more handy extension methods) would be:

    public static string Quibble(IEnumerable<string> strings)
    {
        return "{" + String.Join("", strings.MapWithPositions((pos, item) => (
            (pos & StreamPosition.First) != 0 ? "" :
            pos == StreamPosition.Last ? " and " : ", ") + item)) + "}";
    }
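The position-flag idea carries over to other languages. Here is a rough Python sketch of it using `enum.Flag`; the names `with_positions` and `quibble`, and the extra `MIDDLE = 0` member, are assumptions made for this illustration and are not part of the answer above.

```python
from enum import Flag


class StreamPosition(Flag):
    MIDDLE = 0  # hypothetical name for "neither first nor last"
    FIRST = 1
    LAST = 2


def with_positions(stream):
    """Yield (flags, item) pairs, marking the first and last items of the stream."""
    missing = object()
    iterator = iter(stream)
    current = next(iterator, missing)
    if current is missing:
        return  # empty stream: yield nothing
    flags = StreamPosition.FIRST
    while True:
        following = next(iterator, missing)
        if following is missing:
            flags |= StreamPosition.LAST  # nothing follows, so this is the last item
        yield flags, current
        if StreamPosition.LAST in flags:
            return
        current = following
        flags = StreamPosition.MIDDLE


def quibble(strings):
    def piece(pos, item):
        if StreamPosition.FIRST in pos:
            sep = ""
        elif pos == StreamPosition.LAST:
            sep = " and "
        else:
            sep = ", "
        return sep + item

    return "{" + "".join(piece(p, s) for p, s in with_positions(strings)) + "}"
```

A single item carries both flags at once, which is why the `FIRST` check comes before the `LAST` check.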

Here it is as a Python one-liner:

    >>> f=lambda s:"{%s}"%", ".join(s)[::-1].replace(',','dna ',1)[::-1]
    >>> f([])
    '{}'
    >>> f(["ABC"])
    '{ABC}'
    >>> f(["ABC","DEF"])
    '{ABC and DEF}'
    >>> f(["ABC","DEF","G","H"])
    '{ABC, DEF, G and H}'

This version might be easier to understand:

    >>> f=lambda s:"{%s}"%" and ".join(s).replace(' and',',',len(s)-2)
    >>> f([])
    '{}'
    >>> f(["ABC"])
    '{ABC}'
    >>> f(["ABC","DEF"])
    '{ABC and DEF}'
    >>> f(["ABC","DEF","G","H"])
    '{ABC, DEF, G and H}'

Here is a simple F# solution that makes just one forward iteration:

    let CommaQuibble items =
        let sb = System.Text.StringBuilder("{")
        // pp is 2 previous, p is previous
        let pp,p = items |> Seq.fold (fun (pp:string option,p) s ->
            if pp <> None then
                sb.Append(pp.Value).Append(", ") |> ignore
            (p, Some(s))) (None,None)
        if pp <> None then
            sb.Append(pp.Value).Append(" and ") |> ignore
        if p <> None then
            sb.Append(p.Value) |> ignore
        sb.Append("}").ToString()

(EDIT: It turns out this is very similar to Skeet's.)

Test code:

    let Test l =
        printfn "%s" (CommaQuibble l)

    Test []
    Test ["ABC"]
    Test ["ABC";"DEF"]
    Test ["ABC";"DEF";"G"]
    Test ["ABC";"DEF";"G";"H"]
    Test ["ABC";null;"G";"H"]

I'm a fan of the serial comma: I eat, shoot, and leave.

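I continually need a solution to this problem and have solved it in 3 languages (though not C#). I would adapt the following solution (in Lua, and without wrapping the answer in curly braces) by writing a concat method that works on any IEnumerable: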

    function commafy(t, andword)
      andword = andword or 'and'
      local n = #t -- number of elements in the numeration
      if n == 1 then
        return t[1]
      elseif n == 2 then
        return concat { t[1], ' ', andword, ' ', t[2] }
      else
        local last = t[n]
        t[n] = andword .. ' ' .. t[n]
        local answer = concat(t, ', ')
        t[n] = last
        return answer
      end
    end

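This isn't brilliant, but it scales well to tens of millions of strings. I'm developing on an old Pentium 4 workstation, and it handles 1,000,000 strings with an average length of 8 in about 350 milliseconds.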

    public static string CreateLippertString(IEnumerable<string> strings)
    {
        char[] combinedString;
        char[] commaSeparator = new char[] { ',', ' ' };
        char[] andSeparator = new char[] { ' ', 'A', 'N', 'D', ' ' };

        int totalLength = 2;  //'{' and '}'
        int numEntries = 0;
        int currentEntry = 0;
        int currentPosition = 0;
        int secondToLast;
        int last;
        int commaLength = commaSeparator.Length;
        int andLength = andSeparator.Length;
        int cbComma = commaLength * sizeof(char);
        int cbAnd = andLength * sizeof(char);

        //calculate the sum of the lengths of the strings
        foreach (string s in strings)
        {
            totalLength += s.Length;
            ++numEntries;
        }

        //add to the total length the length of the constant characters
        if (numEntries >= 2)
            totalLength += 5;  // " AND "

        if (numEntries > 2)
            totalLength += (2 * (numEntries - 2)); // ", " between items

        //setup some meta-variables to help later
        secondToLast = numEntries - 2;
        last = numEntries - 1;

        //allocate the memory for the combined string
        combinedString = new char[totalLength];
        //set the first character to {
        combinedString[0] = '{';
        currentPosition = 1;

        if (numEntries > 0)
        {
            //now copy each string into its place
            foreach (string s in strings)
            {
                Buffer.BlockCopy(s.ToCharArray(), 0, combinedString, currentPosition * sizeof(char), s.Length * sizeof(char));
                currentPosition += s.Length;

                if (currentEntry == secondToLast)
                {
                    Buffer.BlockCopy(andSeparator, 0, combinedString, currentPosition * sizeof(char), cbAnd);
                    currentPosition += andLength;
                }
                else if (currentEntry == last)
                {
                    combinedString[currentPosition] = '}'; //set the last character to '}'
                    break;  //don't bother making that last call to the enumerator
                }
                else if (currentEntry < secondToLast)
                {
                    Buffer.BlockCopy(commaSeparator, 0, combinedString, currentPosition * sizeof(char), cbComma);
                    currentPosition += commaLength;
                }

                ++currentEntry;
            }
        }
        else
        {
            //set the last character to '}'
            combinedString[1] = '}';
        }

        return new string(combinedString);
    }

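Another variant: the punctuation and the iteration logic are separated for the sake of code clarity, while still keeping performance in mind.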

Works as requested with a pure IEnumerable<string>; the strings in the list cannot be null.

    public static string Concat(IEnumerable<string> strings)
    {
        return "{" + strings.reduce("", (acc, prev, cur, next) =>
            acc.Append(punctuation(prev, cur, next)).Append(cur)) + "}";
    }

    private static string punctuation(string prev, string cur, string next)
    {
        if (null == prev || null == cur)
            return "";
        if (null == next)
            return " and ";
        return ", ";
    }

    private static string reduce(this IEnumerable<string> strings,
        string acc, Func<StringBuilder, string, string, string, StringBuilder> func)
    {
        if (null == strings) return "";

        var accumulatorBuilder = new StringBuilder(acc);
        string cur = null;
        string prev = null;
        foreach (var next in strings)
        {
            func(accumulatorBuilder, prev, cur, next);
            prev = cur;
            cur = next;
        }
        // flush the final item; next == null marks the end of the sequence
        func(accumulatorBuilder, prev, cur, null);

        return accumulatorBuilder.ToString();
    }

F# surely looks much better:

    let rec reduce list =
        match list with
        | []                 -> ""
        | head::curr::[]     -> head + " and " + curr
        | head::curr::tail   -> head + ", " + curr :: tail |> reduce
        | head::[]           -> head

    let concat list = "{" + (list |> reduce) + "}"

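Disclaimer: I used this as an excuse to play around with new technologies, so my solutions don't really live up to Eric's original demands for clarity and maintainability.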

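Naive enumerator solution (I concede that the foreach variant of this is superior, as it doesn't require manually messing about with the enumerator.)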

    public static string NaiveConcatenate(IEnumerable<string> sequence)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append('{');

        IEnumerator<string> enumerator = sequence.GetEnumerator();

        if (enumerator.MoveNext())
        {
            string a = enumerator.Current;
            if (!enumerator.MoveNext())
            {
                sb.Append(a);
            }
            else
            {
                string b = enumerator.Current;
                while (enumerator.MoveNext())
                {
                    sb.Append(a);
                    sb.Append(", ");
                    a = b;
                    b = enumerator.Current;
                }
                sb.AppendFormat("{0} and {1}", a, b);
            }
        }

        sb.Append('}');
        return sb.ToString();
    }

Solution using LINQ

    public static string ConcatenateWithLinq(IEnumerable<string> sequence)
    {
        return (from item in sequence select item)
            .Aggregate(
                // state: the builder plus the last two items seen (a, then b)
                new { sb = new StringBuilder("{"), a = (string) null, b = (string) null },
                (s, x) =>
                    {
                        if (s.a != null)
                        {
                            s.sb.Append(s.a);
                            s.sb.Append(", ");
                        }
                        return new { s.sb, a = s.b, b = x };
                    },
                (s) =>
                    {
                        if (s.b != null)
                            if (s.a != null)
                                s.sb.AppendFormat("{0} and {1}", s.a, s.b);
                            else
                                s.sb.Append(s.b);
                        s.sb.Append("}");
                        return s.sb.ToString();
                    });
    }

Solution using the TPL

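This solution uses a producer-consumer queue to feed the input sequence to the processor while keeping at least two elements buffered in the queue. Once the producer has reached the end of the input sequence, the last two elements can be given their special treatment.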

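In hindsight there is no reason to have the consumer operate asynchronously, and dropping that would eliminate the need for a concurrent queue. But as I said before, I was just using this as an excuse to play around with new technologies :-)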

    public static string ConcatenateWithTpl(IEnumerable<string> sequence)
    {
        var queue = new ConcurrentQueue<string>();
        bool stop = false;

        var consumer = Future.Create(
            () =>
                {
                    var sb = new StringBuilder("{");
                    while (!stop || queue.Count > 2)
                    {
                        string s;
                        if (queue.Count > 2 && queue.TryDequeue(out s))
                            sb.AppendFormat("{0}, ", s);
                    }
                    return sb;
                });

        // Producer
        foreach (var item in sequence)
            queue.Enqueue(item);

        stop = true;
        StringBuilder result = consumer.Value;

        string a;
        string b;

        if (queue.TryDequeue(out a))
            if (queue.TryDequeue(out b))
                result.AppendFormat("{0} and {1}", a, b);
            else
                result.Append(a);

        result.Append("}");
        return result.ToString();
    }

Unit tests omitted for brevity.

Late entry:

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        string[] parts = items.ToArray();
        // note: "{" as a string; the char '{' would be taken as a capacity
        StringBuilder result = new StringBuilder("{");
        for (int i = 0; i < parts.Length; i++)
        {
            if (i > 0)
                result.Append(i == parts.Length - 1 ? " and " : ", ");
            result.Append(parts[i]);
        }
        return result.Append('}').ToString();
    }

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        int count = items.Count();
        string answer = string.Empty;
        return "{" +
            (count == 0 ? "" :
                (items.First() +
                    (count == 1 ? "" :
                        // 2nd element through 2nd-to-last element
                        items.Skip(1).Take(count - 2).Aggregate(answer, (s, a) => s + ", " + a) +
                        // last element
                        items.Skip(count - 1).Aggregate(answer, (s, a) => s + " AND " + a)))) +
            "}";
    }

It is implemented as follows:

    if count == 0, then return empty,
    if count == 1, then return the only element,
    if count > 1, then take two ranges:
        first: 2nd element to 2nd-to-last element
        last: the last element

Here's mine, but I realize it's pretty much like Marc's, with some minor differences in the order of things, and I've added unit tests as well.

    using System;
    using NUnit.Framework;
    using NUnit.Framework.Extensions;
    using System.Collections.Generic;
    using System.Text;
    using NUnit.Framework.SyntaxHelpers;

    namespace StringChallengeProject
    {
        [TestFixture]
        public class StringChallenge
        {
            [RowTest]
            [Row(new String[] { }, "{}")]
            [Row(new[] { "ABC" }, "{ABC}")]
            [Row(new[] { "ABC", "DEF" }, "{ABC and DEF}")]
            [Row(new[] { "ABC", "DEF", "G", "H" }, "{ABC, DEF, G and H}")]
            public void Test(String[] input, String expectedOutput)
            {
                Assert.That(FormatString(input), Is.EqualTo(expectedOutput));
            }

            //codesnippet:93458590-3182-11de-8c30-0800200c9a66
            public static String FormatString(IEnumerable<String> input)
            {
                if (input == null)
                    return "{}";

                using (var iterator = input.GetEnumerator())
                {
                    // Guard-clause for empty source
                    if (!iterator.MoveNext())
                        return "{}";

                    // Take care of first value
                    var output = new StringBuilder();
                    output.Append('{').Append(iterator.Current);

                    // Grab next
                    if (iterator.MoveNext())
                    {
                        // Grab the next value, but don't process it
                        // we don't know whether to use comma or "and"
                        // until we've grabbed the next after it as well
                        String nextValue = iterator.Current;
                        while (iterator.MoveNext())
                        {
                            output.Append(", ");
                            output.Append(nextValue);

                            nextValue = iterator.Current;
                        }

                        output.Append(" and ");
                        output.Append(nextValue);
                    }

                    output.Append('}');
                    return output.ToString();
                }
            }
        }
    }

How about skipping the complicated aggregation code and just cleaning up the string after you build it?

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        var aggregate = items.Aggregate(
            new StringBuilder(),
            (b, s) => b.AppendFormat(", {0}", s));
        // strip the leading ", " left over from the aggregation
        var trimmed = Regex.Replace(aggregate.ToString(), "^, ", string.Empty);
        // turn the final ", " into " and " via the named group "last"
        return string.Format(
            "{{{0}}}",
            Regex.Replace(trimmed, ", (?<last>[^,]*)$", @" and ${last}"));
    }

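UPDATE: This won't work with strings that contain commas, as pointed out in the comments. I tried some other variations, but without definite rules about what the strings can contain, I'm going to have real problems matching any possible last item with a regular expression, which makes this a nice lesson for me on their limitations.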
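To make the limitation concrete, here is a small Python sketch of the same build-then-clean-up idea. It is an approximation for illustration, not a port of the C# above; `quibble_by_cleanup` is a made-up name. The regular expression treats the last comma in the joined text as the last separator, which is wrong as soon as the final item itself contains a comma.

```python
import re


def quibble_by_cleanup(items):
    """Join with ", " first, then rewrite what looks like the final separator."""
    joined = ", ".join(items)
    # ", " followed by comma-free text up to the end is assumed to be the last separator
    return "{" + re.sub(r", ([^,]*)$", r" and \1", joined) + "}"


print(quibble_by_cleanup(["ABC", "DEF", "G", "H"]))  # {ABC, DEF, G and H}
print(quibble_by_cleanup(["A", "B, C"]))             # {A, B and C}, but the spec wants {A and B, C}
```

Once the items are joined, the information about where one item ends and the next begins is gone, so no pattern applied to the joined text can recover it in general.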

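I quite liked Jon's answer, but that's because it's very much like how I approached the problem. Rather than coding the two variables explicitly, I implemented them inside a FIFO queue.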

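It's strange, because I just assumed there would be 15 posts that all did exactly the same thing, but it looks like we were the only two to do it that way. Oh, looking at these answers, Marc Gravell's answer is quite close to the approach we used as well, but he uses two "loops" rather than holding on to values.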

But all those answers with LINQ and regex and joining arrays just seem like crazy talk! 🙂
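The poster's queue-based code is not shown in this thread, so the following Python sketch is only a guess at the general shape of the idea: keep at most two items buffered in a FIFO queue, emit the oldest one (with a comma) whenever a third arrives, and join whatever is left with " and " at the end. The name `quibble_with_queue` is invented here.

```python
from collections import deque


def quibble_with_queue(items):
    """Hold back the two most recent items in a FIFO queue."""
    queue = deque()
    parts = []
    for item in items:
        queue.append(item)
        if len(queue) > 2:
            # the oldest buffered item cannot be one of the last two
            parts.append(queue.popleft() + ", ")
    # zero, one or two items remain; only a pair gets the " and "
    parts.append(" and ".join(queue))
    return "{" + "".join(parts) + "}"
```

The TPL answer earlier in the thread keeps the same invariant (two elements stay buffered), just with a concurrent queue.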

I don't think that using a good old array is a restriction. Here is my version using an array and an extension method:

    public static string CommaQuibbling(IEnumerable<string> list)
    {
        string[] array = list.ToArray();

        if (array.Length == 0) return string.Empty.PutCurlyBraces();
        if (array.Length == 1) return array[0].PutCurlyBraces();

        string allExceptLast = string.Join(", ", array, 0, array.Length - 1);
        string theLast = array[array.Length - 1];

        return string.Format("{0} and {1}", allExceptLast, theLast)
                     .PutCurlyBraces();
    }

    public static string PutCurlyBraces(this string str)
    {
        return "{" + str + "}";
    }

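I use an array because of the string.Join method and because it makes it possible to access the last element by index. The extension method is here because of DRY.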

I think the performance cost comes from the list.ToArray() and string.Join calls, but I hope this code is pleasant to read and maintain.

I think LINQ provides fairly readable code. This version handles a million "ABC" items in 0.89 seconds:

    using System.Collections.Generic;
    using System.Linq;

    namespace CommaQuibbling
    {
        internal class Translator
        {
            public string Translate(IEnumerable<string> items)
            {
                return "{" + Join(items) + "}";
            }

            private static string Join(IEnumerable<string> items)
            {
                var leadingItems = LeadingItemsFrom(items);
                var lastItem = LastItemFrom(items);

                return JoinLeading(leadingItems) + lastItem;
            }

            private static IEnumerable<string> LeadingItemsFrom(IEnumerable<string> items)
            {
                return items.Reverse().Skip(1).Reverse();
            }

            private static string LastItemFrom(IEnumerable<string> items)
            {
                return items.LastOrDefault();
            }

            private static string JoinLeading(IEnumerable<string> items)
            {
                if (items.Any() == false) return "";

                return string.Join(", ", items.ToArray()) + " and ";
            }
        }
    }

You can use a foreach, without LINQ, delegates, closures, lists or arrays, and still have understandable code. Use a bool and a string, like so:

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        StringBuilder sb = new StringBuilder("{");
        bool empty = true;
        string prev = null;
        foreach (string s in items)
        {
            if (prev != null)
            {
                if (!empty) sb.Append(", ");
                else empty = false;
                sb.Append(prev);
            }
            prev = s;
        }
        if (prev != null)
        {
            if (!empty) sb.Append(" and ");
            sb.Append(prev);
        }
        return sb.Append('}').ToString();
    }

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        var itemArray = items.ToArray();

        var commaSeparated = String.Join(", ", itemArray, 0, Math.Max(itemArray.Length - 1, 0));
        if (commaSeparated.Length > 0) commaSeparated += " and ";

        return "{" + commaSeparated + itemArray.LastOrDefault() + "}";
    }

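Here's my submission. I modified the signature a bit to make it more generic. It uses a .NET 4 feature (the String.Join() overload that takes an IEnumerable<T>); otherwise it works with .NET 3.5. The goal was to use LINQ with drastically simplified logic.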

    static string CommaQuibbling<T>(IEnumerable<T> items)
    {
        int count = items.Count();
        // Group is true for everything except the last two items
        var quibbled = items.Select((Item, index) => new { Item, Group = (count - index - 2) > 0 })
                            .GroupBy(item => item.Group, item => item.Item)
                            .Select(g => g.Key
                                ? String.Join(", ", g)
                                : String.Join(" and ", g));
        return "{" + String.Join(", ", quibbled) + "}";
    }

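There are a couple of non-C# answers, and the original post did ask for answers in any language, so I thought I'd show another way to do it that none of the C# programmers seems to have touched upon: a DSL!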

    (defun quibble-comma (words)
      (format nil "~{~#[~;~a~;~a and ~a~:;~@{~a~#[~; and ~:;, ~]~}~]~}" words))

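The astute will note that Common Lisp doesn't really have an IEnumerable<T> built in, and hence FORMAT here will only work on a proper list. But if you made an IEnumerable, you could certainly extend FORMAT to work on that as well. (Does Clojure have this?)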

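Also, anyone reading this who has taste (including Lisp programmers!) will probably be offended by the literal "~{~#[~;~a~;~a and ~a~:;~@{~a~#[~; and ~:;, ~]~}~]~}" there. I won't claim that FORMAT implements a good DSL, but I do believe that it is tremendously useful to have some powerful DSL for putting strings together. Regex is a powerful DSL for tearing strings apart, and string.Format is a DSL (kind of) for putting strings together, but it's stupidly weak.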

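I think everybody writes this kind of thing all the time. Why isn't there a built-in, universal, tasteful DSL for this yet? I think the closest we have is "Perl", maybe.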

Just for fun, using the new Zip extension method from C# 4.0:

    private static string CommaQuibbling(IEnumerable<string> list)
    {
        IEnumerable<string> separators = GetSeparators(list.Count());
        var finalList = list.Zip(separators, (w, s) => w + s);
        return string.Concat("{", string.Join(string.Empty, finalList), "}");
    }

    private static IEnumerable<string> GetSeparators(int itemCount)
    {
        while (itemCount-- > 2)
            yield return ", ";

        if (itemCount == 1)
            yield return " and ";

        yield return string.Empty;
    }

    return String.Concat(
        "{",
        input.Length > 2 ?
            String.Concat(
                String.Join(", ", input.Take(input.Length - 1)),
                " and ",
                input.Last()) :
            String.Join(" and ", input),
        "}");
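The separator-sequence idea from the Zip answer above can be sketched in Python with the built-in `zip`. This is an illustrative sketch under the assumption that the items can be counted up front; the names `separators` and `quibble_with_zip` are not from the original answer.

```python
def separators(count):
    """Yield the separator that follows each of `count` items."""
    for _ in range(count - 2):
        yield ", "      # after every item except the last two
    if count >= 2:
        yield " and "   # after the second-to-last item
    if count >= 1:
        yield ""        # nothing follows the last item


def quibble_with_zip(items):
    words = list(items)  # the count must be known before the separators are built
    pieces = (word + sep for word, sep in zip(words, separators(len(words))))
    return "{" + "".join(pieces) + "}"
```

Pairing each word with the separator that follows it keeps the punctuation rules in one small generator, at the cost of needing the item count in advance.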

I have tried using foreach. Please let me know your opinions.

    private static string CommaQuibble(IEnumerable<string> input)
    {
        var val = string.Concat(input.Process(
            p => p,
            p => string.Format(" and {0}", p),
            p => string.Format(", {0}", p)));
        return string.Format("{{{0}}}", val);
    }

    public static IEnumerable<string> Process<T>(this IEnumerable<T> input,
        Func<T, string> firstItemFunc,
        Func<T, string> lastItemFunc,
        Func<T, string> otherItemFunc)
    {
        //break on empty sequence
        if (!input.Any()) yield break;

        //return first elem
        var first = input.First();
        yield return firstItemFunc(first);

        //break if there was only one elem
        var rest = input.Skip(1);
        if (!rest.Any()) yield break;

        //start looping the rest of the elements
        T prevItem = first;
        bool isFirstIteration = true;
        foreach (var item in rest)
        {
            if (isFirstIteration) isFirstIteration = false;
            else
            {
                yield return otherItemFunc(prevItem);
            }
            prevItem = item;
        }

        //last element
        yield return lastItemFunc(prevItem);
    }

The following are some solutions and test code written in Perl, based on the replies at http://blogs.perl.org/users/brian_d_foy/2013/10/comma-quibbling-in-perl.html.

    #!/usr/bin/perl

    use 5.14.0;
    use warnings;
    use strict;
    use Test::More qw{no_plan};

    sub comma_quibbling1 {
        my (@words) = @_;
        return "" unless @words;
        return $words[0] if @words == 1;
        return join(", ", @words[0 .. $#words - 1]) . " and $words[-1]";
    }

    sub comma_quibbling2 {
        return "" unless @_;
        my $last = pop @_;
        return $last unless @_;
        return join(", ", @_) . " and $last";
    }

    is comma_quibbling1(qw{}),                   "",                         "1-0";
    is comma_quibbling1(qw{one}),                "one",                      "1-1";
    is comma_quibbling1(qw{one two}),            "one and two",              "1-2";
    is comma_quibbling1(qw{one two three}),      "one, two and three",       "1-3";
    is comma_quibbling1(qw{one two three four}), "one, two, three and four", "1-4";

    is comma_quibbling2(qw{}),                   "",                         "2-0";
    is comma_quibbling2(qw{one}),                "one",                      "2-1";
    is comma_quibbling2(qw{one two}),            "one and two",              "2-2";
    is comma_quibbling2(qw{one two three}),      "one, two and three",       "2-3";
    is comma_quibbling2(qw{one two three four}), "one, two, three and four", "2-4";

It's not quite a decade since the last post, so here's my variation:

    public static string CommaQuibbling(IEnumerable<string> items)
    {
        var text = new StringBuilder();
        string sep = null;
        int last_pos = items.Count();
        int next_pos = 1;
        foreach (string item in items)
        {
            text.Append($"{sep}{item}");
            sep = ++next_pos < last_pos ? ", " : " and ";
        }
        return $"{{{text}}}";
    }