LINQ query – data aggregation (Group Adjacent)

Suppose we have a class called Cls:

    public class Cls
    {
        public int SequenceNumber { get; set; }
        public int Value { get; set; }
    }

Now, let's populate a collection with the following elements:

    Sequence
    Number      Value
    ========    =====
    1           9
    2           9
    3           15
    4           15
    5           15
    6           30
    7           9

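What I need to do is enumerate the items by sequence number and check whether the next element has the same value. If it does, the items are aggregated into a single range, so the desired output is as follows: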

    Sequence    Sequence
    Number      Number
    From        To          Value
    ========    ========    =====
    1           2           9
    3           5           15
    6           6           30
    7           7           9

How can I do this with a LINQ query?

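You can use a modified version of LINQ's GroupBy that puts two items in the same group only when they are adjacent. With that in place, the rest is easy: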

    var result = classes
        .GroupAdjacent(c => c.Value)
        .Select(g => new
        {
            SequenceNumFrom = g.Min(c => c.SequenceNumber),
            SequenceNumTo = g.Max(c => c.SequenceNumber),
            Value = g.Key
        });

    foreach (var x in result)
        Console.WriteLine("SequenceNumFrom:{0} SequenceNumTo:{1} Value:{2}",
            x.SequenceNumFrom, x.SequenceNumTo, x.Value);

Result:

    SequenceNumFrom:1 SequenceNumTo:2 Value:9
    SequenceNumFrom:3 SequenceNumTo:5 Value:15
    SequenceNumFrom:6 SequenceNumTo:6 Value:30
    SequenceNumFrom:7 SequenceNumTo:7 Value:9

Here is the extension method that groups adjacent items:

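    // Extension methods must live in a static class; the class name here is arbitrary.
    public static class AdjacentGroupingExtensions
    {
        public static IEnumerable<IGrouping<TKey, TSource>> GroupAdjacent<TSource, TKey>(
            this IEnumerable<TSource> source,
            Func<TSource, TKey> keySelector)
        {
            TKey last = default(TKey);
            bool haveLast = false;
            List<TSource> list = new List<TSource>();

            foreach (TSource s in source)
            {
                TKey k = keySelector(s);
                if (haveLast)
                {
                    // EqualityComparer avoids a NullReferenceException when a key is null.
                    if (!EqualityComparer<TKey>.Default.Equals(k, last))
                    {
                        // The key changed: emit the finished group and start a new one.
                        yield return new GroupOfAdjacent<TSource, TKey>(list, last);
                        list = new List<TSource>();
                        list.Add(s);
                        last = k;
                    }
                    else
                    {
                        list.Add(s);
                        last = k;
                    }
                }
                else
                {
                    // First element: it starts the first group.
                    list.Add(s);
                    last = k;
                    haveLast = true;
                }
            }

            // Emit the final group unless the source was empty.
            if (haveLast)
                yield return new GroupOfAdjacent<TSource, TKey>(list, last);
        }
    }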

And the class it uses:

    public class GroupOfAdjacent<TSource, TKey> : IEnumerable<TSource>, IGrouping<TKey, TSource>
    {
        public TKey Key { get; set; }
        private List<TSource> GroupList { get; set; }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return ((System.Collections.Generic.IEnumerable<TSource>)this).GetEnumerator();
        }

        System.Collections.Generic.IEnumerator<TSource>
            System.Collections.Generic.IEnumerable<TSource>.GetEnumerator()
        {
            foreach (var s in GroupList)
                yield return s;
        }

        public GroupOfAdjacent(List<TSource> source, TKey key)
        {
            GroupList = source;
            Key = key;
        }
    }
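To see why a plain GroupBy is not enough here, the following sketch contrasts it with adjacent grouping on the values from the question. It is only an illustration and not the code from the answer above: the `AdjacentDemo` class and its `Runs` and `Grouped` helpers are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the answer above.
public static class AdjacentDemo
{
    // Collapses consecutive equal values into (value, count) pairs.
    public static List<KeyValuePair<int, int>> Runs(IEnumerable<int> source)
    {
        var runs = new List<KeyValuePair<int, int>>();
        foreach (int value in source)
        {
            int last = runs.Count - 1;
            if (last >= 0 && runs[last].Key == value)
                // Same value as the previous element: extend the current run.
                runs[last] = new KeyValuePair<int, int>(value, runs[last].Value + 1);
            else
                // Different value (or first element): start a new run.
                runs.Add(new KeyValuePair<int, int>(value, 1));
        }
        return runs;
    }

    // Plain GroupBy merges every occurrence of a key, adjacent or not.
    public static List<KeyValuePair<int, int>> Grouped(IEnumerable<int> source)
    {
        return source.GroupBy(v => v)
                     .Select(g => new KeyValuePair<int, int>(g.Key, g.Count()))
                     .ToList();
    }
}
```

For `new[] { 9, 9, 15, 15, 15, 30, 9 }`, `Grouped` yields three groups because the trailing 9 is merged into the first group, while `Runs` yields four runs, which matches the four rows the question asks for.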

You can use this LINQ query:

    var values = (new[] { 9, 9, 15, 15, 15, 30, 9 }).Select((x, i) => new { x, i });

    var query = from v in values
                // First element at or after v whose value differs from v's.
                let firstNonValue = values.Where(v2 => v2.i >= v.i && v2.x != v.x).FirstOrDefault()
                // All elements of one run share that index; the last run gets int.MaxValue.
                let grouping = firstNonValue == null ? int.MaxValue : firstNonValue.i
                group v by grouping into g
                select new
                {
                    From = g.Min(y => y.i) + 1,
                    To = g.Max(y => y.i) + 1,
                    Value = g.Min(y => y.x)
                };

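MoreLinq provides this function out of the box.

It is called GroupAdjacent and is implemented as an extension method on IEnumerable<T>:

Groups the adjacent elements of a sequence according to a specified key selector function.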

 enumerable.GroupAdjacent(e => e.Key) 

If you don't want to pull in an additional binary NuGet package, there is even a NuGet "source" package that contains only this method.

The method returns IEnumerable<IGrouping<TKey, TSource>>, so its output can be processed in the same way as the output of GroupBy.

You can do it like this:

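    // Helper type holding one aggregated range.
    class Run
    {
        public int From { get; set; }
        public int To { get; set; }
        public int Value { get; set; }
    }

    var all = new[]
    {
        new Cls { SequenceNumber = 1, Value = 9 },
        new Cls { SequenceNumber = 2, Value = 9 },
        new Cls { SequenceNumber = 3, Value = 15 },
        new Cls { SequenceNumber = 4, Value = 15 },
        new Cls { SequenceNumber = 5, Value = 15 },
        new Cls { SequenceNumber = 6, Value = 30 },
        new Cls { SequenceNumber = 7, Value = 9 }
    };

    var f = all.First();
    var res = all.Skip(1).Aggregate(
        // Seed: a list holding a single run built from the first item.
        new List<Run> { new Run { From = f.SequenceNumber, To = f.SequenceNumber, Value = f.Value } },
        (p, v) =>
        {
            if (v.Value == p.Last().Value)
            {
                // Same value as the current run: extend it.
                p.Last().To = v.SequenceNumber;
            }
            else
            {
                // Different value: start a new run.
                p.Add(new Run { From = v.SequenceNumber, To = v.SequenceNumber, Value = v.Value });
            }
            return p;
        });

    foreach (var r in res)
    {
        Console.WriteLine("{0} - {1} : {2}", r.From, r.To, r.Value);
    }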

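The idea is to use Aggregate creatively: start with a list containing a single Run, then at each step of the aggregation (the if statement in the lambda) inspect the list built so far. Depending on the value of the last run, either extend that run or start a new one.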

I was able to do it by creating a custom extension method.

    static class Extensions
    {
        internal static IEnumerable<Tuple<int, int, int>> GroupAdj(this IEnumerable<Cls> enumerable)
        {
            Cls start = null;
            Cls end = null;

            foreach (Cls cls in enumerable)
            {
                if (start == null)
                {
                    // First element opens the first range.
                    start = cls;
                    end = cls;
                    continue;
                }

                if (start.Value == cls.Value)
                {
                    // Same value: extend the current range.
                    end = cls;
                    continue;
                }

                // Value changed: emit the finished range and open a new one.
                yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
                start = cls;
                end = cls;
            }

            // Emit the last range; skipped when the input was empty.
            if (start != null)
                yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
        }
    }

Here is how it is used:

    static void Main()
    {
        List<Cls> items = new List<Cls>
        {
            new Cls { SequenceNumber = 1, Value = 9 },
            new Cls { SequenceNumber = 2, Value = 9 },
            new Cls { SequenceNumber = 3, Value = 15 },
            new Cls { SequenceNumber = 4, Value = 15 },
            new Cls { SequenceNumber = 5, Value = 15 },
            new Cls { SequenceNumber = 6, Value = 30 },
            new Cls { SequenceNumber = 7, Value = 9 }
        };

        Console.WriteLine("From  To    Value");
        Console.WriteLine("===== ===== =====");

        // Sort by sequence number first so that adjacency is well defined.
        foreach (var item in items.OrderBy(i => i.SequenceNumber).GroupAdj())
        {
            Console.WriteLine("{0,-5} {1,-5} {2,-5}", item.Item1, item.Item2, item.Item3);
        }
    }

And the expected output:

    From  To    Value
    ===== ===== =====
    1     2     9
    3     5     15
    6     6     30
    7     7     9

Here is an implementation without any helper methods:

    var grp = 0;
    var results =
        from i in input.Zip(
            // Pair each item with its successor; the last item is paired with itself.
            input.Skip(1).Concat(new[] { input.Last() }),
            (n1, n2) => Tuple.Create(n1, (n2.Value == n1.Value) ? grp : grp++))
        group i by i.Item2 into gp
        select new
        {
            SequenceNumFrom = gp.Min(x => x.Item1.SequenceNumber),
            SequenceNumTo = gp.Max(x => x.Item1.SequenceNumber),
            Value = gp.Min(x => x.Item1.Value)
        };

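The idea is:

  • Keep track of your own group index, grp.
  • Pair each item in the collection with the next item in the collection (via Skip(1) and Zip).
  • If the values match, the two items are in the same group; otherwise, increment grp to mark the start of the next group.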
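The steps above can be traced with a small sketch that returns the group index assigned to each element. It works on plain integers rather than Cls objects, and the `GroupTagDemo` class and `Tags` method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; the names are assumptions, not from the answer.
public static class GroupTagDemo
{
    // Returns the group index that the Zip/Skip(1) trick assigns to each value.
    public static int[] Tags(IList<int> input)
    {
        if (input.Count == 0) return new int[0];

        var grp = 0;
        return input
            // Pair each item with its successor; the last item is paired with itself.
            .Zip(input.Skip(1).Concat(new[] { input[input.Count - 1] }),
                 // Tag with the current index; bump it afterwards if the next value differs.
                 (current, next) => next == current ? grp : grp++)
            // Materialize immediately so the captured counter is consumed exactly once.
            .ToArray();
    }
}
```

For the values 9, 9, 15, 15, 15, 30, 9 this produces the tags 0, 0, 1, 1, 1, 2, 3. Because the lambda mutates the captured `grp`, a deferred query like the one above assigns different key numbers each time it is enumerated (the grouping itself stays the same), so it seems safer to materialize the result once with ToList or ToArray.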

Untested black magic follows. In this case, the imperative version seems easier.

    IEnumerable<Cls> data = ...;
    var query = data
        .GroupBy(x => x.Value)
        .Select(g => new
        {
            Value = g.Key,
            Sequences = g
                .OrderBy(x => x.SequenceNumber)
                // Consecutive sequence numbers share the same (number - index) offset.
                .Select((x, i) => new
                {
                    x.SequenceNumber,
                    OffsetSequenceNumber = x.SequenceNumber - i
                })
                .GroupBy(x => x.OffsetSequenceNumber)
                // Inner parameter renamed from g so it does not clash with the outer lambda.
                .Select(run => run
                    .Select(x => x.SequenceNumber)
                    .OrderBy(x => x)
                    .ToList())
                .ToList()
        })
        .SelectMany(x => x.Sequences
            .Select(s => new { First = s.First(), Last = s.Last(), x.Value }))
        .OrderBy(x => x.First)
        .ToList();
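The key step in this query is the `x.SequenceNumber - i` offset: once the numbers for one value are sorted, consecutive sequence numbers all produce the same offset, so grouping by the offset splits them into runs. A minimal sketch of just that step, on plain integers, might look like the following. It assumes the numbers are distinct, and `OffsetDemo` and `ConsecutiveRuns` are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; assumes the input numbers are distinct.
public static class OffsetDemo
{
    // Splits integers into runs of consecutive numbers, formatted as "first-last".
    public static List<string> ConsecutiveRuns(IEnumerable<int> numbers)
    {
        return numbers
            .OrderBy(n => n)
            // Within a run of consecutive numbers, n - i stays constant.
            .Select((n, i) => new { Number = n, Offset = n - i })
            .GroupBy(x => x.Offset)
            // GroupBy keeps source order inside each group, so First/Last are the run bounds.
            .Select(g => g.First().Number + "-" + g.Last().Number)
            .ToList();
    }
}
```

For the sequence numbers 1, 2 and 7 (the items with value 9 in the question) the offsets are 1, 1 and 5, giving the runs "1-2" and "7-7". Note that this approach defines adjacency by the sequence numbers themselves: if the numbers had gaps, items that are neighbours in the list would still be split into separate runs, unlike in the other answers.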