LINQ query – data aggregation (group adjacent)
Let's take a class called Cls:
    public class Cls
    {
        public int SequenceNumber { get; set; }
        public int Value { get; set; }
    }
Now, let's fill a collection with the following elements:
    Sequence
    Number    Value
    ========  =====
    1         9
    2         9
    3         15
    4         15
    5         15
    6         30
    7         9
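What I need to do is enumerate over the sequence numbers and check whether the next element has the same value. If it does, the elements are merged into one range, so the desired output is as follows: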
    Sequence  Sequence
    Number    Number
    From      To        Value
    ========  ========  =====
    1         2         9
    3         5         15
    6         6         30
    7         7         9
How can I do this with a LINQ query?
You can use a modified version of LINQ's GroupBy that puts two items in the same group only when they are adjacent. After that it is easy:
    var result = classes
        .GroupAdjacent(c => c.Value)
        .Select(g => new
        {
            SequenceNumFrom = g.Min(c => c.SequenceNumber),
            SequenceNumTo = g.Max(c => c.SequenceNumber),
            Value = g.Key
        });

    foreach (var x in result)
        Console.WriteLine("SequenceNumFrom:{0} SequenceNumTo:{1} Value:{2}",
            x.SequenceNumFrom, x.SequenceNumTo, x.Value);
DEMO
Result:
    SequenceNumFrom:1 SequenceNumTo:2 Value:9
    SequenceNumFrom:3 SequenceNumTo:5 Value:15
    SequenceNumFrom:6 SequenceNumTo:6 Value:30
    SequenceNumFrom:7 SequenceNumTo:7 Value:9
This is the extension method that groups adjacent items:
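    // The class name is arbitrary; extension methods must live in a static class.
    public static class AdjacentGroupingExtensions
    {
        public static IEnumerable<IGrouping<TKey, TSource>> GroupAdjacent<TSource, TKey>(
            this IEnumerable<TSource> source,
            Func<TSource, TKey> keySelector)
        {
            // Null-safe comparison (k.Equals(last) would throw for a null key).
            var comparer = EqualityComparer<TKey>.Default;
            TKey last = default(TKey);
            bool haveLast = false;
            List<TSource> list = new List<TSource>();

            foreach (TSource s in source)
            {
                TKey k = keySelector(s);
                if (haveLast && !comparer.Equals(k, last))
                {
                    // The key changed: emit the finished group and start a new one.
                    yield return new GroupOfAdjacent<TSource, TKey>(list, last);
                    list = new List<TSource>();
                }
                list.Add(s);
                last = k;
                haveLast = true;
            }

            // Emit the final group, unless the source was empty.
            if (haveLast)
                yield return new GroupOfAdjacent<TSource, TKey>(list, last);
        }
    }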
And the class it uses:
    public class GroupOfAdjacent<TSource, TKey> : IEnumerable<TSource>, IGrouping<TKey, TSource>
    {
        public TKey Key { get; set; }
        private List<TSource> GroupList { get; set; }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return ((System.Collections.Generic.IEnumerable<TSource>)this).GetEnumerator();
        }

        System.Collections.Generic.IEnumerator<TSource> System.Collections.Generic.IEnumerable<TSource>.GetEnumerator()
        {
            foreach (var s in GroupList)
                yield return s;
        }

        public GroupOfAdjacent(List<TSource> source, TKey key)
        {
            GroupList = source;
            Key = key;
        }
    }
You can use this LINQ query (demo):
    var values = (new[] { 9, 9, 15, 15, 15, 30, 9 }).Select((x, i) => new { x, i });

    var query = from v in values
                // First later element whose value differs; its index identifies the run.
                let firstNonValue = values.Where(v2 => v2.i >= v.i && v2.x != v.x).FirstOrDefault()
                let grouping = firstNonValue == null ? int.MaxValue : firstNonValue.i
                group v by grouping into v
                select new
                {
                    From = v.Min(y => y.i) + 1,
                    To = v.Max(y => y.i) + 1,
                    Value = v.Min(y => y.x)
                };
MoreLinq provides this functionality out of the box. It is called GroupAdjacent and is implemented as an extension method on IEnumerable<T>:
Groups the adjacent elements of a sequence according to a specified key selector function.

    enumerable.GroupAdjacent(e => e.Key)
If you don't want to pull in an additional binary NuGet package, there is even a NuGet "source" package that contains only this method.
The method returns an IEnumerable<IGrouping<TKey, TSource>>, so its output can be processed in the same way as the output of GroupBy.
You can do it like this:
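    // Assumes Cls has a (sequenceNumber, value) constructor and that Run is a
    // simple class with settable From, To and Value properties.
    var all = new[]
    {
        new Cls(1, 9),
        new Cls(2, 9),
        new Cls(3, 15),
        new Cls(4, 15),
        new Cls(5, 15),
        new Cls(6, 30),
        new Cls(7, 9)
    };

    var f = all.First();
    var res = all.Skip(1).Aggregate(
        // Seed: a list holding a single run made of the first element.
        new List<Run> { new Run { From = f.SequenceNumber, To = f.SequenceNumber, Value = f.Value } },
        (p, v) =>
        {
            if (v.Value == p.Last().Value)
            {
                // Same value as the current run: extend it.
                p.Last().To = v.SequenceNumber;
            }
            else
            {
                // Different value: start a new run.
                p.Add(new Run { From = v.SequenceNumber, To = v.SequenceNumber, Value = v.Value });
            }
            return p;
        });

    foreach (var r in res)
    {
        Console.WriteLine("{0} - {1} : {2}", r.From, r.To, r.Value);
    }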
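The idea is to use Aggregate creatively: start with a list containing a single Run, then at each stage of the aggregation inspect the list built so far (the if statement in the lambda). Depending on the value of the last run, either continue that run or start a new one.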
Here is a demo on ideone.
I was able to do it by creating a custom extension method.
    static class Extensions
    {
        internal static IEnumerable<Tuple<int, int, int>> GroupAdj(this IEnumerable<Cls> enumerable)
        {
            Cls start = null;
            Cls end = null;

            foreach (Cls cls in enumerable)
            {
                if (start == null)
                {
                    // First element: open the first run.
                    start = cls;
                    end = cls;
                    continue;
                }
                if (start.Value == cls.Value)
                {
                    // Same value: extend the current run.
                    end = cls;
                    continue;
                }
                // Value changed: emit the finished run and open a new one.
                yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
                start = cls;
                end = cls;
            }

            // Emit the last run; the guard avoids a NullReferenceException on empty input.
            if (start != null)
                yield return Tuple.Create(start.SequenceNumber, end.SequenceNumber, start.Value);
        }
    }
Here is how it is used:
    static void Main()
    {
        List<Cls> items = new List<Cls>
        {
            new Cls { SequenceNumber = 1, Value = 9 },
            new Cls { SequenceNumber = 2, Value = 9 },
            new Cls { SequenceNumber = 3, Value = 15 },
            new Cls { SequenceNumber = 4, Value = 15 },
            new Cls { SequenceNumber = 5, Value = 15 },
            new Cls { SequenceNumber = 6, Value = 30 },
            new Cls { SequenceNumber = 7, Value = 9 }
        };

        Console.WriteLine("From  To    Value");
        Console.WriteLine("===== ===== =====");
        foreach (var item in items.OrderBy(i => i.SequenceNumber).GroupAdj())
        {
            Console.WriteLine("{0,-5} {1,-5} {2,-5}", item.Item1, item.Item2, item.Item3);
        }
    }
And the expected output:
    From  To    Value
    ===== ===== =====
    1     2     9
    3     5     15
    6     6     30
    7     7     9
Here is an implementation without any helper methods:
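    // input is the (non-empty) collection of Cls, ordered by SequenceNumber.
    var grp = 0;
    var results =
        from i in input.Zip(
            // Pair every item with its successor; the last item is paired with itself.
            input.Skip(1).Concat(new[] { input.Last() }),
            (n1, n2) => Tuple.Create(
                n1,
                // Same value: stay in the current group. Otherwise n1 closes it.
                (n2.Value == n1.Value) ? grp : grp++))
        group i by i.Item2 into gp
        select new
        {
            SequenceNumFrom = gp.Min(x => x.Item1.SequenceNumber),
            SequenceNumTo = gp.Max(x => x.Item1.SequenceNumber),
            Value = gp.Min(x => x.Item1.Value)
        };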
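The idea is:
- Keep track of your own group index, grp.
- Pair each item in the collection with the next item in the collection (via Skip(1) and Zip).
- If the values match, the two items are in the same group; otherwise, increment grp to mark the start of the next group.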
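The pairing described in the steps above can be traced on the question's values. The following is only a minimal sketch: the ZipTrace class and the GroupIndices method are hypothetical names introduced for illustration, and it works on bare int values rather than on Cls instances. It assumes a non-empty input, because Last() throws on an empty sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, introduced only to illustrate the Zip/Skip(1) pairing.
public static class ZipTrace
{
    // Pairs each value with its successor (the last value is paired with itself)
    // and returns the group index assigned to each position.
    // Assumption: values is non-empty.
    public static List<int> GroupIndices(IReadOnlyList<int> values)
    {
        var grp = 0;
        return values
            .Zip(values.Skip(1).Concat(new[] { values.Last() }),
                 // Equal to the successor: keep the current index.
                 // Otherwise use the current index, then advance it.
                 (current, next) => next == current ? grp : grp++)
            .ToList(); // materialize once, because the lambda mutates grp
    }

    public static void Main()
    {
        var values = new[] { 9, 9, 15, 15, 15, 30, 9 };
        var indices = GroupIndices(values);
        for (var i = 0; i < values.Length; i++)
            Console.WriteLine("Sequence {0}: value {1} -> group {2}",
                i + 1, values[i], indices[i]);
    }
}
```

For the values 9, 9, 15, 15, 15, 30, 9 the group indices come out as 0, 0, 1, 1, 1, 2, 3, which matches the four ranges in the desired output. Because the lambda has a side effect on grp, the sequence should be materialized once, as ToList() does here, rather than enumerated repeatedly.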
Untested black magic follows. In this case, the imperative version seems easier.
    IEnumerable<Cls> data = ...;

    // Relies on consecutive sequence numbers: within one value, SequenceNumber - i
    // stays constant for as long as the numbers are consecutive.
    var query = data
        .GroupBy(x => x.Value)
        .Select(g => new
        {
            Value = g.Key,
            Sequences = g
                .OrderBy(x => x.SequenceNumber)
                .Select((x, i) => new
                {
                    x.SequenceNumber,
                    OffsetSequenceNumber = x.SequenceNumber - i
                })
                .GroupBy(x => x.OffsetSequenceNumber)
                // Inner parameter renamed from g to seq to avoid shadowing the outer g.
                .Select(seq => seq
                    .Select(x => x.SequenceNumber)
                    .OrderBy(x => x)
                    .ToList())
                .ToList()
        })
        .SelectMany(x => x.Sequences
            .Select(s => new { First = s.First(), Last = s.Last(), x.Value }))
        .OrderBy(x => x.First)
        .ToList();
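For comparison, the imperative version that this answer alludes to might look roughly like the sketch below. It is not taken from the answer: the ImperativeRuns class and the Collapse method are hypothetical names, Cls is repeated only so that the block compiles on its own, and the input is assumed to be ordered by SequenceNumber already.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the Cls class from the question, repeated for self-containment.
public class Cls
{
    public int SequenceNumber { get; set; }
    public int Value { get; set; }
}

// Hypothetical helper, introduced only for illustration.
public static class ImperativeRuns
{
    // Collapses consecutive items with equal Value into (From, To, Value) runs.
    // Assumption: items are already ordered by SequenceNumber.
    public static List<(int From, int To, int Value)> Collapse(IEnumerable<Cls> items)
    {
        var runs = new List<(int From, int To, int Value)>();
        foreach (var item in items)
        {
            var last = runs.Count - 1;
            if (last >= 0 && runs[last].Value == item.Value)
            {
                // Same value as the current run: move its upper bound.
                runs[last] = (runs[last].From, item.SequenceNumber, item.Value);
            }
            else
            {
                // First item or a changed value: start a new run.
                runs.Add((item.SequenceNumber, item.SequenceNumber, item.Value));
            }
        }
        return runs;
    }

    public static void Main()
    {
        var values = new[] { 9, 9, 15, 15, 15, 30, 9 };
        var items = new List<Cls>();
        for (var i = 0; i < values.Length; i++)
            items.Add(new Cls { SequenceNumber = i + 1, Value = values[i] });

        foreach (var run in Collapse(items))
            Console.WriteLine("{0} - {1} : {2}", run.From, run.To, run.Value);
    }
}
```

Unlike the offset-based query, this loop makes a single pass, does not require the sequence numbers to be consecutive, and returns an empty list for empty input.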