Removing duplicates from a list of int arrays

I have a list of int arrays, for example:

    List<int[]> intArrList = new List<int[]>();
    intArrList.Add(new int[3] { 0, 0, 0 });
    intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
    intArrList.Add(new int[3] { 1, 2, 5 });
    intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
    intArrList.Add(new int[3] { 12, 22, 54 });
    intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
    intArrList.Add(new int[4] { 0, 0, 0, 0 });

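How would you remove the duplicates? (By duplicate I mean list elements that have the same length and the same numbers.)

In the example, I would remove one of the { 20, 30, 10, 4, 6 } elements because it appears twice.

I was thinking of sorting the list by element size and then looping over each element to compare it against the rest, but I am not sure how to do that.

Another question: would it be better to use some other structure, such as a hash? If so, how would I use it?

Use GroupBy: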

    var result = intArrList.GroupBy(c => String.Join(",", c))
                           .Select(c => c.First().ToList())
                           .ToList();

Result:

{0,0,0}

{20,30,10,4,6}

{1,2,5}

{12,22,54}

{1,2,6,7,8}

{0,0,0,0}
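
To see why this works: arrays are reference types, so two int[] instances with identical contents are not equal by default, whereas the joined strings compare by value. The following is a minimal self-contained sketch, not the answerer's exact code; the class name GroupByDemo and the helper DistinctByContent are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Hypothetical helper: keeps the first array seen for each distinct content key.
    public static List<int[]> DistinctByContent(IEnumerable<int[]> source)
    {
        return source
            .GroupBy(a => string.Join(",", a)) // e.g. "20,30,10,4,6" becomes the group key
            .Select(g => g.First())
            .ToList();
    }

    public static void Main()
    {
        var intArrList = new List<int[]>
        {
            new[] { 0, 0, 0 },
            new[] { 20, 30, 10, 4, 6 },
            new[] { 1, 2, 5 },
            new[] { 20, 30, 10, 4, 6 },
            new[] { 0, 0, 0, 0 }
        };

        // Print each surviving array in the same {a,b,c} style as the result above.
        foreach (var arr in DistinctByContent(intArrList))
            Console.WriteLine("{" + string.Join(",", arr) + "}");
    }
}
```

The separator matters: without it, { 1, 23 } and { 12, 3 } would both produce the key "123" and be merged by mistake.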

Edit: If you want {1,2,3,4} to be equal to {2,3,4,1}, you need to use OrderBy like this:

    var result = intArrList.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
                           .Select(c => c.First().ToList())
                           .ToList();

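Edit 2: To help understand how the LINQ GroupBy solution works, consider the following method:

    public List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
    {
        // Key: the sorted elements joined into a string; value: the first array with that key.
        var dic = new Dictionary<string, int[]>();
        foreach (var item in lst)
        {
            string key = string.Join(",", item.OrderBy(c => c));
            if (!dic.ContainsKey(key))
            {
                dic.Add(key, item);
            }
        }
        return dic.Values.ToList();
    }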

You can define your own IEqualityComparer implementation and use it with IEnumerable.Distinct:

    class MyComparer : IEqualityComparer<int[]>
    {
        // A constant hash is valid but slow: every array lands in the same bucket.
        public int GetHashCode(int[] instance) { return 0; } // TODO: better HashCode for arrays

        public bool Equals(int[] instance, int[] other)
        {
            if (other == null || instance == null || instance.Length != other.Length) return false;
            return instance.SequenceEqual(other);
        }
    }

Now write this code to get only the distinct values of the list:

    var result = intArrList.Distinct(new MyComparer());

However, if you also want different permutations to be treated as equal, implement the comparer this way:

    public bool Equals(int[] instance, int[] other)
    {
        if (ReferenceEquals(instance, other)) return true; // this will return true when both arrays are NULL
        if (other == null || instance == null) return false;
        return instance.All(x => other.Contains(x)) && other.All(x => instance.Contains(x));
    }
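
One caveat on the permutation version above: All/Contains only checks membership, so { 1, 1, 2 } and { 1, 2, 2 } compare equal. If repeated values should count, a sketch along the following lines compares sorted copies instead and pairs that with an order-independent hash. The class names PermutationComparer and PermutationDemo are hypothetical, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two arrays are equal when one is a permutation of the other.
public sealed class PermutationComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        // Sorting both sides ignores order but still respects how often each value occurs.
        return x.OrderBy(v => v).SequenceEqual(y.OrderBy(v => v));
    }

    public int GetHashCode(int[] obj)
    {
        if (obj == null) return 0;
        // Addition is commutative, so every permutation of an array hashes to the same value.
        int hc = obj.Length;
        foreach (int v in obj) hc = unchecked(hc + v * 31);
        return hc;
    }
}

public static class PermutationDemo
{
    public static void Main()
    {
        var list = new List<int[]>
        {
            new[] { 1, 2, 3, 4 },
            new[] { 2, 3, 4, 1 }, // permutation of the first array
            new[] { 1, 1, 2 },
            new[] { 1, 2, 2 }     // same values as the third, different counts
        };
        Console.WriteLine(list.Distinct(new PermutationComparer()).Count()); // prints 3
    }
}
```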

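Edit: For a better GetHashCode implementation, you can have a look at this post, which is also suggested in @Mick's answer.

Code lifted from here and here. A more generic GetHashCode implementation would make this more general, but I believe the implementation below is the most robust: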

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            List<int[]> intArrList = new List<int[]>();
            intArrList.Add(new int[3] { 0, 0, 0 });
            intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
            intArrList.Add(new int[3] { 1, 2, 5 });
            intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
            intArrList.Add(new int[3] { 12, 22, 54 });
            intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
            intArrList.Add(new int[4] { 0, 0, 0, 0 });

            var test = intArrList.Distinct(new IntArrayEqualityComparer());
            Console.WriteLine(test.Count());       // distinct arrays
            Console.WriteLine(intArrList.Count()); // all arrays
        }

        public class IntArrayEqualityComparer : IEqualityComparer<int[]>
        {
            public bool Equals(int[] x, int[] y)
            {
                return ArraysEqual(x, y);
            }

            // Combines the length and every element, so order and size both affect the hash.
            public int GetHashCode(int[] obj)
            {
                int hc = obj.Length;
                for (int i = 0; i < obj.Length; ++i)
                {
                    hc = unchecked(hc * 17 + obj[i]);
                }
                return hc;
            }

            static bool ArraysEqual<T>(T[] a1, T[] a2)
            {
                if (ReferenceEquals(a1, a2)) return true;
                if (a1 == null || a2 == null) return false;
                if (a1.Length != a2.Length) return false;

                EqualityComparer<T> comparer = EqualityComparer<T>.Default;
                for (int i = 0; i < a1.Length; i++)
                {
                    if (!comparer.Equals(a1[i], a2[i])) return false;
                }
                return true;
            }
        }
    }

Edit: A generic implementation of IEqualityComparer that works for arrays of any type:

    public class ArrayEqualityComparer<T> : IEqualityComparer<T[]>
    {
        public bool Equals(T[] x, T[] y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x == null || y == null) return false;
            if (x.Length != y.Length) return false;

            EqualityComparer<T> comparer = EqualityComparer<T>.Default;
            for (int i = 0; i < x.Length; i++)
            {
                if (!comparer.Equals(x[i], y[i])) return false;
            }
            return true;
        }

        // Note: obj[i].GetHashCode() throws if an element is null.
        public int GetHashCode(T[] obj)
        {
            int hc = obj.Length;
            for (int i = 0; i < obj.Length; ++i)
            {
                hc = unchecked(hc * 17 + obj[i].GetHashCode());
            }
            return hc;
        }
    }

Edit 2: If the order of the integers inside the arrays does not matter, I would do:

    var comparer = new IntArrayEqualityComparer();
    var test = intArrList.Select(a => a.OrderBy(e => e).ToArray()).Distinct(comparer).ToList();

A plain nested-loop approach without LINQ also works:

    // Walk backwards so that RemoveAt never shifts an index that still has to be visited.
    for (int i = intArrList.Count - 1; i >= 0; i--)
    {
        for (int j = i + 1; j < intArrList.Count; j++)
        {
            if (intArrList[i].Length != intArrList[j].Length) continue;

            // Count matching positions; stop at the first mismatch.
            var cnt = 0;
            for (int k = 0; k < intArrList[i].Length; k++)
            {
                if (intArrList[i][k] == intArrList[j][k]) cnt++;
                else break;
            }

            if (cnt == intArrList[i].Length)
            {
                intArrList.RemoveAt(i); // a later copy exists, so drop this one
                break;
            }
        }
    }

Comparing @S.Akbari's and @Mick's solutions using BenchmarkDotNet

Edit:

SAkbari_FindDistinctWithoutLinq makes a redundant call to ContainsKey, so I added a faster version: SAkbari_FindDistinctWithoutLinq2

                               Method |     Mean |     Error |    StdDev |
    --------------------------------- |---------:|----------:|----------:|
      SAkbari_FindDistinctWithoutLinq | 4.021 us | 0.0723 us | 0.0676 us |
     SAkbari_FindDistinctWithoutLinq2 | 3.930 us | 0.0529 us | 0.0495 us |
             SAkbari_FindDistinctLinq | 5.597 us | 0.0264 us | 0.0234 us |
                Mick_UsingGetHashCode | 6.339 us | 0.0265 us | 0.0248 us |

    BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.248)
    Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
    Frequency=3515625 Hz, Resolution=284.4444 ns, Timer=TSC
    .NET Core SDK=2.1.100
      [Host]     : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT
      DefaultJob : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT

Benchmark code:

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    using System;
    using System.Collections.Generic;
    using System.Linq;

    namespace ConsoleApp1
    {
        public class Program
        {
            List<int[]> intArrList = new List<int[]>
            {
                new int[] { 0, 0, 0 },
                new int[] { 20, 30, 10, 4, 6 }, //this
                new int[] { 1, 2, 5 },
                new int[] { 20, 30, 10, 4, 6 }, //this
                new int[] { 12, 22, 54 },
                new int[] { 1, 2, 6, 7, 8 },
                new int[] { 0, 0, 0, 0 }
            };

            [Benchmark]
            public List<int[]> SAkbari_FindDistinctWithoutLinq() => FindDistinctWithoutLinq(intArrList);

            [Benchmark]
            public List<int[]> SAkbari_FindDistinctWithoutLinq2() => FindDistinctWithoutLinq2(intArrList);

            [Benchmark]
            public List<int[]> SAkbari_FindDistinctLinq() => FindDistinctLinq(intArrList);

            // The listing as posted called FindDistinctLinq here, so the Mick_UsingGetHashCode
            // row above may need to be re-measured with this call.
            [Benchmark]
            public List<int[]> Mick_UsingGetHashCode() => UsingGetHashCode(intArrList);

            static void Main(string[] args)
            {
                var summary = BenchmarkRunner.Run<Program>();
            }

            public static List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
            {
                var dic = new Dictionary<string, int[]>();
                foreach (var item in lst)
                {
                    string key = string.Join(",", item.OrderBy(c => c));
                    if (!dic.ContainsKey(key))
                    {
                        dic.Add(key, item);
                    }
                }
                return dic.Values.ToList();
            }

            public static List<int[]> FindDistinctWithoutLinq2(List<int[]> lst)
            {
                var dic = new Dictionary<string, int[]>();
                // TryAdd does the lookup and the insert in one call.
                foreach (var item in lst)
                    dic.TryAdd(string.Join(",", item.OrderBy(c => c)), item);
                return dic.Values.ToList();
            }

            public static List<int[]> FindDistinctLinq(List<int[]> lst)
            {
                return lst.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
                          .Select(c => c.First().ToArray()).ToList();
            }

            public static List<int[]> UsingGetHashCode(List<int[]> lst)
            {
                return lst.Select(a => a.OrderBy(e => e).ToArray())
                          .Distinct(new IntArrayEqualityComparer()).ToList();
            }
        }

        public class IntArrayEqualityComparer : IEqualityComparer<int[]>
        {
            public bool Equals(int[] x, int[] y)
            {
                return ArraysEqual(x, y);
            }

            public int GetHashCode(int[] obj)
            {
                int hc = obj.Length;
                for (int i = 0; i < obj.Length; ++i)
                {
                    hc = unchecked(hc * 17 + obj[i]);
                }
                return hc;
            }

            static bool ArraysEqual<T>(T[] a1, T[] a2)
            {
                if (ReferenceEquals(a1, a2)) return true;
                if (a1 == null || a2 == null) return false;
                if (a1.Length != a2.Length) return false;

                EqualityComparer<T> comparer = EqualityComparer<T>.Default;
                for (int i = 0; i < a1.Length; i++)
                {
                    if (!comparer.Equals(a1[i], a2[i])) return false;
                }
                return true;
            }
        }
    }

Input list:

    List<List<int>> initList = new List<List<int>>();
    initList.Add(new List<int> { 0, 0, 0 });
    initList.Add(new List<int> { 20, 30, 10, 4, 6 }); //this
    initList.Add(new List<int> { 1, 2, 5 });
    initList.Add(new List<int> { 20, 30, 10, 4, 6 }); //this
    initList.Add(new List<int> { 12, 22, 54 });
    initList.Add(new List<int> { 1, 2, 6, 7, 8 });
    initList.Add(new List<int> { 0, 0, 0, 0 });

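You can build a result list and, before adding each element, check whether an equal one has already been added. I simply compare the list counts and use p.Except(item).Any() calls to check whether one list contains elements the other does not.

    List<List<int>> returnList = new List<List<int>>();
    foreach (var item in initList)
    {
        // Add the item only if no list already in the result has the same count and the same elements.
        if (returnList.Where(p => !p.Except(item).Any()
                                  && !item.Except(p).Any()
                                  && p.Count() == item.Count()).Count() == 0)
            returnList.Add(item);
    }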
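
It may help to see exactly what this check treats as equal. Except is a set operation, so the test ignores element order and also how often each value is repeated; only the Count comparison looks at size. The sketch below isolates the predicate in a hypothetical helper named LooksEqual (the name and the demo class are assumptions for illustration).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptCaveatDemo
{
    // Same test as in the loop above: equal counts and no element missing on either side.
    public static bool LooksEqual(List<int> p, List<int> item)
    {
        return p.Count == item.Count && !p.Except(item).Any() && !item.Except(p).Any();
    }

    public static void Main()
    {
        // Order is ignored.
        Console.WriteLine(LooksEqual(new List<int> { 1, 2, 5 }, new List<int> { 5, 1, 2 }));    // True
        // Repetition is ignored as long as the counts match.
        Console.WriteLine(LooksEqual(new List<int> { 1, 1, 2 }, new List<int> { 1, 2, 2 }));    // True
        // Different lengths are told apart by the Count comparison.
        Console.WriteLine(LooksEqual(new List<int> { 0, 0, 0 }, new List<int> { 0, 0, 0, 0 })); // False
    }
}
```

If order or repetition should matter, SequenceEqual is the stricter alternative.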

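You can use a HashSet. A HashSet is a collection that guarantees uniqueness, and it lets you compare sets and run operations such as Intersect and Union on the items.

Pros: no duplicates, easy to manipulate groups of data, more efficient.
Cons: you cannot access a specific item by index; for example, list[0] does not work on a HashSet. You can only enumerate the items, e.g. with foreach.

Here is an example: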

    using System;
    using System.Collections.Generic;

    namespace ConsoleApp2
    {
        class Program
        {
            static void Main(string[] args)
            {
                HashSet<HashSet<int>> intArrList = new HashSet<HashSet<int>>(new HashSetIntComparer());
                intArrList.Add(new HashSet<int>(3) { 0, 0, 0 });
                intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
                intArrList.Add(new HashSet<int>(3) { 1, 2, 5 });
                intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
                intArrList.Add(new HashSet<int>(3) { 12, 22, 54 });
                intArrList.Add(new HashSet<int>(5) { 1, 2, 6, 7, 8 });
                intArrList.Add(new HashSet<int>(4) { 0, 0, 0, 0 });

                // Checking the output
                foreach (var item in intArrList)
                {
                    foreach (var subHashSet in item)
                    {
                        Console.Write("{0} ", subHashSet);
                    }
                    Console.WriteLine();
                }
                Console.Read();
            }

            private class HashSetIntComparer : IEqualityComparer<HashSet<int>>
            {
                public bool Equals(HashSet<int> x, HashSet<int> y)
                {
                    // SetEquals does not set anything; despite the name,
                    // it compares the contents of the two sets.
                    return x.SetEquals(y);
                }

                public int GetHashCode(HashSet<int> obj)
                {
                    // TODO: implement a better hash code. base.GetHashCode() is the comparer's
                    // own hash, so every set lands in one bucket and Equals decides.
                    return base.GetHashCode();
                }
            }
        }
    }

Output:

    0
    20 30 10 4 6
    1 2 5
    12 22 54
    1 2 6 7 8

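Note: since 0 is repeated several times, the HashSet stores it as a single 0. If you need to distinguish between 0 0 0 0 and 0 0 0, you can replace HashSet<HashSet<int>> with HashSet<List<int>> and implement the comparer for List instead.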
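
The note above can be sketched as follows. This is one possible way to write such a comparer, not the answerer's code; ListIntComparer, HashSetOfListsDemo and Build are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: lists are equal when they hold the same numbers in the same order.
public sealed class ListIntComparer : IEqualityComparer<List<int>>
{
    public bool Equals(List<int> x, List<int> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y); // also false when the lengths differ
    }

    public int GetHashCode(List<int> obj)
    {
        if (obj == null) return 0;
        // Mix the count and every element so that 0 0 0 and 0 0 0 0 hash differently.
        int hc = obj.Count;
        foreach (int v in obj) hc = unchecked(hc * 17 + v);
        return hc;
    }
}

public static class HashSetOfListsDemo
{
    public static HashSet<List<int>> Build(IEnumerable<List<int>> source)
    {
        // The HashSet ignores any list the comparer considers already present.
        return new HashSet<List<int>>(source, new ListIntComparer());
    }

    public static void Main()
    {
        var set = Build(new[]
        {
            new List<int> { 0, 0, 0 },
            new List<int> { 20, 30, 10, 4, 6 },
            new List<int> { 20, 30, 10, 4, 6 }, // duplicate, dropped by the set
            new List<int> { 0, 0, 0, 0 }        // kept: one element longer than the first list
        });

        foreach (var item in set)
            Console.WriteLine(string.Join(" ", item));
    }
}
```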

You can use this link to learn how to compare lists: https://social.msdn.microsoft.com/Forums/en-US/2ff3016c-bd61-4fec-8f8c-7b6c070123fa/c-compare-two-lists-of-objects?forum=csharplanguage

If you want to learn more about collections and data types, this course is a good place to start: https://app.pluralsight.com/player?course=ccsharp-collections&author=verson-robinson&name=ccsharp-collections-fundamentals-m9-sets&clip=1&mode=live

With MoreLINQ, this becomes very simple using DistinctBy.

 var result = intArrList.DistinctBy(x => string.Join(",", x)); 

As in the GroupBy answer, if you want distinct arrays regardless of order, just sort the elements inside the join.

 var result = intArrList.DistinctBy(x => string.Join(",", x.OrderBy(y => y))); 

Edit: This is how it is implemented:

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        return _();

        // Local iterator, so the argument checks above run eagerly.
        IEnumerable<TSource> _()
        {
            var knownKeys = new HashSet<TKey>(comparer);
            foreach (var element in source)
            {
                // Add returns false when the key was already seen.
                if (knownKeys.Add(keySelector(element)))
                    yield return element;
            }
        }
    }

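So, if you don't need MoreLINQ, you can use a method like this:

    private static IEnumerable<int[]> GetUniqueArrays(IEnumerable<int[]> source)
    {
        var knownKeys = new HashSet<string>();
        foreach (var element in source)
        {
            // Yield an array only the first time its joined key is seen.
            if (knownKeys.Add(string.Join(",", element)))
                yield return element;
        }
    }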
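
For readers on a newer runtime: assuming .NET 6 or later, System.Linq ships its own Enumerable.DistinctBy, so the same key-based approach works without MoreLINQ or a hand-written helper. A minimal sketch under that assumption (BuiltInDistinctByDemo and Unique are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInDistinctByDemo
{
    // Assumes .NET 6+, where System.Linq.Enumerable.DistinctBy is available.
    public static List<int[]> Unique(IEnumerable<int[]> source)
    {
        // Keeps the first array for each distinct joined key, preserving input order.
        return source.DistinctBy(a => string.Join(",", a)).ToList();
    }

    public static void Main()
    {
        var intArrList = new List<int[]>
        {
            new[] { 0, 0, 0 },
            new[] { 20, 30, 10, 4, 6 },
            new[] { 1, 2, 5 },
            new[] { 20, 30, 10, 4, 6 },
            new[] { 0, 0, 0, 0 }
        };

        foreach (var arr in Unique(intArrList))
            Console.WriteLine(string.Join(",", arr));
    }
}
```

If MoreLINQ is referenced in the same file, the two DistinctBy extension methods can clash, so it may be necessary to import only one of them.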