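Using LINQ to Objects to find items in one collection that do not match another

I want to find all items in one collection that do not match any item in another collection. The collections are not of the same type, though; I want to write a lambda expression to specify equality.

A LINQPad example of what I'm trying to do: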

    void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };

        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        var nonManagers =
            from employee in employees
            where !(managers.Any(x => x.EmployeeId == employee.Id))
            select employee;

        nonManagers.Dump();

        // Based on cdonner's answer:
        var nonManagers2 =
            from employee in employees
            join manager in managers
                on employee.Id equals manager.EmployeeId
                into tempManagers
            from manager in tempManagers.DefaultIfEmpty()
            where manager == null
            select employee;

        nonManagers2.Dump();

        // Based on Richard Hein's answer:
        var nonManagers3 = employees.Except(
            from employee in employees
            join manager in managers
                on employee.Id equals manager.EmployeeId
            select employee);

        nonManagers3.Dump();
    }

    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class Manager
    {
        public int EmployeeId { get; set; }
    }

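The above works and returns Employee Bill (#10). It does not seem elegant, though, and it may be inefficient with larger collections. In SQL I would probably do a LEFT JOIN and look for rows where the second ID is NULL. What is the best practice for doing this in LINQ?

EDIT: Updated to prevent solutions that depend on the Id being equal to the index.

EDIT: Added cdonner's solution. Does anybody have anything simpler?

EDIT: Added a variant of Richard Hein's answer, which is currently my favorite. Thanks to everyone for some excellent answers!

This is almost the same as some of the other examples, but with less code: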

 employees.Except(employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e)); 

It is no simpler, however, than employees.Where(e => !managers.Any(m => m.EmployeeId == e.Id)) or your original query syntax.

    /// <summary>
    /// This method returns items in a set that are not in
    /// another set of a different type
    /// </summary>
    public static IEnumerable<TItem> Except<TItem, TOther, TKey>(
        this IEnumerable<TItem> items,
        IEnumerable<TOther> other,
        Func<TItem, TKey> getItemKey,
        Func<TOther, TKey> getOtherKey)
    {
        // Left join items to other on the key, then keep only the items
        // for which no matching element was found.
        return from item in items
               join otherItem in other
                   on getItemKey(item) equals getOtherKey(otherItem)
                   into tempItems
               from temp in tempItems.DefaultIfEmpty()
               where ReferenceEquals(null, temp) || temp.Equals(default(TOther))
               select item;
    }

I don't remember where I found this method.

    var nonManagers = (from e1 in employees
                       select e1).Except(
                           from m in managers
                           from e2 in employees
                           where m.EmployeeId == e2.Id
                           select e2);

    var nonmanagers = employees.Select(e => e.Id)
        .Except(managers.Select(m => m.EmployeeId))
        .Select(id => employees.Single(e => e.Id == id));

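This is a bit late (I know).

I was looking at the same problem and considered using a HashSet because of various performance hints in that direction, including @Skeet's answer to "Intersection of multiple lists with IEnumerable.Intersect()". I also asked around my office, and the consensus was that a HashSet would be faster and more readable: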

    HashSet<int> managerIds = new HashSet<int>(managers.Select(x => x.EmployeeId));
    var nonManagers4 = employees.Where(x => !managerIds.Contains(x.Id)).ToList();

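Then I arrived at an even faster solution that uses a native array as a kind of mask (the syntax of the native array queries would put me off using them, though, except for extreme performance reasons).

To give this answer a bit of credibility after such a long time, I have extended your LINQPad program and data with timings, so that you can compare what are now six options: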
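In isolation, the mask idea might look like the sketch below. It is only an illustration: the `MaskExample` class and `IdsNotIn` method are hypothetical names, and it assumes every ID is non-negative and small enough to be used as an array index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaskExample
{
    // Returns the IDs from candidateIds that do not appear in excludedIds.
    // Assumption: all IDs are non-negative and small enough to index an array.
    public static List<int> IdsNotIn(IEnumerable<int> candidateIds,
                                     IReadOnlyCollection<int> excludedIds)
    {
        if (excludedIds.Count == 0)
        {
            return candidateIds.ToList();
        }

        // mask[id] is true when id must be excluded.
        var mask = new bool[excludedIds.Max() + 1];
        foreach (var id in excludedIds)
        {
            mask[id] = true;
        }

        // IDs beyond the end of the mask cannot have been excluded.
        return candidateIds.Where(id => id >= mask.Length || !mask[id]).ToList();
    }

    public static void Main()
    {
        var result = IdsNotIn(new[] { 20, 10, 30, 48 }, new[] { 20, 30 });
        Console.WriteLine(string.Join(", ", result)); // prints "10, 48"
    }
}
```

The array costs memory proportional to the largest excluded ID, so this probably only pays off when IDs are dense and small.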

    void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Kirk NM" },
            new Employee { Id = 48, Name = "Rick NM" },
            new Employee { Id = 42, Name = "Dick" },
            new Employee { Id = 43, Name = "Harry" },
            new Employee { Id = 44, Name = "Joe" },
            new Employee { Id = 45, Name = "Steve NM" },
            new Employee { Id = 46, Name = "Jim NM" },
            new Employee { Id = 30, Name = "Frank" },
            new Employee { Id = 47, Name = "Dave NM" },
            new Employee { Id = 49, Name = "Alex NM" },
            new Employee { Id = 50, Name = "Phil NM" },
            new Employee { Id = 51, Name = "Ed NM" },
            new Employee { Id = 52, Name = "Ollie NM" },
            new Employee { Id = 41, Name = "Bill" },
            new Employee { Id = 53, Name = "John NM" },
            new Employee { Id = 54, Name = "Simon NM" }
        };

        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 },
            new Manager { EmployeeId = 41 },
            new Manager { EmployeeId = 42 },
            new Manager { EmployeeId = 43 },
            new Manager { EmployeeId = 44 }
        };

        System.Diagnostics.Stopwatch watch1 = new System.Diagnostics.Stopwatch();
        int max = 1000000;

        // Option 1: Where + Any
        watch1.Start();
        List<Employee> nonManagers1 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            nonManagers1 = (from employee in employees
                            where !(managers.Any(x => x.EmployeeId == employee.Id))
                            select employee).ToList();
        }
        nonManagers1.Dump();
        watch1.Stop();
        Console.WriteLine("Any: " + watch1.ElapsedMilliseconds);

        // Option 2: left join, keep rows without a match
        watch1.Restart();
        List<Employee> nonManagers2 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            nonManagers2 = (from employee in employees
                            join manager in managers
                                on employee.Id equals manager.EmployeeId
                                into tempManagers
                            from manager in tempManagers.DefaultIfEmpty()
                            where manager == null
                            select employee).ToList();
        }
        nonManagers2.Dump();
        watch1.Stop();
        Console.WriteLine("temp table: " + watch1.ElapsedMilliseconds);

        // Option 3: Except over an inner join
        watch1.Restart();
        List<Employee> nonManagers3 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            nonManagers3 = employees.Except(
                employees.Join(managers, e => e.Id, m => m.EmployeeId, (e, m) => e)).ToList();
        }
        nonManagers3.Dump();
        watch1.Stop();
        Console.WriteLine("Except: " + watch1.ElapsedMilliseconds);

        // Option 4: HashSet of manager IDs
        watch1.Restart();
        List<Employee> nonManagers4 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            HashSet<int> managerIds = new HashSet<int>(managers.Select(x => x.EmployeeId));
            nonManagers4 = employees.Where(x => !managerIds.Contains(x.Id)).ToList();
        }
        nonManagers4.Dump();
        watch1.Stop();
        Console.WriteLine("HashSet: " + watch1.ElapsedMilliseconds);

        // Option 5: bool array used as a mask, indexed by EmployeeId
        watch1.Restart();
        List<Employee> nonManagers5 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            bool[] test = new bool[managers.Max(x => x.EmployeeId) + 1];
            foreach (var manager in managers)
            {
                test[manager.EmployeeId] = true;
            }
            nonManagers5 = employees.Where(x => x.Id > test.Length - 1 || !test[x.Id]).ToList();
        }
        nonManagers5.Dump();
        watch1.Stop();
        Console.WriteLine("Native array call: " + watch1.ElapsedMilliseconds);

        // Option 6: same mask approach, timed a second time
        watch1.Restart();
        List<Employee> nonManagers6 = new List<Employee>();
        foreach (var item in Enumerable.Range(1, max))
        {
            bool[] test = new bool[managers.Max(x => x.EmployeeId) + 1];
            foreach (var manager in managers)
            {
                test[manager.EmployeeId] = true;
            }
            nonManagers6 = employees.Where(x => x.Id > test.Length - 1 || !test[x.Id]).ToList();
        }
        nonManagers6.Dump();
        watch1.Stop();
        Console.WriteLine("Native array call 2: " + watch1.ElapsedMilliseconds);
    }

    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class Manager
    {
        public int EmployeeId { get; set; }
    }

Take a look at the Except() LINQ function. It does exactly what you need.

It would be better to left join the items and filter with a null condition:
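Note that Except() itself compares two sequences of the same type, so with Employee and Manager it needs some projection to a common key. As a minimal sketch, assuming .NET 6 or later is available, Enumerable.ExceptBy can take the manager IDs directly; the `ExceptByExample` class and `NonManagerNames` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Manager
{
    public int EmployeeId { get; set; }
}

public static class ExceptByExample
{
    // Names of the employees whose Id is not among the managers' EmployeeId values.
    // Requires .NET 6 or later for Enumerable.ExceptBy.
    public static List<string> NonManagerNames(IEnumerable<Employee> employees,
                                               IEnumerable<Manager> managers)
    {
        return employees
            .ExceptBy(managers.Select(m => m.EmployeeId), e => e.Id)
            .Select(e => e.Name)
            .ToList();
    }

    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managers = new[]
        {
            new Manager { EmployeeId = 20 },
            new Manager { EmployeeId = 30 }
        };

        Console.WriteLine(string.Join(", ", NonManagerNames(employees, managers))); // prints "Bill"
    }
}
```

Like Except(), ExceptBy yields each distinct key only once, so employees sharing an Id would be collapsed to the first one.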

    var finalcertificates =
        (from globCert in resultCertificate
         join toExcludeCert in certificatesToExclude
             on globCert.CertificateId equals toExcludeCert.CertificateId
             into certs
         from toExcludeCert in certs.DefaultIfEmpty()
         where toExcludeCert == null
         select globCert)
        .Union(currentCertificate)
        .Distinct()
        .OrderBy(cert => cert.CertificateName);

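Managers are employees too! So the Manager class should be a subclass of the Employee class (or, if you don't like that, both should derive from a common parent class, or you could create a NonManager class).

Then your problem is as simple as implementing the IEquatable interface on the Employee superclass (for GetHashCode, just return the EmployeeID) and then using this code: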

 var nonManagerEmployees = employeeList.Except(managerList);
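A minimal sketch of what that could look like, assuming Manager derives from Employee and that equality is defined by Id alone. The class layout and the `InheritanceExample` and `NonManagerNames` names are illustrative assumptions, not code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee : IEquatable<Employee>
{
    public int Id { get; set; }
    public string Name { get; set; }

    // Two employees are considered equal when their Ids match.
    public bool Equals(Employee other) => other != null && other.Id == Id;

    public override bool Equals(object obj) => Equals(obj as Employee);

    public override int GetHashCode() => Id;
}

// Assumption: a manager is modelled as a kind of employee.
public class Manager : Employee
{
}

public static class InheritanceExample
{
    public static List<string> NonManagerNames(IEnumerable<Employee> employeeList,
                                               IEnumerable<Manager> managerList)
    {
        // Except uses Employee.Equals and GetHashCode, so a Manager with the
        // same Id removes the corresponding Employee.
        var nonManagerEmployees = employeeList.Except(managerList);
        return nonManagerEmployees.Select(e => e.Name).ToList();
    }

    public static void Main()
    {
        var employeeList = new List<Employee>
        {
            new Employee { Id = 20, Name = "Bob" },
            new Employee { Id = 10, Name = "Bill" },
            new Employee { Id = 30, Name = "Frank" }
        };
        var managerList = new List<Manager>
        {
            new Manager { Id = 20 },
            new Manager { Id = 30 }
        };

        Console.WriteLine(string.Join(", ", NonManagerNames(employeeList, managerList))); // prints "Bill"
    }
}
```

Passing a sequence of Manager where a sequence of Employee is expected relies on the covariance of IEnumerable<T>, which works for reference types.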