Efficient way to generate combinations ordered by increasing sum of indexes

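For a heuristic algorithm, I need to evaluate the combinations of a certain set one after another until I reach a stop criterion.

Since there are a lot of them, at the moment I'm generating them with the following memory-efficient iterator block (inspired by Python's itertools.combinations):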

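    public static IEnumerable<T[]> GetCombinations<T>(this IList<T> pool, int r)
    {
        int n = pool.Count;
        if (r > n)
            throw new ArgumentException("r cannot be greater than pool size");

        // Start from the first combination of indexes: 0, 1, ..., r - 1.
        int[] indices = Enumerable.Range(0, r).ToArray();
        yield return indices.Select(idx => pool[idx]).ToArray();

        while (true)
        {
            // Find the rightmost index that has not reached its maximum value.
            int i;
            for (i = r - 1; i >= 0; i--)
                if (indices[i] != i + n - r)
                    break;
            if (i < 0)
                break;

            // Advance it and reset every index to its right.
            indices[i] += 1;
            for (int j = i + 1; j < r; j++)
                indices[j] = indices[j - 1] + 1;
            yield return indices.Select(idx => pool[idx]).ToArray();
        }
    }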

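The problem is that, to greatly improve the efficiency of my heuristic, I need to generate these combinations sorted by the sum of their indexes (in other words, I need to generate first the combinations containing the first elements of the set).

For example, consider the set S = {0,1,2,3,4,5}
(I chose this set for simplicity, since the elements and their indexes coincide).
All the possible combinations of r=4 numbers generated by the given algorithm are: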

    (0, 1, 2, 3) SUM: 6
    (0, 1, 2, 4) SUM: 7
    (0, 1, 2, 5) SUM: 8
    (0, 1, 3, 4) SUM: 8
    (0, 1, 3, 5) SUM: 9
    (0, 1, 4, 5) SUM: 10
    (0, 2, 3, 4) SUM: 9
    (0, 2, 3, 5) SUM: 10
    (0, 2, 4, 5) SUM: 11
    (0, 3, 4, 5) SUM: 12
    (1, 2, 3, 4) SUM: 10
    (1, 2, 3, 5) SUM: 11
    (1, 2, 4, 5) SUM: 12
    (1, 3, 4, 5) SUM: 13
    (2, 3, 4, 5) SUM: 14

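As you can see, the combinations are not sorted by ascending sum.

The desired result is instead the following
(the order of combinations having the same sum is not important):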

    (0, 1, 2, 3) SUM: 6
    (0, 1, 2, 4) SUM: 7
    (0, 1, 2, 5) SUM: 8
    (0, 1, 3, 4) SUM: 8
    (0, 1, 3, 5) SUM: 9
    (0, 2, 3, 4) SUM: 9
    (0, 1, 4, 5) SUM: 10
    (0, 2, 3, 5) SUM: 10
    (1, 2, 3, 4) SUM: 10
    (0, 2, 4, 5) SUM: 11
    (1, 2, 3, 5) SUM: 11
    (0, 3, 4, 5) SUM: 12
    (1, 2, 4, 5) SUM: 12
    (1, 3, 4, 5) SUM: 13
    (2, 3, 4, 5) SUM: 14

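A trivial solution would be to generate all the combinations and then sort them by their sum, but this is not really efficient or feasible, since the number of combinations becomes huge as n grows.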
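For reference, the trivial approach can be sketched in a few lines. This is only a baseline, written in Python for brevity (the iterator above is itself modeled on itertools.combinations); the function name is made up for illustration. It materialises every combination before returning anything, which is exactly the cost the question wants to avoid, but it is handy for checking a smarter generator on small inputs.

```python
from itertools import combinations


def combinations_sorted_by_index_sum_naive(pool, r):
    """Build every r-combination of indexes, then sort them by index sum.

    Hypothetical helper for illustration only: it needs memory for all
    C(n, r) combinations, so it is usable only for small n.
    """
    # sorted() is stable, so combinations with equal sums keep the
    # lexicographic order in which itertools.combinations produced them.
    index_tuples = sorted(combinations(range(len(pool)), r), key=sum)
    return [tuple(pool[i] for i in idx) for idx in index_tuples]
```

Calling it with pool = [0, 1, 2, 3, 4, 5] and r = 4 gives the 15 combinations in non-decreasing order of index sum.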

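I also had a quick look at combinatorial Gray codes, but I couldn't find one suitable for this problem.

Do you have an idea of how to implement something like this?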

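EDIT:

This problem has an alternative (unfortunately not easier) formulation.
Given a set S and a number r, all the possible sums are trivial to find, since they are simply all the numbers from the index sum of the first r elements of S to the index sum of the last r elements of S.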
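As a small check of this claim, here is a Python sketch (the function names are illustrative, not from the question). The smallest index sum is 0 + 1 + ... + (r - 1) and the largest is (n - r) + ... + (n - 1). Every value in between is reachable because any combination other than the last one has some index that can be moved up by one, which raises the sum by exactly one.

```python
from itertools import combinations


def index_sum_range(n, r):
    """Smallest and largest index sum of an r-combination of range(n)."""
    min_sum = r * (r - 1) // 2          # indexes 0 .. r-1
    max_sum = r * (2 * n - r - 1) // 2  # indexes n-r .. n-1
    return min_sum, max_sum


def achievable_index_sums(n, r):
    """Brute-force list of the distinct index sums (small n only)."""
    return sorted({sum(c) for c in combinations(range(n), r)})
```

For the example set (n = 6, r = 4) the range is 6 to 14, matching the listing above.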

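That being said, if for each sum T we could efficiently¹ find all the combinations having sum T, we would solve the original problem, since we could simply generate them in ascending order of T.

¹ "Efficiently" means that I don't want to generate all the combinations and discard the ones having a different sum.

EDIT 2:

Following @EricLippert's suggestion, I created the following code: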

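    public static IEnumerable<T[]> GetCombinationsSortedByIndexSum<T>(this IList<T> pool, int r)
    {
        int n = pool.Count;
        if (r > n)
            throw new ArgumentException("r cannot be greater than pool size");

        // Smallest index sum: 0 + 1 + ... + (r - 1).
        int minSum = ((r - 1) * r) / 2;
        // Largest index sum: (n - r) + ... + (n - 1).
        int maxSum = ((n - 1) * n) / 2 - ((n - r - 1) * (n - r)) / 2;

        for (int sum = minSum; sum <= maxSum; sum++)
        {
            foreach (var indexes in AllMonotIncrSubseqOfLenMWhichSumToN(0, n - 1, r, sum))
                yield return indexes.Select(x => pool[x]).ToArray();
        }
    }

    // Returns all the monotonically increasing subsequences of length m, taken
    // from the consecutive integers seqFirstElement..seqLastElement, that sum to n.
    static IEnumerable<IEnumerable<int>> AllMonotIncrSubseqOfLenMWhichSumToN(int seqFirstElement, int seqLastElement, int m, int n)
    {
        for (int i = seqFirstElement; i <= seqLastElement - m + 1; i++)
        {
            if (m == 1)
            {
                if (i == n)
                    yield return new int[] { i };
            }
            else
            {
                foreach (var el in AllMonotIncrSubseqOfLenMWhichSumToN(i + 1, seqLastElement, m - 1, n - i))
                    yield return new int[] { i }.Concat(el);
            }
        }
    }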

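This works fine (hopefully it is what Eric meant :P), but I'm still concerned about the complexity of the recursive method. In fact, it seems that we are regenerating all the combinations for each sum, not just those that add up to the desired value.

To reduce the complexity of the inner function, I found a way to limit the iterations by using valid upper and lower bounds (though now it's really hard to say what the complexity is).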
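The bounding idea can be sketched as follows, in Python for brevity. The names and the exact bounds are assumptions made for illustration and are not necessarily the ones used in the final C# code: the lower bound comes from the largest sum the remaining elements could still contribute, and the upper bound from the smallest sum that a run starting at the candidate would already have.

```python
def subsequences_with_sum(first, last, m, total):
    """Yield increasing m-tuples from first..last (inclusive) summing to total."""
    if m == 0:
        if total == 0:
            yield ()
        return
    # The other m-1 elements contribute at most (last-m+2) + ... + last.
    max_rest = (m - 1) * (2 * last - m + 2) // 2
    lo = max(first, total - max_rest)
    # A tuple led by i sums to at least i + (i+1) + ... + (i+m-1).
    hi = min(last - m + 1, (total - m * (m - 1) // 2) // m)
    for i in range(lo, hi + 1):
        for rest in subsequences_with_sum(i + 1, last, m - 1, total - i):
            yield (i,) + rest


def combinations_by_index_sum(pool, r):
    """Lazily yield r-combinations of pool in ascending order of index sum."""
    n = len(pool)
    min_sum = r * (r - 1) // 2
    max_sum = r * (2 * n - r - 1) // 2
    for total in range(min_sum, max_sum + 1):
        for idx in subsequences_with_sum(0, n - 1, r, total):
            yield tuple(pool[i] for i in idx)
```

Because both functions are generators, a caller can stop consuming as soon as the heuristic's stop criterion is met, without the remaining combinations ever being built.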

Check my answer to see the final code.

The solution I had in mind was:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        // Preconditions:
        // * items is a sequence of non-negative monotone increasing integers
        // * n is the number of items to be in the subsequence
        // * sum is the desired sum of that subsequence.
        // Result:
        // A sequence of subsequences of the original sequence where each
        // subsequence has n items and the given sum.
        static IEnumerable<IEnumerable<int>> M(IEnumerable<int> items, int sum, int n)
        {
            // Let's start by taking some easy outs. If the sum is negative
            // then there is no solution. If the number of items in the
            // subsequence is negative then there is no solution.
            if (sum < 0 || n < 0)
                yield break;

            // If the number of items in the subsequence is zero then
            // the only possible solution is if the sum is zero.
            if (n == 0)
            {
                if (sum == 0)
                    yield return Enumerable.Empty<int>();
                yield break;
            }

            // If the number of items is less than the required number of
            // items, there is no solution.
            if (items.Count() < n)
                yield break;

            // We have at least n items in the sequence, and
            // n is greater than zero, so First() is valid:
            int first = items.First();

            // We need n items from a monotone increasing subsequence
            // that have a particular sum. We might already be too
            // large to meet that requirement:
            if (n * first > sum)
                yield break;

            // There might be some solutions that involve the first element.
            // Find them all.
            foreach (var subsequence in M(items.Skip(1), sum - first, n - 1))
                yield return new[] { first }.Concat(subsequence);

            // And there might be some solutions that do not involve the first element.
            // Find them all.
            foreach (var subsequence in M(items.Skip(1), sum, n))
                yield return subsequence;
        }

        static void Main()
        {
            int[] x = { 0, 1, 2, 3, 4, 5 };
            for (int i = 0; i <= 15; ++i)
                foreach (var seq in M(x, i, 4))
                    Console.WriteLine("({0}) SUM {1}", string.Join(",", seq), i);
        }
    }

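The output is your desired output.

I've made no attempt to optimize this. It would be interesting to profile it and see where most of the time is spent.

UPDATE: Just for fun, I wrote a version that uses an immutable stack instead of an arbitrary enumerable. Enjoy!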

    using System;
    using System.Collections.Generic;
    using System.Linq;

    abstract class ImmutableList<T> : IEnumerable<T>
    {
        public static readonly ImmutableList<T> Empty = new EmptyList();
        private ImmutableList() {}
        public abstract bool IsEmpty { get; }
        public abstract T Head { get; }
        public abstract ImmutableList<T> Tail { get; }
        public ImmutableList<T> Push(T newHead)
        {
            return new List(newHead, this);
        }

        private sealed class EmptyList : ImmutableList<T>
        {
            public override bool IsEmpty { get { return true; } }
            public override T Head { get { throw new InvalidOperationException(); } }
            public override ImmutableList<T> Tail { get { throw new InvalidOperationException(); } }
        }

        private sealed class List : ImmutableList<T>
        {
            private readonly T head;
            private readonly ImmutableList<T> tail;
            public override bool IsEmpty { get { return false; } }
            public override T Head { get { return head; } }
            public override ImmutableList<T> Tail { get { return tail; } }
            public List(T head, ImmutableList<T> tail)
            {
                this.head = head;
                this.tail = tail;
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        public IEnumerator<T> GetEnumerator()
        {
            for (ImmutableList<T> current = this; !current.IsEmpty; current = current.Tail)
                yield return current.Head;
        }
    }

    class Program
    {
        // Preconditions:
        // * items is a sequence of non-negative monotone increasing integers
        // * n is the number of items to be in the subsequence
        // * sum is the desired sum of that subsequence.
        // Result:
        // A sequence of subsequences of the original sequence where each
        // subsequence has n items and the given sum.
        static IEnumerable<ImmutableList<int>> M(ImmutableList<int> items, int sum, int n)
        {
            // Let's start by taking some easy outs. If the sum is negative
            // then there is no solution. If the number of items in the
            // subsequence is negative then there is no solution.
            if (sum < 0 || n < 0)
                yield break;

            // If the number of items in the subsequence is zero then
            // the only possible solution is if the sum is zero.
            if (n == 0)
            {
                if (sum == 0)
                    yield return ImmutableList<int>.Empty;
                yield break;
            }

            // If the number of items is less than the required number of
            // items, there is no solution.
            if (items.Count() < n)
                yield break;

            // We have at least n items in the sequence, and
            // n is greater than zero.
            int first = items.Head;

            // We need n items from a monotone increasing subsequence
            // that have a particular sum. We might already be too
            // large to meet that requirement:
            if (n * first > sum)
                yield break;

            // There might be some solutions that involve the first element.
            // Find them all.
            foreach (var subsequence in M(items.Tail, sum - first, n - 1))
                yield return subsequence.Push(first);

            // And there might be some solutions that do not involve the first element.
            // Find them all.
            foreach (var subsequence in M(items.Tail, sum, n))
                yield return subsequence;
        }

        static void Main()
        {
            ImmutableList<int> x = ImmutableList<int>.Empty.Push(5).
                Push(4).Push(3).Push(2).Push(1).Push(0);
            for (int i = 0; i <= 15; ++i)
                foreach (var seq in M(x, i, 4))
                    Console.WriteLine("({0}) SUM {1}", string.Join(",", seq), i);
        }
    }

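If the worst case you are looking at is 35 choose 10, that yields 183,579,396 unique combinations according to this binomial coefficient calculator, which is the best free one I have found on the net so far. Most modern CPUs should be able to get through this in a second or two at most, depending on the language and not counting the time for sorting. With C++, it will probably take under a second. If you go the C++ route, you would probably want to build it as a DLL and call it via platform invoke (P/Invoke). There are also some sorting algorithms that perform better on lists that are already mostly sorted, which looks like the case here.

If under a second is still too slow, you might consider precomputing all the N choose K cases that you need and writing them out to a file (after sorting them by the sum of the k-indexes), then reading the file in when the program starts. Depending on the application and where it is hosted, this may not be very practical, for example on a Windows CE platform with limited memory. But for a PC or another system with lots of hard drive space, it should not be an issue.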

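To answer your question about what I mean by "specifying the index into the set":

I have written a C# class that can take an index into the sorted binomial coefficient table and return the corresponding k-indexes for that index without having to iterate through all the combinations that precede it. There is another method that does the opposite and returns the corresponding index (or rank) for the given k-indexes. The rank starts at zero, which in the above example specifies the k-indexes 0, 1, 2, 3. Rank 1 is for the k-indexes 0, 1, 2, 4, and so on. So, for example, in the 35 choose 10 case, if you know you need all the k-indexes above 150,000,000, you do not need to iterate through the first 150M to get the values after that. You can call the class method and pass in 150000000 as the index, and it will return the k-indexes for that index. The methods are highly optimized and are based on a mathematical relationship that can be seen in Pascal's triangle.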
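To make the rank / k-indexes translation concrete, here is a minimal Python sketch of ranking and unranking in lexicographic order. It is not the class described above: the function names are invented for illustration, and it assumes the lexicographic ordering implied by "rank 0 is 0, 1, 2, 3 and rank 1 is 0, 1, 2, 4". It relies on the fact that C(n - c - 1, k - 1) combinations start with the index c, so whole blocks can be skipped without enumerating them.

```python
from math import comb


def unrank_combination(rank, n, k):
    """Return the k-combination of range(n) at a 0-based lexicographic rank."""
    if not 0 <= rank < comb(n, k):
        raise ValueError("rank out of range")
    indexes = []
    candidate = 0
    for remaining in range(k, 0, -1):
        while True:
            # Number of combinations whose next index is exactly `candidate`.
            block = comb(n - candidate - 1, remaining - 1)
            if rank < block:
                break
            rank -= block      # skip that whole block at once
            candidate += 1
        indexes.append(candidate)
        candidate += 1
    return indexes


def rank_combination(indexes, n):
    """Inverse of unrank_combination for a sorted list of indexes."""
    k = len(indexes)
    rank = 0
    previous = -1
    for position, index in enumerate(indexes):
        # Count the blocks led by each smaller value that was skipped.
        for skipped in range(previous + 1, index):
            rank += comb(n - skipped - 1, k - position - 1)
        previous = index
    return rank
```

With these, unrank_combination(150_000_000, 35, 10) jumps straight to that position in the 35 choose 10 table, and rank_combination maps the result back to 150,000,000.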

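The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that, when true, creates a generic list to hold the objects to be managed. If this value is false, the table is not created. The table does not need to be created in order to use the translation methods described above. Accessor methods are provided to access the table.

There is an associated test class that shows how to use the class and its methods. It has been extensively tested with at least 2 cases, and there are no known bugs.

To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

The following tested code will iterate through each unique combination: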

    // Requires: using System; using System.Text;
    // and the BinCoeff<T> class from the article linked above.
    public void Test10Choose5()
    {
        String S;
        int Loop;
        int N = 10;  // Total number of elements in the set.
        int K = 5;   // Total number of elements in each group.

        // Create the bin coeff object required to get all
        // the combos for this N choose K combination.
        BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
        int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);

        // The KIndexes array specifies the indexes for a lexicographic element.
        int[] KIndexes = new int[K];
        StringBuilder SB = new StringBuilder();

        // Loop through all the combinations for this N choose K case.
        for (int Combo = 0; Combo < NumCombos; Combo++)
        {
            // Get the k-indexes for this combination.
            BC.GetKIndexes(Combo, KIndexes);

            // Verify that the KIndexes returned can be used to retrieve the
            // rank or lexicographic order of the KIndexes in the table.
            int Val = BC.GetIndex(true, KIndexes);
            if (Val != Combo)
            {
                S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
                Console.WriteLine(S);
            }

            // Print the k-indexes separated by spaces.
            SB.Remove(0, SB.Length);
            for (Loop = 0; Loop < K; Loop++)
            {
                SB.Append(KIndexes[Loop].ToString());
                if (Loop < K - 1)
                    SB.Append(" ");
            }
            S = "KIndexes = " + SB.ToString();
            Console.WriteLine(S);
        }
    }

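Make sure to use the version of GetBinCoeff in the class that implements Mark Dominus's method of calculating the number of combinations. It uses long values, and the code is much less likely to overflow.

For completeness and clarity, I'll post my final code: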

    // Given a pool of elements, returns all the combinations of
    // length r in pool, ordered (ascending) by the sum of
    // the indexes of their elements.
    // e.g. pool = {A,B,C,D,E}, r = 3
    // returns
    // (A, B, C) indexes: (0, 1, 2) sum: 3
    // (A, B, D) indexes: (0, 1, 3) sum: 4
    // (A, B, E) indexes: (0, 1, 4) sum: 5
    // (A, C, D) indexes: (0, 2, 3) sum: 5
    // (A, C, E) indexes: (0, 2, 4) sum: 6
    // (B, C, D) indexes: (1, 2, 3) sum: 6
    // (A, D, E) indexes: (0, 3, 4) sum: 7
    // (B, C, E) indexes: (1, 2, 4) sum: 7
    // (B, D, E) indexes: (1, 3, 4) sum: 8
    // (C, D, E) indexes: (2, 3, 4) sum: 9
    public static IEnumerable<T[]> GetCombinationsSortedByIndexSum<T>(this IList<T> pool, int r)
    {
        int n = pool.Count;
        if (r > n)
            throw new ArgumentException("r cannot be greater than pool size");

        int minSum = F(r - 1);                 // 0 + 1 + ... + (r - 1)
        int maxSum = F(n - 1) - F(n - r - 1);  // (n - r) + ... + (n - 1)

        for (int sum = minSum; sum <= maxSum; sum++)
        {
            foreach (var indexes in AllSubSequencesWithGivenSum(0, n - 1, r, sum))
                yield return indexes.Select(x => pool[x]).ToArray();
        }
    }

    // Given the first and the last element of a sequence of consecutive integers,
    // returns all the monotonically increasing subsequences of length "m" having sum "sum"
    // e.g. seqFirstElement = 1, seqLastElement = 5, m = 3, sum = 8
    // returns {1,2,5} and {1,3,4}
    static IEnumerable<IEnumerable<int>> AllSubSequencesWithGivenSum(int seqFirstElement, int seqLastElement, int m, int sum)
    {
        // Lower bound: the other m - 1 elements can contribute at most
        // the sum of the last m - 1 values of the sequence.
        int lb = sum - F(seqLastElement) + F(seqLastElement - m + 1);
        // Upper bound: the other m - 1 elements contribute at least
        // the sum of the m - 1 values following seqFirstElement.
        int ub = sum - F(seqFirstElement + m - 1) + F(seqFirstElement);

        lb = Math.Max(seqFirstElement, lb);
        ub = Math.Min(seqLastElement - m + 1, ub);

        for (int i = lb; i <= ub; i++)
        {
            if (m == 1)
            {
                if (i == sum) // this check shouldn't be necessary anymore since LB/UB should automatically exclude wrong solutions
                    yield return new int[] { i };
            }
            else
            {
                foreach (var el in AllSubSequencesWithGivenSum(i + 1, seqLastElement, m - 1, sum - i))
                    yield return new int[] { i }.Concat(el);
            }
        }
    }

    // Formula to compute the sum of the numbers from 0 to n
    // e.g. F(4) = 0 + 1 + 2 + 3 + 4 = 10
    static int F(int n)
    {
        return (n * (n + 1)) / 2;
    }
