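The right way to use Parallel.For to compute sums over an array

I want the sum of x and the sum of x * x, where x = line[i]. Because multiple threads want to read and write "sumAll" and "sumAllQ", I need to lock access to them. The problem is that the lock more or less serializes everything here. I need to split this operation into "Environment.ProcessorCount" loops, each of which sums one part of the array, and then add their results together at the end. But how do I do that programmatically?

Sample code: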

    // line is a float[]
    Parallel.For(0, line.Length,
        new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
        i =>
        {
            // Declared inside the lambda so that threads do not share x.
            double x = line[i];
            lock (sumLocker)
            {
                sumAll += x;
                sumAllQ += x * x;
            }
        });

Edit 1: Benchmark results for Matthew Watson's answer

At home. CPU Core 2 Quad Q9550 @ 2.83 GHz:

    Result via Linq: SumAll=49999950000, SumAllQ=3,33332833333439E+15
    Result via loop: SumAll=49999950000, SumAllQ=3,33332833333439E+15
    Result via partition: SumAll=49999950000, SumAllQ=3,333328333335E+15
    Via Linq took: 00:00:02.6983044
    Via Loop took: 00:00:00.4811901
    Via Partition took: 00:00:00.1595113

At work. CPU i7 930 2.8 GHz:

    Result via Linq: SumAll=49999950000, SumAllQ=3,33332833333439E+15
    Result via loop: SumAll=49999950000, SumAllQ=3,33332833333439E+15
    Result via partition: SumAll=49999950000, SumAllQ=3,333328333335E+15
    Via Linq took: 00:00:01.5728736
    Via Loop took: 00:00:00.3436929
    Via Partition took: 00:00:00.0934209

As suggested in the comments, you can do this in LINQ by using Aggregate together with AsParallel. For example:

    using System.Linq;

    // A class to hold the results.
    // This can be improved by making it immutable and using a constructor.
    public class Result
    {
        public double SumAll { get; set; }
        public double SumAllQ { get; set; }
    }

Then you can use LINQ like this:

    var result = line.AsParallel().Aggregate(
        new Result(),
        (input, value) => new Result
        {
            SumAll = input.SumAll + value,
            SumAllQ = input.SumAllQ + value * value
        });

Or even better:

    var pline = line.AsParallel().WithDegreeOfParallelism(Environment.ProcessorCount);

    var result = new Result
    {
        SumAll = pline.Sum(),
        SumAllQ = pline.Sum(x => x * x)
    };

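AsParallel doesn't let you specify options directly, but you can chain .WithDegreeOfParallelism(), .WithExecutionMode(), and .WithMergeOptions() to get more control. You may have to use WithDegreeOfParallelism to get it to run on multiple threads.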
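To make the chaining of those options concrete, here is a minimal sketch (not from the original answer and not benchmarked here). It assumes a `double[]` input and uses the PLINQ `Aggregate` overload that takes a seed factory, an update function, a combine function, and a result selector, so that each partition keeps its own accumulator. The `Sums` struct and the `PlinqAggregateSketch` class are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Linq;

// Hypothetical immutable accumulator holding both running totals.
public readonly struct Sums
{
    public Sums(double sumAll, double sumAllQ) { SumAll = sumAll; SumAllQ = sumAllQ; }
    public double SumAll { get; }
    public double SumAllQ { get; }
}

public static class PlinqAggregateSketch
{
    public static Sums SumAndSumOfSquares(double[] numbers)
    {
        return numbers
            .AsParallel()
            .WithDegreeOfParallelism(Environment.ProcessorCount)
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Aggregate(
                // seedFactory: a fresh accumulator for each partition
                () => new Sums(0.0, 0.0),
                // updateAccumulatorFunc: fold one element into a partition's accumulator
                (acc, x) => new Sums(acc.SumAll + x, acc.SumAllQ + x * x),
                // combineAccumulatorsFunc: merge the accumulators of two partitions
                (a, b) => new Sums(a.SumAll + b.SumAll, a.SumAllQ + b.SumAllQ),
                // resultSelector: return the merged accumulator unchanged
                acc => acc);
    }

    public static void Main()
    {
        var numbers = Enumerable.Range(0, 1000).Select(n => n / 10.0).ToArray();
        var result = SumAndSumOfSquares(numbers);
        Console.WriteLine("SumAll={0}, SumAllQ={1}", result.SumAll, result.SumAllQ);
    }
}
```

Whether this is faster than a plain loop is something you would have to measure on your own hardware.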

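vcjones wondered whether you would really see any speedup. The answer is: it probably depends on how many cores you have. PLINQ is slower than a plain loop on my home PC (which is quad core).

I came up with an alternative approach that uses a Partitioner to split the list of numbers into several sections, so that you can add up each section separately. More information about using a partitioner is also available here.

The Partitioner approach seems to be somewhat faster, at least on my home PC.

Here's my test program. Note that you must run a release build of it outside any debugger to get correct timings.

The important method in this code is ViaPartition():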

    Result ViaPartition(double[] numbers)
    {
        var result = new Result();

        var rangePartitioner = Partitioner.Create(0, numbers.Length);

        Parallel.ForEach(rangePartitioner, (range, loopState) =>
        {
            // Sum this range into a private subtotal; no locking needed here.
            var subtotal = new Result();

            for (int i = range.Item1; i < range.Item2; i++)
            {
                double n = numbers[i];
                subtotal.SumAll += n;
                subtotal.SumAllQ += n*n;
            }

            // Lock only once per range, to merge the subtotal into the total.
            lock (result)
            {
                result.SumAll += subtotal.SumAll;
                result.SumAllQ += subtotal.SumAllQ;
            }
        });

        return result;
    }
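A closely related variation, which is not part of the original answer or of the timings reported here, is the `Parallel.For` overload with thread-local state. It keeps the question's per-index loop but accumulates into a private subtotal per task and takes the lock only once per task, in `localFinally`. The sketch below reuses the question's names (`line`, `sumAll`, `sumAllQ`, `sumLocker`); the class and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadLocalSumSketch
{
    public static (double SumAll, double SumAllQ) SumAndSumOfSquares(float[] line)
    {
        double sumAll = 0.0, sumAllQ = 0.0;
        object sumLocker = new object();

        Parallel.For<(double Sum, double SumQ)>(
            0, line.Length,
            // localInit: runs once per task and creates that task's private subtotal
            () => (0.0, 0.0),
            // body: touches only the private subtotal, so no lock is needed here
            (i, loopState, local) =>
            {
                double x = line[i];
                return (local.Sum + x, local.SumQ + x * x);
            },
            // localFinally: runs once per task; the only place that takes the lock
            local =>
            {
                lock (sumLocker)
                {
                    sumAll += local.Sum;
                    sumAllQ += local.SumQ;
                }
            });

        return (sumAll, sumAllQ);
    }

    public static void Main()
    {
        var line = new float[] { 1f, 2f, 3f, 4f };
        var result = SumAndSumOfSquares(line);
        Console.WriteLine("SumAll={0}, SumAllQ={1}", result.SumAll, result.SumAllQ);
    }
}
```

Compared with ViaPartition(), this still invokes a delegate for every element, so it may well be slower than the range-based version; treat it as an option to benchmark rather than a recommendation.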

My results when I run the full test program (listed below):

    Result via Linq: SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via loop: SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via partition: SumAll=49999950000, SumAllQ=3.333328333335E+15
    Via Linq took: 00:00:01.1994524
    Via Loop took: 00:00:00.2357107
    Via Partition took: 00:00:00.0756707

(Note the slight differences due to rounding errors.)

It would be interesting to see results from other systems.

Here's the full test program:
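The partitioned version adds the same numbers in a different order, and floating-point addition is not associative, which is why its last digits can differ. A tiny sketch of that effect, using three made-up values rather than the benchmark data (class and method names are hypothetical):

```csharp
using System;

public static class SummationOrderSketch
{
    // Adds the values front to back.
    public static double SumLeftToRight(double[] values)
    {
        double sum = 0.0;
        for (int i = 0; i < values.Length; i++) sum += values[i];
        return sum;
    }

    // Adds the same values back to front.
    public static double SumRightToLeft(double[] values)
    {
        double sum = 0.0;
        for (int i = values.Length - 1; i >= 0; i--) sum += values[i];
        return sum;
    }

    public static void Main()
    {
        var values = new[] { 0.1, 0.2, 0.3 };
        // "R" prints enough digits to distinguish the two results.
        Console.WriteLine(SumLeftToRight(values).ToString("R"));
        Console.WriteLine(SumRightToLeft(values).ToString("R"));
    }
}
```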

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;

    namespace Demo
    {
        public class Result
        {
            public double SumAll;
            public double SumAllQ;

            public override string ToString()
            {
                return string.Format("SumAll={0}, SumAllQ={1}", SumAll, SumAllQ);
            }
        }

        class Program
        {
            void run()
            {
                var numbers = Enumerable.Range(0, 1000000).Select(n => n/10.0).ToArray();

                // Prove that the calculation is correct.
                Console.WriteLine("Result via Linq: " + ViaLinq(numbers));
                Console.WriteLine("Result via loop: " + ViaLoop(numbers));
                Console.WriteLine("Result via partition: " + ViaPartition(numbers));

                int count = 100;

                TimeViaLinq(numbers, count);
                TimeViaLoop(numbers, count);
                TimeViaPartition(numbers, count);
            }

            void TimeViaLinq(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();

                for (int i = 0; i < count; ++i)
                    ViaLinq(numbers);

                Console.WriteLine("Via Linq took: " + sw.Elapsed);
            }

            void TimeViaLoop(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();

                for (int i = 0; i < count; ++i)
                    ViaLoop(numbers);

                Console.WriteLine("Via Loop took: " + sw.Elapsed);
            }

            void TimeViaPartition(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();

                for (int i = 0; i < count; ++i)
                    ViaPartition(numbers);

                Console.WriteLine("Via Partition took: " + sw.Elapsed);
            }

            Result ViaLinq(double[] numbers)
            {
                return numbers.AsParallel().Aggregate(new Result(), (input, value) => new Result
                {
                    SumAll = input.SumAll+value,
                    SumAllQ = input.SumAllQ+value*value
                });
            }

            Result ViaLoop(double[] numbers)
            {
                var result = new Result();

                for (int i = 0; i < numbers.Length; ++i)
                {
                    double n = numbers[i];
                    result.SumAll += n;
                    result.SumAllQ += n*n;
                }

                return result;
            }

            Result ViaPartition(double[] numbers)
            {
                var result = new Result();

                var rangePartitioner = Partitioner.Create(0, numbers.Length);

                Parallel.ForEach(rangePartitioner, (range, loopState) =>
                {
                    var subtotal = new Result();

                    for (int i = range.Item1; i < range.Item2; i++)
                    {
                        double n = numbers[i];
                        subtotal.SumAll += n;
                        subtotal.SumAllQ += n*n;
                    }

                    lock (result)
                    {
                        result.SumAll += subtotal.SumAll;
                        result.SumAllQ += subtotal.SumAllQ;
                    }
                });

                return result;
            }

            static void Main()
            {
                new Program().run();
            }
        }
    }