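Simple accord.net machine learning example

I'm new to machine learning and to accord.net (I code in C#).

I want to create a simple project where I look at a simple oscillating time series, then have accord.net learn it and predict what the next value will be.

This is what the data (time series) should look like: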

    X - Y
    1 - 1
    2 - 2
    3 - 3
    4 - 2
    5 - 1
    6 - 2
    7 - 3
    8 - 2
    9 - 1

Then I want it to predict the following:

    X  - Y
    10 - 2
    11 - 3
    12 - 2
    13 - 1
    14 - 2
    15 - 3

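Can you help me with some examples of how to solve this?

One simple approach is to use an Accord ID3 decision tree.

The trick is to work out which inputs to use. You can't just train on X, because the tree won't learn anything about future values of X from it. You can, however, build some features derived from X (or from previous values of Y) that will be useful.

Normally for a problem like this you would base each prediction on features derived from previous values of Y (the thing being predicted) rather than from X. That assumes you can observe Y sequentially between predictions (you can no longer predict for an arbitrary X), so I'll stick with the question as posed.

I had a go at building an Accord ID3 decision tree to solve this problem, shown below. I used a few different values of x % n as the features, hoping the tree could work it out from those. In fact, if I had added (x-1) % 4 as a feature, the tree could do it in a single level using just that attribute, but I guess the point is more to let the tree find the pattern.

Here is the code: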

    using System;
    using System.Linq;
    using Accord.MachineLearning.DecisionTrees;
    using Accord.MachineLearning.DecisionTrees.Learning;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [TestClass]
    public class AccordID3Tests // wrapper class added so the snippet compiles; the name is arbitrary
    {
        // this is the sequence y follows
        int[] ysequence = new int[] { 1, 2, 3, 2 };

        // this generates the correct Y for a given X
        int CalcY(int x) => ysequence[(x - 1) % 4];

        // this generates some inputs - just a few different mods of x
        int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };

        // for http://stackoverflow.com/questions/40573388/simple-accord-net-machine-learning-example
        [TestMethod]
        public void AccordID3TestStackOverFlowQuestion2()
        {
            // build the training data set
            int numtrainingcases = 12;
            int[][] inputs = new int[numtrainingcases][];
            int[] outputs = new int[numtrainingcases];

            Console.WriteLine("\t\t\t\tx \ty");
            for (int x = 1; x <= numtrainingcases; x++)
            {
                int y = CalcY(x);
                inputs[x - 1] = CalcInputs(x);
                outputs[x - 1] = y;
                Console.WriteLine("TrainingData \t " + x + "\t " + y);
            }

            // define how many values each input can have
            DecisionVariable[] attributes =
            {
                new DecisionVariable("Mod2", 2),
                new DecisionVariable("Mod3", 3),
                new DecisionVariable("Mod4", 4),
                new DecisionVariable("Mod5", 5),
                new DecisionVariable("Mod6", 6)
            };

            // define how many outputs (+1 only because y doesn't use zero)
            int classCount = outputs.Max() + 1;

            // create the tree
            DecisionTree tree = new DecisionTree(attributes, classCount);

            // create a new instance of the ID3 algorithm
            ID3Learning id3learning = new ID3Learning(tree);

            // learn the training instances; this populates the tree
            id3learning.Learn(inputs, outputs);

            Console.WriteLine();

            // now try to predict some cases that weren't in the training data
            for (int x = numtrainingcases + 1; x <= 2 * numtrainingcases; x++)
            {
                int[] query = CalcInputs(x);
                int answer = tree.Decide(query); // makes the prediction
                Assert.AreEqual(CalcY(x), answer); // check the tree got it right
                Console.WriteLine("Prediction \t\t " + x + "\t " + answer);
            }
        }
    }

This is the output it produces:

                    x     y
    TrainingData    1     1
    TrainingData    2     2
    TrainingData    3     3
    TrainingData    4     2
    TrainingData    5     1
    TrainingData    6     2
    TrainingData    7     3
    TrainingData    8     2
    TrainingData    9     1
    TrainingData    10    2
    TrainingData    11    3
    TrainingData    12    2

    Prediction      13    1
    Prediction      14    2
    Prediction      15    3
    Prediction      16    2
    Prediction      17    1
    Prediction      18    2
    Prediction      19    3
    Prediction      20    2
    Prediction      21    1
    Prediction      22    2
    Prediction      23    3
    Prediction      24    2

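Hope that helps.

Edit: Following the comments, the example below is modified to train on previous values of the target (Y) rather than on features derived from the time index (X). This means you can't start training at the very beginning of the series, because you need a back history of previous Y values. In this example I start at x = 9, simply because that keeps the same sequence.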

    using System;
    using System.Linq;
    using Accord.MachineLearning.DecisionTrees;
    using Accord.MachineLearning.DecisionTrees.Learning;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [TestClass]
    public class AccordID3LagTests // wrapper class added so the snippet compiles; the name is arbitrary
    {
        // this is the sequence y follows
        int[] ysequence = new int[] { 1, 2, 3, 2 };

        // this generates the correct Y for a given X
        int CalcY(int x) => ysequence[(x - 1) % 4];

        // this generates the inputs - the five previous values of y
        int[] CalcInputs(int x) => new int[] { CalcY(x - 1), CalcY(x - 2), CalcY(x - 3), CalcY(x - 4), CalcY(x - 5) };
        // previous version, based on x:
        //int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };

        // for http://stackoverflow.com/questions/40573388/simple-accord-net-machine-learning-example
        [TestMethod]
        public void AccordID3TestTestStackOverFlowQuestion2()
        {
            // build the training data set
            int numtrainingcases = 12;
            int starttrainingat = 9;
            int[][] inputs = new int[numtrainingcases][];
            int[] outputs = new int[numtrainingcases];

            Console.WriteLine("\t\t\t\tx \ty");
            for (int x = starttrainingat; x < numtrainingcases + starttrainingat; x++)
            {
                int y = CalcY(x);
                inputs[x - starttrainingat] = CalcInputs(x);
                outputs[x - starttrainingat] = y;
                Console.WriteLine("TrainingData \t " + x + "\t " + y);
            }

            // define how many values each input can have
            DecisionVariable[] attributes =
            {
                new DecisionVariable("y-1", 4),
                new DecisionVariable("y-2", 4),
                new DecisionVariable("y-3", 4),
                new DecisionVariable("y-4", 4),
                new DecisionVariable("y-5", 4)
            };

            // define how many outputs (+1 only because y doesn't use zero)
            int classCount = outputs.Max() + 1;

            // create the tree
            DecisionTree tree = new DecisionTree(attributes, classCount);

            // create a new instance of the ID3 algorithm
            ID3Learning id3learning = new ID3Learning(tree);

            // learn the training instances; this populates the tree
            id3learning.Learn(inputs, outputs);

            Console.WriteLine();

            // now try to predict some cases that weren't in the training data
            for (int x = starttrainingat + numtrainingcases; x <= starttrainingat + 2 * numtrainingcases; x++)
            {
                int[] query = CalcInputs(x);
                int answer = tree.Decide(query); // makes the prediction
                Assert.AreEqual(CalcY(x), answer); // check the tree got it right
                Console.WriteLine("Prediction \t\t " + x + "\t " + answer);
            }
        }
    }
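
Note that the test above builds each query from CalcY, i.e. from the true values of Y. If you want to forecast several steps ahead without observing Y in between, one option is to feed each prediction back in as an input for the next one. Below is a minimal sketch of that idea. `RollingForecast` and `Forecast` are hypothetical names, not part of Accord.NET, and `decide` stands in for any trained classifier (for example `q => tree.Decide(q)`).

```csharp
using System;
using System.Collections.Generic;

static class RollingForecast
{
    // Hypothetical helper for illustration; not part of Accord.NET.
    // history: observed values of y, oldest first (at least `lags` entries).
    // decide:  maps { y[t-1], y[t-2], ..., y[t-lags] } to a prediction for y[t].
    public static int[] Forecast(IReadOnlyList<int> history, int lags, int steps, Func<int[], int> decide)
    {
        if (history.Count < lags)
            throw new ArgumentException("Need at least `lags` observed values.", nameof(history));

        var values = new List<int>(history);
        var forecast = new int[steps];

        for (int s = 0; s < steps; s++)
        {
            // build the query from the most recent values, newest first
            var query = new int[lags];
            for (int k = 0; k < lags; k++)
                query[k] = values[values.Count - 1 - k];

            forecast[s] = decide(query);

            // treat the prediction as if it had been observed
            values.Add(forecast[s]);
        }

        return forecast;
    }
}
```

With the lag-based tree from the edit, a call might look like `RollingForecast.Forecast(observed, 5, 6, q => tree.Decide(q))`. Bear in mind that any early mistake is carried forward into every later query, so errors can compound over long horizons.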

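You could also consider training on the differences between previous values of Y, which would work better where the absolute value of Y is not as important as the relative change.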
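
As a rough sketch of the difference-based idea, the inputs could be built as follows. Everything here is an assumption for illustration: `DifferenceFeatures`, `Encode`, `Decode`, `Inputs`, `Target` and the `maxStep` parameter are hypothetical names, not Accord.NET API. Differences can be negative, while the ID3 example above uses non-negative integer symbols, so each difference is shifted by `maxStep` (the largest step you expect in either direction).

```csharp
using System;

static class DifferenceFeatures
{
    // Hypothetical helpers for illustration; none of this is Accord.NET API.

    // Maps a difference in -maxStep..+maxStep to a symbol in 0..2*maxStep.
    public static int Encode(int diff, int maxStep) => diff + maxStep;

    // Inverse of Encode.
    public static int Decode(int symbol, int maxStep) => symbol - maxStep;

    // Inputs for predicting series[t] (0-based index): the `lags` most recent
    // differences before t, newest first. Requires t >= lags + 1.
    public static int[] Inputs(int[] series, int t, int lags, int maxStep)
    {
        var inputs = new int[lags];
        for (int k = 0; k < lags; k++)
            inputs[k] = Encode(series[t - 1 - k] - series[t - 2 - k], maxStep);
        return inputs;
    }

    // Training target for index t: the encoded step from series[t-1] to series[t].
    public static int Target(int[] series, int t, int maxStep) =>
        Encode(series[t] - series[t - 1], maxStep);
}
```

Under these assumptions each DecisionVariable, and the class count, would be declared with 2 * maxStep + 1 symbols, and a prediction for series[t] would be recovered as series[t - 1] + Decode(tree.Decide(inputs), maxStep).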
