Comparing two Excel files for differences

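I want to compare two input CSV files to see whether any rows have been added or removed. What is the best way to approach this? I am not using column names, because the column names are not consistent across files.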

    private void compare_btn_Click(object sender, EventArgs e)
    {
        string firstFile = firstExcel_txt.Text;
        var results = ReadExcel(openFileDialog1);

        string secondFile = secondExcel_txt.Text;
        var results2 = ReadExcel(openFileDialog2);
    }

Reading:

    public object ReadExcel(OpenFileDialog openFileDialog)
    {
        var _excelFile = new ExcelQueryFactory(openFileDialog.FileName);
        var _info = from c in _excelFile.WorksheetNoHeader()
                    select c;

        string header1, header2, header3;
        foreach (var item in _info)
        {
            header1 = item.ElementAt(0);
            header2 = item.ElementAt(1);
            header3 = item.ElementAt(2);
        }

        return _info;
    }

Any help on how I can do this would be great.

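I suggest you compute a hash for each row of the Excel file. You can then compare each row's hash against the hashes of the other file to see whether it matches any of them (see the comments in the source code).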

I have also included some classes to store the contents of the Excel file.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Security.Cryptography;
    using System.Windows.Forms;
    using LinqToExcel;

    private void compare_btn_Click(object sender, EventArgs e)
    {
        string firstFile = firstExcel_txt.Text;
        ExcelInfo file1 = ReadExcel(openFileDialog1);

        string secondFile = secondExcel_txt.Text;
        ExcelInfo file2 = ReadExcel(openFileDialog2);

        CompareExcels(file1, file2);
    }

    public void CompareExcels(ExcelInfo fileA, ExcelInfo fileB)
    {
        foreach (ExcelRow rowA in fileA.excelRows)
        {
            // If the hash of a row of fileA does not exist in fileB, the row was removed
            if (!fileB.ContainsHash(rowA.hash))
            {
                Console.WriteLine("Row removed: " + rowA.ToString());
            }
        }

        foreach (ExcelRow rowB in fileB.excelRows)
        {
            // If the hash of a row of fileB does not exist in fileA, the row was added
            if (!fileA.ContainsHash(rowB.hash))
            {
                Console.WriteLine("Row added: " + rowB.ToString());
            }
        }
    }

    public class ExcelRow
    {
        public List<string> lstCells;
        public byte[] hash;

        public ExcelRow()
        {
            lstCells = new List<string>();
        }

        // Returns the cells of the row as a comma-separated string
        public override string ToString()
        {
            return string.Join(",", lstCells);
        }

        public void CalculateHash()
        {
            byte[] rowBytes;
            byte[] cellBytes;
            int pos;
            int numRowBytes;

            // Determine how many bytes are required to store a single Excel row
            numRowBytes = 0;
            foreach (string cellText in lstCells)
            {
                numRowBytes += NumBytes(cellText);
            }

            // Allocate space to calculate the hash of a single row
            rowBytes = new byte[numRowBytes];
            pos = 0;

            // Concatenate the text of each cell, converted to bytes, into a single byte array
            foreach (string cellText in lstCells)
            {
                cellBytes = GetBytes(cellText);
                System.Buffer.BlockCopy(cellBytes, 0, rowBytes, pos, cellBytes.Length);
                // Advance the write position (the original assigned instead of adding,
                // which overwrote earlier cells from the third cell onward)
                pos += cellBytes.Length;
            }

            using (var md5 = new MD5CryptoServiceProvider())
            {
                hash = md5.ComputeHash(rowBytes);
            }
        }

        static int NumBytes(string str)
        {
            return str.Length * sizeof(char);
        }

        static byte[] GetBytes(string str)
        {
            byte[] bytes = new byte[NumBytes(str)];
            System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
            return bytes;
        }
    }

    public class ExcelInfo
    {
        public List<ExcelRow> excelRows;

        public ExcelInfo()
        {
            excelRows = new List<ExcelRow>();
        }

        // Returns true if any row of this file has the given hash
        public bool ContainsHash(byte[] hashToLook)
        {
            bool found = false;
            foreach (ExcelRow eRow in excelRows)
            {
                found = EqualHash(eRow.hash, hashToLook);
                if (found)
                {
                    break;
                }
            }
            return found;
        }

        // Byte-by-byte comparison of two hashes
        public static bool EqualHash(byte[] hashA, byte[] hashB)
        {
            bool bEqual = false;
            if (hashA.Length == hashB.Length)
            {
                int i = 0;
                while ((i < hashA.Length) && (hashA[i] == hashB[i]))
                {
                    i++;
                }
                if (i == hashA.Length)
                {
                    bEqual = true;
                }
            }
            return bEqual;
        }
    }

    public ExcelInfo ReadExcel(OpenFileDialog openFileDialog)
    {
        var _excelFile = new ExcelQueryFactory(openFileDialog.FileName);
        var _info = from c in _excelFile.WorksheetNoHeader()
                    select c;

        ExcelRow excelRow;
        ExcelInfo resp = new ExcelInfo();

        foreach (var item in _info)
        {
            excelRow = new ExcelRow();

            // Add all the cells (with a for each)
            excelRow.lstCells.Add(item.ElementAt(0));
            excelRow.lstCells.Add(item.ElementAt(1));
            // ....
            // Add the last cell of the row (N stands for the index of the last column)
            excelRow.lstCells.Add(item.ElementAt(N));

            // Calculate the hash of the row
            excelRow.CalculateHash();

            // Add the row to the ExcelInfo object
            resp.excelRows.Add(excelRow);
        }

        return resp;
    }
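As a possible variation on the row-hash idea above: `ContainsHash` scans every row of the other file for each lookup, which may get slow for large sheets. The sketch below stores the hashes in a `HashSet<string>` instead. It is only an illustration under a few assumptions: `RowDiff`, `HashRow` and `RowsMissingFrom` are hypothetical names, rows are assumed to be already loaded as `string[]`, SHA-256 is swapped in for MD5, and each cell is prefixed with its length so that the rows ("ab", "c") and ("a", "bc") do not hash to the same value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class RowDiff
{
    // Hypothetical helper: hashes one row and returns the hash as a Base64 string.
    // Each cell is prefixed with its length so cell boundaries affect the hash.
    public static string HashRow(IEnumerable<string> cells)
    {
        var sb = new StringBuilder();
        foreach (string cell in cells)
        {
            string text = cell ?? string.Empty;
            sb.Append(text.Length).Append(':').Append(text);
        }

        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
            return Convert.ToBase64String(hash);
        }
    }

    // Returns the rows of 'source' whose hash does not occur anywhere in 'other'.
    public static List<string[]> RowsMissingFrom(
        IEnumerable<string[]> source, IEnumerable<string[]> other)
    {
        var otherHashes = new HashSet<string>(other.Select(r => HashRow(r)));
        return source.Where(r => !otherHashes.Contains(HashRow(r))).ToList();
    }
}
```

With rows from the first file in `rowsA` and rows from the second in `rowsB`, `RowDiff.RowsMissingFrom(rowsA, rowsB)` would give the removed rows and `RowDiff.RowsMissingFrom(rowsB, rowsA)` the added ones. Like the original approach, this treats a changed row as one removal plus one addition.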

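The most accurate way is to convert both files to byte arrays and check whether the two arrays differ. The following link has a simple example of how to convert an Excel worksheet to a byte array.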

Convert Excel to byte[]

Now that you have converted both Excel sheets to byte[], you can check whether they differ by testing the byte arrays for equality.

This can be done with LINQ in several ways, for example:

    using System.Linq; // SequenceEqual

    byte[] FirstExcelFileBytes = null;
    byte[] SecondExcelFileBytes = null;

    FirstExcelFileBytes = GetFirstExcelFile();
    SecondExcelFileBytes = GetSecondExcelFile();

    if (FirstExcelFileBytes.SequenceEqual(SecondExcelFileBytes) == true)
    {
        MessageBox.Show("Arrays are equal");
    }
    else
    {
        MessageBox.Show("Arrays don't match");
    }
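The `GetFirstExcelFile` and `GetSecondExcelFile` helpers are not shown in the answer. One way they might be filled in, assuming the goal is simply the raw bytes of each file on disk, is `File.ReadAllBytes`. The class and method names below are hypothetical.

```csharp
using System.IO;
using System.Linq;

public static class FileBytes
{
    // Hypothetical helper: true when both files contain exactly the same bytes.
    public static bool FilesAreIdentical(string firstPath, string secondPath)
    {
        byte[] first = File.ReadAllBytes(firstPath);
        byte[] second = File.ReadAllBytes(secondPath);

        // Checking the lengths first avoids walking both arrays when the sizes differ.
        return first.Length == second.Length && first.SequenceEqual(second);
    }
}
```

Note that this only answers whether the files are identical, not which rows were added or removed. For .xlsx files the raw bytes may also differ when the cell contents are the same (for example after a re-save), so a byte comparison is probably better suited to plain CSV input.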

There are plenty of other ways to compare byte arrays; you should do some research to find the one that suits you best.
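As one example of such an alternative, here is a minimal sketch that uses the span-based `SequenceEqual` extension instead of LINQ. It assumes a runtime where `ReadOnlySpan<T>` is available (.NET Core 2.1 or later, or the System.Memory package); `ByteCompare` and `BytesEqual` are hypothetical names.

```csharp
using System;

public static class ByteCompare
{
    // byte[] arguments convert implicitly to ReadOnlySpan<byte>,
    // so callers can pass plain arrays.
    public static bool BytesEqual(ReadOnlySpan<byte> a, ReadOnlySpan<byte> b)
    {
        // Spans of different lengths are never equal.
        return a.SequenceEqual(b);
    }
}
```

A call would look like `ByteCompare.BytesEqual(FirstExcelFileBytes, SecondExcelFileBytes)`.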

Use the following link to check for things like rows added and rows removed.

Compare Excel sheets
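The linked page is not reproduced here. For the plain CSV input mentioned in the question, a rough sketch of detecting added and removed rows could treat each line as one row and use LINQ's `Except`. This assumes a row counts as unchanged only when its text matches exactly; `CsvLineDiff` and `LinesMissingFrom` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CsvLineDiff
{
    // Lines that occur in 'source' but not in 'other'.
    // Except uses set semantics, so duplicate lines are reported only once.
    public static List<string> LinesMissingFrom(
        IEnumerable<string> source, IEnumerable<string> other)
    {
        return source.Except(other).ToList();
    }
}
```

With `var first = System.IO.File.ReadLines(firstFile)` and `var second = System.IO.File.ReadLines(secondFile)`, the removed rows would be `CsvLineDiff.LinesMissingFrom(first, second)` and the added rows `CsvLineDiff.LinesMissingFrom(second, first)`. Because the column names differ between files, the header line of each file may need to be skipped first, for example with `.Skip(1)`.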