Parsing an HTML table in C#

I have an HTML page that contains a table, and I want to parse that table in a C# Windows Forms application.

http://www.mufap.com.pk/payout-report.php?tab=01

This is the web page I want to parse. I have tried:

> Foreach(Htmlnode a in document.getelementbyname("tr")) { richtextbox1.text=a.innertext; } 

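I tried something like this, but it does not give me the data in tabular form, because I am just printing all the trs. Please help me with this. Thanks, and sorry for my English.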

Use Html Agility Pack:

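    using System.Collections.Generic;
    using System.Linq;
    using System.Net;

    // Download the raw HTML of the page.
    WebClient webClient = new WebClient();
    string page = webClient.DownloadString("http://www.mufap.com.pk/payout-report.php?tab=01");

    // Parse it with Html Agility Pack.
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(page);

    // One inner list per table row, one string per cell.
    List<List<string>> table = doc.DocumentNode.SelectSingleNode("//table[@class='mydata']")
        .Descendants("tr")
        .Skip(1)                                      // skip the first row
        .Where(tr => tr.Elements("td").Count() > 1)   // keep rows with more than one cell
        .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
        .ToList();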
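Since the question asks for tabular output in a Windows Forms application, one possible next step is to copy the parsed rows into a `DataTable`, which a grid control can display. The following is only a sketch under assumptions: the class name `TableDisplay`, the method `ToDataTable`, and the generated column names (`Column1`, `Column2`, ...) are hypothetical and not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Data;

public static class TableDisplay
{
    // Hypothetical helper: copies parsed rows into a DataTable.
    // Column names are placeholders, because the parsed rows carry no headers.
    public static DataTable ToDataTable(List<List<string>> rows)
    {
        var result = new DataTable();
        foreach (List<string> row in rows)
        {
            // Add columns on demand so rows of different lengths still fit.
            while (result.Columns.Count < row.Count)
            {
                result.Columns.Add("Column" + (result.Columns.Count + 1), typeof(string));
            }

            DataRow dataRow = result.NewRow();
            for (int i = 0; i < row.Count; i++)
            {
                dataRow[i] = row[i];
            }
            result.Rows.Add(dataRow);
        }
        return result;
    }
}
```

Assuming the form has a `DataGridView` named `dataGridView1` (a hypothetical control name), the result could be shown with `dataGridView1.DataSource = TableDisplay.ToDataTable(table);`.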

Do you mean something like this?

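    // Note: in Html Agility Pack, SelectNodes returns null when nothing matches,
    // so add null checks if a table may have no rows or a row no cells.
    foreach (HtmlNode table in doc.DocumentNode.SelectNodes("//table"))
    {
        // This is the table.
        foreach (HtmlNode row in table.SelectNodes("tr"))
        {
            // This is the row.
            foreach (HtmlNode cell in row.SelectNodes("th|td"))
            {
                // This is the cell.
            }
        }
    }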

I am late to this, but one way to do what you asked using plain C# code might be the following:

    /// <summary>
    /// Parses a table and returns a list containing all the data, with columns separated by tabs.
    /// e.g.: records = getTableData(doc, 0);
    /// </summary>
    /// <param name="doc">HtmlDocument to work with</param>
    /// <param name="number">table index (base 0)</param>
    /// <returns>list containing the table data</returns>
    public List<string> getTableData(HtmlDocument doc, int number)
    {
        HtmlElementCollection tables = doc.GetElementsByTagName("table");
        int idx = 0;
        List<string> data = new List<string>();
        foreach (HtmlElement tbl in tables)
        {
            if (idx++ == number)
            {
                data = getTableData(tbl);
                break;
            }
        }
        return data;
    }

    /// <summary>
    /// Parses a table and returns a list containing all the data, with columns separated by tabs.
    /// e.g.: records = getTableData(getElement(doc, "table", "id", "table1"));
    /// (getElement is a helper that is not shown here)
    /// </summary>
    /// <param name="tbl">HtmlElement table to work with</param>
    /// <returns>list containing the table data</returns>
    public List<string> getTableData(HtmlElement tbl)
    {
        int nrec = 0;
        List<string> data = new List<string>();
        string rowBuff;
        HtmlElementCollection rows = tbl.GetElementsByTagName("tr");
        HtmlElementCollection cols;
        foreach (HtmlElement tr in rows)
        {
            cols = tr.GetElementsByTagName("td");
            nrec++;
            // Each record starts with its 1-based row number.
            rowBuff = nrec.ToString();
            foreach (HtmlElement td in cols)
            {
                rowBuff += "\t" + WebUtility.HtmlDecode(td.InnerText);
            }
            data.Add(rowBuff);
        }
        return data;
    }

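The above lets you extract the data from a table either by its index within the page (useful for unnamed tables) or by passing the table's HtmlElement to the function (faster, but only useful for named tables). Note that I chose to return a List<string> and to separate the column data with tabs; you can easily change the code to return the data in any other format you prefer.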
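As one illustration of a different output format, the tab-separated records can be split back into one string array per row. This is a minimal sketch, assuming no cell text contains a tab character itself; the class name `RecordFormat` and the method `SplitRecords` are hypothetical names introduced here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RecordFormat
{
    // Hypothetical helper: splits each tab-separated record into its fields.
    // Field 0 is the row number that getTableData puts at the start of each record.
    public static List<string[]> SplitRecords(IEnumerable<string> records)
    {
        return records.Select(record => record.Split('\t')).ToList();
    }
}
```

It might be called as `List<string[]> rows = RecordFormat.SplitRecords(getTableData(doc, 0));`, after which `rows[0][1]` would hold the first cell of the first row.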