How to parse HTML text in C#
I have an HTML expression like this:
"This is Some
Text" + Environment.NewLine + "This is some more text
I only want to extract the text, so the result should be
"This is Some Text" + Environment.NewLine + "This is some more text"
How can I do this?
Use HtmlAgilityPack:
string html = @"This is Some
Text" + Environment.NewLine + "This is some more text
";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var str = doc.DocumentNode.InnerText;
For simple cases, a regular expression is enough: Regex.Replace(source, "<.*?>", string.Empty);
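
To make the regex approach concrete, here is a minimal sketch as a complete program. The class name HtmlTextExtractor, the helper StripTags, and the sample markup are all made up for illustration; only Regex.Replace and WebUtility.HtmlDecode are real .NET calls. It assumes every tag fits on one line, because "." does not match a newline by default.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextExtractor
{
    // Hypothetical helper (not part of any library): removes tags, keeps the text.
    public static string StripTags(string source)
    {
        // Lazy match from '<' to the nearest '>' so each tag is removed separately.
        // Assumption: no tag spans a line break and no '>' appears inside an attribute value.
        string withoutTags = Regex.Replace(source, "<.*?>", string.Empty);

        // Decode entities such as &amp; after the tags are gone.
        return WebUtility.HtmlDecode(withoutTags);
    }

    public static void Main()
    {
        // Sample input invented for this sketch.
        string html = "<p>This is <b>Some</b> Text</p>"
                      + Environment.NewLine
                      + "<p>Fish &amp; chips</p>";

        Console.WriteLine(StripTags(html));
    }
}
```

Note that this only deletes the tags themselves: the contents of script or style elements stay in the output as plain text, and malformed markup can defeat the pattern. For anything beyond simple, trusted input, the HtmlAgilityPack answer is probably the safer choice.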