How to select nodes of type HtmlNodeType.Comment using HTMLAgilityPack

I want to remove comments like

    <!-- ... -->

from my HTML.

How can I do this in C# with HTMLAgilityPack?

I am using

    static void RemoveTag(HtmlNode node, string tag)
    {
        var nodeCollection = node.SelectNodes("//" + tag);
        if (nodeCollection != null)
            foreach (HtmlNode nodeTag in nodeCollection)
            {
                nodeTag.Remove();
            }
    }

for normal tags.

    public static void RemoveComments(HtmlNode node)
    {
        // Recurse over a snapshot of the children, because Remove() mutates ChildNodes.
        foreach (var n in node.ChildNodes.ToArray())
            RemoveComments(n);
        if (node.NodeType == HtmlNodeType.Comment)
            node.Remove();
    }

    static void Main(string[] args)
    {
        var doc = new HtmlDocument();
        // Put the HTML that contains the comments here.
        string html = @"       ";
        doc.LoadHtml(html);
        RemoveComments(doc.DocumentNode);
        Console.WriteLine(doc.DocumentNode.OuterHtml);
        Console.ReadLine();
    }

Or, for fun, a little LINQ-style version:

    public static IEnumerable<HtmlNode> Walk(HtmlNode node)
    {
        yield return node;
        foreach (var child in node.ChildNodes)
            foreach (var x in Walk(child))
                yield return x;
    }

    ...

    foreach (var n in Walk(doc.DocumentNode).OfType<HtmlCommentNode>().ToArray())
        n.Remove();

Even easier (I forgot that we can use XPath to find comment nodes):

    var doc = new HtmlDocument();
    string html = @"        ";
    doc.LoadHtml(html);
    foreach (var n in doc.DocumentNode.SelectNodes("//comment()")
                      ?? new HtmlNodeCollection(doc.DocumentNode))
        n.Remove();
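One caveat with removing every comment node: as far as I know, HtmlAgilityPack parses a `<!DOCTYPE ...>` declaration as a comment node too, so blanket removal may also strip the doctype. Treat that as an assumption and check it against your version of the library. Below is a minimal sketch of a variant that skips it. The class and method names (`CommentStripper`, `RemoveCommentsKeepDoctype`) are hypothetical and only for illustration.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class CommentStripper
{
    // Hypothetical helper: removes comment nodes but leaves a DOCTYPE in place.
    // Assumes HtmlAgilityPack exposes the DOCTYPE as an HtmlCommentNode.
    public static void RemoveCommentsKeepDoctype(HtmlDocument doc)
    {
        var comments = doc.DocumentNode
            .Descendants()
            .OfType<HtmlCommentNode>()
            .Where(c => !c.Comment.TrimStart()
                          .StartsWith("<!DOCTYPE", StringComparison.OrdinalIgnoreCase))
            .ToList(); // snapshot first, because Remove() mutates the tree

        foreach (var c in comments)
            c.Remove();
    }
}
```

Usage would be `CommentStripper.RemoveCommentsKeepDoctype(doc);` after `doc.LoadHtml(html);`.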

@Mark, I combined your third example to make this, for reference:

    public static string CleanUpRteOutput(this string s)
    {
        if (s != null)
        {
            HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
            doc.LoadHtml(s);
            RemoveTag(doc, "script");
            RemoveTag(doc, "link");
            RemoveTag(doc, "style");
            RemoveTag(doc, "meta");
            // "comment()" is the XPath node test for comment nodes;
            // plain "comment" would only match <comment> elements.
            RemoveTag(doc, "comment()");
            ...

And the RemoveTag function:

    static void RemoveTag(HtmlAgilityPack.HtmlDocument doc, string tag)
    {
        foreach (var n in doc.DocumentNode.SelectNodes("//" + tag)
                          ?? new HtmlAgilityPack.HtmlNodeCollection(doc.DocumentNode))
            n.Remove();
    }