How can I use HTML Agility Pack to retrieve all the images from a website?

I just downloaded HtmlAgilityPack, and the documentation doesn't include any examples.

I'm looking for a way to get all the images from a website: the address strings, not the physical image files.

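I need to pull out the source of every img tag. I just want to get a feel for the library and what it can offer. Everyone says it's the best tool for this job.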

Edit

    public void GetAllImages()
    {
        WebClient x = new WebClient();
        string source = x.DownloadString(@"http://www.google.com");

        HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
        document.Load(source);

        // I can't use the Descendants method. It doesn't appear.
        var ImageURLS = document.desc
            .Select(e => e.GetAttributeValue("src", null))
            .Where(s => !String.IsNullOrEmpty(s));
    }

You can do this with LINQ, like so:

    var document = new HtmlWeb().Load(url);
    var urls = document.DocumentNode.Descendants("img")
        .Select(e => e.GetAttributeValue("src", null))
        .Where(s => !String.IsNullOrEmpty(s));

Edit: this code now actually works; I had forgotten to write document.DocumentNode.
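Since the goal is a list of address strings, and src values are often relative (for example /images/logo.png), here is a minimal sketch that wraps the LINQ query above and resolves each value against the page URL. The class name ImageUrlLister and the method GetImageUrls are hypothetical names chosen for illustration. It assumes the page URL is the right base for relative paths, so it ignores redirects and any <base> tag in the page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; the name is not part of HtmlAgilityPack.
public static class ImageUrlLister
{
    // Returns absolute URLs for every <img> with a non-empty src on the page.
    public static List<Uri> GetImageUrls(string pageUrl)
    {
        // Assumption: the page URL itself is the base for relative src values.
        var baseUri = new Uri(pageUrl);
        HtmlDocument document = new HtmlWeb().Load(pageUrl);

        return document.DocumentNode
            .Descendants("img")
            .Select(img => img.GetAttributeValue("src", ""))
            .Where(src => !string.IsNullOrEmpty(src))
            // Resolve relative paths; drop values that cannot be parsed as a URI.
            .Select(src => Uri.TryCreate(baseUri, src, out Uri absolute) ? absolute : null)
            .Where(uri => uri != null)
            .Distinct()
            .ToList();
    }

    public static void Main()
    {
        foreach (Uri url in GetImageUrls("http://www.google.com"))
        {
            Console.WriteLine(url);
        }
    }
}
```

Descendants("img") returns an empty sequence when the page has no images, so this version needs no null check before enumerating.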

Based on one of their examples, but with the XPath modified:

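    HtmlDocument doc = new HtmlDocument();
    List<string> image_links = new List<string>();
    doc.Load("file.htm");

    // SelectNodes returns null when nothing matches the XPath.
    var nodes = doc.DocumentNode.SelectNodes("//img");
    if (nodes != null)
    {
        foreach (HtmlNode link in nodes)
        {
            image_links.Add(link.GetAttributeValue("src", ""));
        }
    }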

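I don't know this extension, so I'm not sure how to write the array out somewhere else, but this will at least get you your data. (Also, I'm fairly sure I haven't declared the array quite right. Sorry.)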

Edit

Using your example:

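    public void GetAllImages()
    {
        WebClient x = new WebClient();
        string source = x.DownloadString(@"http://www.google.com");

        HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
        List<string> image_links = new List<string>();

        // source holds markup, not a file path, so use LoadHtml rather than Load.
        document.LoadHtml(source);

        // SelectNodes returns null when nothing matches the XPath.
        var nodes = document.DocumentNode.SelectNodes("//img");
        if (nodes != null)
        {
            foreach (HtmlNode link in nodes)
            {
                image_links.Add(link.GetAttributeValue("src", ""));
            }
        }
    }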
