Getting ReadyState from the WebBrowser control without DoEvents

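This has been answered many times here and on other sites, and those answers work, but I would like ideas for other ways to:

get ReadyState = Complete after a navigation or a post, without using DoEvents, because of all its drawbacks.

I should also note that the DocumentComplete event does not help here, because I am not navigating to just one page but to several, one after another, like this: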

    wb.Navigate("www.microsoft.com");
    // don't use a DoEvents loop here
    wb.Document.Body.SetAttribute(textbox1, "login");
    // don't use a DoEvents loop here
    if (wb.DocumentText.Contains("text"))
    {
        // do something
    }

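As it stands today, it works by using DoEvents. I would like to know whether anyone has a proper way to wait for the browser's asynchronous calls to finish and only then proceed with the rest of the logic. Just for the sake of it.

Thanks in advance.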

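Below is the code of a basic WinForms application that shows how to await the DocumentCompleted event with async/await. It navigates to multiple pages, one after another. Everything happens on the main UI thread.

Instead of calling this.webBrowser.Navigate(url), the start action could simulate a form button click to trigger a POST-style navigation.

The webBrowser.IsBusy async loop is optional. Its purpose is to account (non-deterministically) for the page's dynamic AJAX code, which may run after the window.onload event.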

    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Windows.Forms;

    namespace WebBrowserApp
    {
        public partial class MainForm : Form
        {
            WebBrowser webBrowser;

            public MainForm()
            {
                InitializeComponent();

                // create a WebBrowser
                this.webBrowser = new WebBrowser();
                this.webBrowser.Dock = DockStyle.Fill;
                this.Controls.Add(this.webBrowser);

                this.Load += MainForm_Load;
            }

            // Form Load event handler
            async void MainForm_Load(object sender, EventArgs e)
            {
                // cancel the whole operation in 30 sec
                var cts = new CancellationTokenSource(30000);

                var urls = new String[] {
                    "http://www.example.com",
                    "http://www.gnu.org",
                    "http://www.debian.org" };

                await NavigateInLoopAsync(urls, cts.Token);
            }

            // navigate to each URL in a loop
            async Task NavigateInLoopAsync(string[] urls, CancellationToken ct)
            {
                foreach (var url in urls)
                {
                    ct.ThrowIfCancellationRequested();
                    var html = await NavigateAsync(ct, () => this.webBrowser.Navigate(url));
                    Debug.Print("url: {0}, html: \n{1}", url, html);
                }
            }

            // asynchronous navigation
            async Task<string> NavigateAsync(CancellationToken ct, Action startNavigation)
            {
                var onloadTcs = new TaskCompletionSource<bool>();
                EventHandler onloadEventHandler = null;

                WebBrowserDocumentCompletedEventHandler documentCompletedHandler = delegate
                {
                    // DocumentCompleted may be called several times for the same page,
                    // if the page has frames
                    if (onloadEventHandler != null)
                        return;

                    // so, observe DOM onload event to make sure the document is fully loaded
                    onloadEventHandler = (s, e) => onloadTcs.TrySetResult(true);
                    this.webBrowser.Document.Window.AttachEventHandler("onload", onloadEventHandler);
                };

                this.webBrowser.DocumentCompleted += documentCompletedHandler;
                try
                {
                    using (ct.Register(() => onloadTcs.TrySetCanceled(), useSynchronizationContext: true))
                    {
                        startNavigation();
                        // wait for DOM onload event, throw if cancelled
                        await onloadTcs.Task;
                    }
                }
                finally
                {
                    this.webBrowser.DocumentCompleted -= documentCompletedHandler;
                    if (onloadEventHandler != null)
                        this.webBrowser.Document.Window.DetachEventHandler("onload", onloadEventHandler);
                }

                // the page has fully loaded by now

                // optional: let the page run its dynamic AJAX code,
                // we might add another timeout for this loop
                do { await Task.Delay(500, ct); }
                while (this.webBrowser.IsBusy);

                // return the page's HTML content
                return this.webBrowser.Document.GetElementsByTagName("html")[0].OuterHtml;
            }
        }
    }
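The core of NavigateAsync above is turning an event into something that can be awaited, using TaskCompletionSource plus a cancellation registration. Here is a minimal console sketch of just that pattern, with no WinForms involved. FakeLoader, EventAwaiter and WaitForCompletedAsync are hypothetical names made up for illustration; they stand in for the WebBrowser control and its events and are not part of the answer's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a control that raises a completion event.
public class FakeLoader
{
    public event EventHandler Completed;

    // Raises Completed after the given delay, from a thread-pool thread.
    public void Start(int delayMs)
    {
        Task.Delay(delayMs).ContinueWith(_ => Completed?.Invoke(this, EventArgs.Empty));
    }
}

public static class EventAwaiter
{
    // Runs 'start', then completes when Completed fires or the token is cancelled.
    public static async Task WaitForCompletedAsync(
        FakeLoader loader, Action start, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler handler = (s, e) => tcs.TrySetResult(true);

        // Subscribe before starting, so the event cannot be missed.
        loader.Completed += handler;
        try
        {
            using (ct.Register(() => tcs.TrySetCanceled()))
            {
                start();
                await tcs.Task; // throws TaskCanceledException if cancelled
            }
        }
        finally
        {
            // Always unsubscribe, whether we completed, failed or were cancelled.
            loader.Completed -= handler;
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        var loader = new FakeLoader();
        var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

        await EventAwaiter.WaitForCompletedAsync(loader, () => loader.Start(100), cts.Token);
        Console.WriteLine("Completed event observed");
    }
}
```

The same three steps appear in NavigateAsync: subscribe, start the operation, await the task, with the unsubscribe in a finally block so that a cancelled wait does not leave a handler attached.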

If you want to do something similar from a console application, here is an example.

The solution is simple:

    // MAKE SURE ReadyState = Complete
    while (WebBrowser1.ReadyState != WebBrowserReadyState.Complete)
    {
        Application.DoEvents();
    }

    // Move on to your subsequent code...


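Quick and dirty. I am a VBA guy, and this logic has always worked there. I spent days looking for a C# equivalent and found none, but I just figured it out myself.

Below is my complete function. The goal is to extract a piece of information from a web page: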

    private int maxReloadAttempt = 3;

    private string GetCarrier(string webAddress)
    {
        string strStartSearchFor = "subtitle block pull-left\">";
        string strEndSearchFor = "<";

        try
        {
            // ATTEMPT (x3) TO EXTRACT CARRIER STRING, reloading the page on each attempt
            for (int currentAttempt = 1; currentAttempt <= maxReloadAttempt; currentAttempt++)
            {
                using (WebBrowser WebBrowser_4MobileCarrier = new WebBrowser())
                {
                    WebBrowser_4MobileCarrier.ScriptErrorsSuppressed = true;
                    WebBrowser_4MobileCarrier.Navigate(webAddress);

                    // MAKE SURE ReadyState = Complete
                    while (WebBrowser_4MobileCarrier.ReadyState != WebBrowserReadyState.Complete)
                    {
                        Application.DoEvents();
                    }

                    // LOAD HTML
                    string innerHtml = WebBrowser_4MobileCarrier.Document.Body.InnerHtml;

                    if (innerHtml.IndexOf(strStartSearchFor) >= 0)
                    {
                        // Method: "Sub_String" is my custom function
                        return Sub_String(innerHtml, strStartSearchFor, strEndSearchFor, "0");
                    }
                }
            }
        }
        catch //(Exception ex)
        {
            // Any failure falls through to "Unavailable"
        }

        return "Unavailable";
    }

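Here is a "quick and dirty" solution. It is not 100% foolproof, but it does not block the UI thread, and it should be good enough for prototyping WebBrowser control automation procedures: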

    private async void testButton_Click(object sender, EventArgs e)
    {
        await Task.Factory.StartNew(() =>
        {
            stepTheWeb(() => wb.Navigate("www.yahoo.com"));
            stepTheWeb(() => wb.Navigate("www.microsoft.com"));
            stepTheWeb(() => wb.Navigate("asp.net"));
            stepTheWeb(() => wb.Document.InvokeScript("eval",
                new[] { "$('p').css('background-color','yellow')" }));

            bool testFlag = false;
            stepTheWeb(() => testFlag = wb.DocumentText.Contains("Get Started"));
            if (testFlag) { /* TODO */ }
            // ...
        });
    }

    private void stepTheWeb(Action task)
    {
        // run the step on the UI thread
        this.Invoke(new Action(task));

        // poll ReadyState from the worker thread until the page is complete
        WebBrowserReadyState rs = WebBrowserReadyState.Interactive;
        while (rs != WebBrowserReadyState.Complete)
        {
            this.Invoke(new Action(() => rs = wb.ReadyState));
            System.Threading.Thread.Sleep(300);
        }
    }

Here is a slightly more generic version of the testButton_Click method:

    private async void testButton_Click(object sender, EventArgs e)
    {
        var actions = new List<Action>()
        {
            () => wb.Navigate("www.yahoo.com"),
            () => wb.Navigate("www.microsoft.com"),
            () => wb.Navigate("asp.net"),
            () => wb.Document.InvokeScript("eval",
                new[] { "$('p').css('background-color','yellow')" }),
            () =>
            {
                bool testFlag = false;
                testFlag = wb.DocumentText.Contains("Get Started");
                if (testFlag) { /* TODO */ }
            }
            //...
        };

        await Task.Factory.StartNew(() => actions.ForEach((x) => stepTheWeb(x)));
    }

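[UPDATE]

I have adapted my "quick and dirty" sample by borrowing and slightly refactoring @Noseratio's NavigateAsync method from this thread. The new version runs asynchronously in the UI thread context and automates not only navigation but also JavaScript/AJAX calls: any lambda or method that implements a single automation step.

All code reviews and comments are very welcome, especially from @Noseratio. Together, we will make this world better ;)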

    public enum ActionTypeEnumeration
    {
        Navigation = 1,
        Javascript = 2,
        UIThreadDependent = 3,
        UNDEFINED = 99
    }

    public class ActionDescriptor
    {
        public Action Action { get; set; }
        public ActionTypeEnumeration ActionType { get; set; }
    }

    /// <summary>
    /// Executes a set of WebBrowser control's Automation actions
    /// </summary>
    /// <remarks>
    /// Test form should have the following controls:
    /// wb - WebBrowser,
    /// testbutton - Button,
    /// testCheckBox - CheckBox,
    /// totalHtmlLengthTextBox - TextBox
    /// </remarks>
    private async void testButton_Click(object sender, EventArgs e)
    {
        try
        {
            var cts = new CancellationTokenSource(60000);

            var actions = new List<ActionDescriptor>()
            {
                new ActionDescriptor() { Action = () => wb.Navigate("www.yahoo.com"),
                    ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Navigate("www.microsoft.com"),
                    ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Navigate("asp.net"),
                    ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Document.InvokeScript("eval",
                        new[] { "$('p').css('background-color','yellow')" }),
                    ActionType = ActionTypeEnumeration.Javascript },
                new ActionDescriptor() { Action = () =>
                        { testCheckBox.Checked = wb.DocumentText.Contains("Get Started"); },
                    ActionType = ActionTypeEnumeration.UIThreadDependent }
                //...
            };

            foreach (var action in actions)
            {
                string html = await ExecuteWebBrowserAutomationAction(
                    cts.Token, action.Action, action.ActionType);

                // count HTML web page stats - just for fun
                int totalLength = 0;
                Int32.TryParse(totalHtmlLengthTextBox.Text, out totalLength);
                totalLength += !string.IsNullOrWhiteSpace(html) ? html.Length : 0;
                totalHtmlLengthTextBox.Text = totalLength.ToString();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, "Error");
        }
    }

    // asynchronous WebBrowser control Automation
    async Task<string> ExecuteWebBrowserAutomationAction(
        CancellationToken ct,
        Action runWebBrowserAutomationAction,
        ActionTypeEnumeration actionType = ActionTypeEnumeration.UNDEFINED)
    {
        var onloadTcs = new TaskCompletionSource<bool>();
        EventHandler onloadEventHandler = null;

        WebBrowserDocumentCompletedEventHandler documentCompletedHandler = delegate
        {
            // DocumentCompleted may be called several times for the same page,
            // if the page has frames
            if (onloadEventHandler != null)
                return;

            // so, observe DOM onload event to make sure the document is fully loaded
            onloadEventHandler = (s, e) => onloadTcs.TrySetResult(true);
            this.wb.Document.Window.AttachEventHandler("onload", onloadEventHandler);
        };

        this.wb.DocumentCompleted += documentCompletedHandler;
        try
        {
            using (ct.Register(() => onloadTcs.TrySetCanceled(), useSynchronizationContext: true))
            {
                runWebBrowserAutomationAction();

                if (actionType == ActionTypeEnumeration.Navigation)
                {
                    // wait for DOM onload event, throw if cancelled
                    await onloadTcs.Task;
                }
            }
        }
        finally
        {
            this.wb.DocumentCompleted -= documentCompletedHandler;
            if (onloadEventHandler != null)
                this.wb.Document.Window.DetachEventHandler("onload", onloadEventHandler);
        }

        // the page has fully loaded by now

        // optional: let the page run its dynamic AJAX code,
        // we might add another timeout for this loop
        do { await Task.Delay(500, ct); }
        while (this.wb.IsBusy);

        // return the page's HTML content
        return this.wb.Document.GetElementsByTagName("html")[0].OuterHtml;
    }
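The code comments on the IsBusy loop say it might get a timeout of its own. Below is a rough sketch of one way to do that. BusyWaiter and WaitWhileBusyAsync are hypothetical names, not part of the WebBrowser API or of the code above, and the busy check is passed in as a delegate so the helper can be tried without a form.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class BusyWaiter
{
    // Polls isBusy every pollInterval until it returns false or maxWait elapses.
    // Returns true if the target went idle, false if the wait timed out.
    public static async Task<bool> WaitWhileBusyAsync(
        Func<bool> isBusy, TimeSpan pollInterval, TimeSpan maxWait, CancellationToken ct)
    {
        var stopwatch = Stopwatch.StartNew();
        do
        {
            await Task.Delay(pollInterval, ct); // throws if ct is cancelled
            if (!isBusy())
                return true;
        }
        while (stopwatch.Elapsed < maxWait);

        return false;
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Simulated target: reports busy on the first two polls, then idle.
        int remaining = 3;

        bool idle = await BusyWaiter.WaitWhileBusyAsync(
            () => --remaining > 0,
            TimeSpan.FromMilliseconds(50),
            TimeSpan.FromSeconds(2),
            CancellationToken.None);

        Console.WriteLine(idle ? "idle" : "timed out");
    }
}
```

In the form code, the do/while loop could then become something like await BusyWaiter.WaitWhileBusyAsync(() => this.wb.IsBusy, TimeSpan.FromMilliseconds(500), TimeSpan.FromSeconds(10), ct), where the 10-second cap is an arbitrary value chosen for illustration. When awaited from the UI thread without ConfigureAwait(false), the delegate runs on the UI thread after each delay, which is where IsBusy has to be read.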