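Making an HttpClient request look like a browser
When I call the site www.livescore.com through the HttpClient class, I always get a "500" error. The server is probably blocking requests that come from HttpClient.
1) Is there any other way to get the HTML from a web page?
2) How can I set the headers to get the HTML content?
When I set the headers the way a browser does, I always get strangely encoded content.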
    http_client.DefaultRequestHeaders.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml");
    http_client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
    http_client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
    http_client.DefaultRequestHeaders.TryAddWithoutValidation("Accept-Charset", "ISO-8859-1");
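3) How can I solve this problem? Any suggestions?
I am using C# and the HttpClient class in a Windows 8 Metro Style App.
Here you go. Note that you have to decompress the gzip-encoded result you get back, as mleroy points out: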
    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Net.Http;
    using System.Threading.Tasks;

    // Reuse a single HttpClient instance for all requests.
    private static readonly HttpClient _HttpClient = new HttpClient();

    private static async Task<string> GetResponse(string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, new Uri(url)))
        {
            // Send the same headers a browser would.
            request.Headers.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml");
            request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
            request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
            request.Headers.TryAddWithoutValidation("Accept-Charset", "ISO-8859-1");

            using (var response = await _HttpClient.SendAsync(request).ConfigureAwait(false))
            {
                response.EnsureSuccessStatusCode();

                // The body comes back gzip-compressed, so decompress it before reading.
                using (var responseStream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
                using (var decompressedStream = new GZipStream(responseStream, CompressionMode.Decompress))
                using (var streamReader = new StreamReader(decompressedStream))
                {
                    return await streamReader.ReadToEndAsync().ConfigureAwait(false);
                }
            }
        }
    }
Call it like this:
    var response = await GetResponse("http://www.livescore.com/").ConfigureAwait(false);
    // or
    var response = GetResponse("http://www.livescore.com/").Result;
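The code above always wraps the body in a GZipStream, which would fail if the server ever answered without compression. A minimal sketch of one way to guard against that is to check the Content-Encoding response header first. The names `ResponseReader` and `ReadBodyAsync` are made up for this illustration, and the sketch only handles gzip:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class ResponseReader
{
    // Hypothetical helper: reads the body as text and only decompresses
    // when the Content-Encoding header says the body is gzip-compressed.
    public static async Task<string> ReadBodyAsync(HttpResponseMessage response)
    {
        // Content-Encoding can list several codings; this sketch looks at the last one only.
        string encoding = response.Content.Headers.ContentEncoding.LastOrDefault() ?? "";

        Stream body = await response.Content.ReadAsStreamAsync().ConfigureAwait(false);

        if (encoding.Equals("gzip", StringComparison.OrdinalIgnoreCase))
        {
            body = new GZipStream(body, CompressionMode.Decompress);
        }
        else if (encoding.Length > 0 && !encoding.Equals("identity", StringComparison.OrdinalIgnoreCase))
        {
            // Assumption: anything other than gzip is out of scope for this sketch.
            throw new NotSupportedException("Unhandled Content-Encoding: " + encoding);
        }

        using (body)
        using (var reader = new StreamReader(body))
        {
            return await reader.ReadToEndAsync().ConfigureAwait(false);
        }
    }
}
```

Inside GetResponse you could then return `await ResponseReader.ReadBodyAsync(response).ConfigureAwait(false)` instead of building the GZipStream by hand.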
You could also try adding compression support like this:
    var compressclient = new HttpClient(new HttpClientHandler()
    {
        AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip
    });
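This also adds the Accept-Encoding header for you.
According to the same thread, this is now supported in the Windows Store framework: http://social.msdn.microsoft.com/Forums/windowsapps/en-US/429bb65c-5f6b-42e0-840b-1f1ea3626a42/httpclient-data-compression-and-caching?prof=required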
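As a rough sketch, the handler can be combined with the browser-like headers from the question so that every request made through the client carries them. The class and method names here (`BrowserLikeClient`, `CreateHandler`, `Create`) are invented for the example, and I have not run this against the site:

```csharp
using System.Net;
using System.Net.Http;

public static class BrowserLikeClient
{
    // Hypothetical factory: a handler that decompresses gzip/deflate bodies itself.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Hypothetical factory: a client that sends the question's User-Agent and Accept headers.
    // Accept-Encoding is not set here because the handler takes care of it.
    public static HttpClient Create(HttpClientHandler handler)
    {
        var client = new HttpClient(handler);
        client.DefaultRequestHeaders.TryAddWithoutValidation(
            "User-Agent",
            "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
        client.DefaultRequestHeaders.TryAddWithoutValidation(
            "Accept",
            "text/html,application/xhtml+xml,application/xml");
        return client;
    }
}
```

With a client built this way, `await client.GetStringAsync("http://www.livescore.com/")` should hand back text that is already decompressed, so no GZipStream is needed on top of it.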
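A few things to note:
- The site requires you to send a user agent; otherwise it returns an HTTP 500 error.
- A GET request to livescore.com gets a 302 response pointing to livescore.us. You need to either handle the redirect or request livescore.us directly.
- You need to decompress the gzip-compressed response.
This code works with the .NET 4 Client Profile; I'll let you figure out whether it works in a Windows Store app.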
    var request = (HttpWebRequest)HttpWebRequest.Create("http://www.livescore.com");
    request.AllowAutoRedirect = true;
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17";
    string content;
    using (var response = (HttpWebResponse)request.GetResponse())
    using (var decompressedStream = new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))
    using (var streamReader = new StreamReader(decompressedStream))
    {
        content = streamReader.ReadToEnd();
    }
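HttpWebRequest also has its own AutomaticDecompression property, so a variant of the code above could let the framework do the unzipping instead of a hand-built GZipStream. This is only a sketch: `LegacyFetcher`, `BuildRequest` and `Fetch` are names I made up, and I have not checked it against the site:

```csharp
using System.IO;
using System.Net;

public static class LegacyFetcher
{
    // Hypothetical helper: builds the request without sending it.
    public static HttpWebRequest BuildRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true; // follow the 302 automatically
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17";
        // Let the framework decompress gzip or deflate responses.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Hypothetical helper: sends the request and returns the body as text.
    public static string Fetch(string url)
    {
        var request = BuildRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would be `string content = LegacyFetcher.Fetch("http://www.livescore.com");`.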
I think you can be pretty sure they have done everything they can to stop developers from screen scraping.
If I try this code in a standard C# project:
    var request = WebRequest.Create("http://www.livescore.com");
    var response = request.GetResponse();
I get this response:
The remote server returned an error: (403) Forbidden.