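This post was last edited by tdkgn on 2014-10-19 20:39.
I'm a student trying to write a program that crawls the basic personal information of all students at my school from the school website, but I've run into a problem handling a POST request. Any pointers would be much appreciated.
On a partial page like the one pictured, when I enter a student number and click the query button (查询), I can capture an HTTP request in IE.
Testing with student number 20131002, the captured request body is as follows: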
__VIEWSTATE=LH%2F9Re0QExCNH36tUrgN6ObuG94WlYrkvlLkLfPKItV0RToAs5sf9IBRYjNgtwhJJrxvnEAWBFMbHpkaK%2BYN83nB%2Fd7G%2BuQBp%2FUAce76T3lMVFHHZZUfrrw9Q1d%2BJVVQ4s%2F51og28YNQhLfC8shJw9SQSkTzRu%2F3mmzz%2FyENhRdaYl6Ioackw2K4712aFeSsKUQ0D3uo%2BNqqc48y1xkDiiAeggyRqDPbXG5v9I8ITil01PQcWgxDnVGYQdwlO1NFVgMuNZOlXN66O%2BbSTardQGzisKk2gwBqxcVToruXJZn3UkD7fdOm8ShiUetR7NebXKGK6A%3D%3D&__VIEWSTATEGENERATOR=E08963AD&__VIEWSTATEENCRYPTED=&__EVENTVALIDATION=Nufi4NPV%2BFKCbBaC8A1m0WKIp6cyHZ6leygheWI%2Bw%2FuHQn3n6Fm1eWPXwwDsXgq16fFy%2BjavUsKpvVrLnO8elTThAA%2B7xQkeSAqloNIX9TMq2oB1&StudentNo=20131004&StudentName=&Button1=%B2%E9%D1%AF&SearchStyleControl_currentState=False
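To make the captured body easier to inspect, it can be split into its individual form fields. The following is only a minimal sketch and not part of the original program: `FormBody`, `Decode`, and `Parse` are hypothetical helper names, and the sketch assumes the body is ordinary `application/x-www-form-urlencoded` text whose percent-escapes are bytes in whatever encoding is passed in.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper (not from the original post): splits a
// form-urlencoded body into decoded name/value pairs.
public static class FormBody
{
    // Percent-decodes one component. In form encoding '+' stands for a space.
    // Assumes the input itself is ASCII, as an encoded body should be.
    public static string Decode(string s, Encoding enc)
    {
        using (var bytes = new MemoryStream())
        {
            for (int i = 0; i < s.Length; i++)
            {
                char c = s[i];
                if (c == '%' && i + 2 < s.Length
                    && Uri.IsHexDigit(s[i + 1]) && Uri.IsHexDigit(s[i + 2]))
                {
                    // "%XX" is one raw byte in the target encoding
                    bytes.WriteByte(Convert.ToByte(s.Substring(i + 1, 2), 16));
                    i += 2;
                }
                else if (c == '+')
                {
                    bytes.WriteByte(0x20);
                }
                else
                {
                    bytes.WriteByte((byte)c);
                }
            }
            return enc.GetString(bytes.ToArray());
        }
    }

    // Splits on '&' and on the first '=' of each pair, keeping field order.
    public static List<KeyValuePair<string, string>> Parse(string body, Encoding enc)
    {
        var fields = new List<KeyValuePair<string, string>>();
        foreach (string pair in body.Split('&'))
        {
            if (pair.Length == 0) continue;
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? "" : pair.Substring(eq + 1);
            fields.Add(new KeyValuePair<string, string>(Decode(name, enc), Decode(value, enc)));
        }
        return fields;
    }
}
```

Assuming the site is GBK-encoded, calling `FormBody.Parse(body, Encoding.GetEncoding("gb2312"))` on the body above should list eight fields, with `Button1` decoding to 查询. On .NET Core and later, gb2312 is only available after `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)`; on .NET Framework it is built in. The parsed output also makes it easy to spot that the captured body carries `StudentNo=20131004`, while the test number mentioned above is 20131002.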
The main body of my POST implementation is below:
[C#]
using System;
using System.IO;
using System.Net;
using System.Text;

public string HttpPost(string Url, string strPostdata)
{
    StringBuilder content = new StringBuilder();
    try
    {
        // Create the HTTP request for the given URL
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko";
        request.Method = "POST";
        request.Accept = "*/*";
        // For privacy, the real cookie names and values are replaced with "key" and "value"
        CookieContainer objcok = new CookieContainer();
        objcok.Add(new Uri("http://www.yanghua.net/SystemForm/Student/Search/StuInfoSearch.aspx"), new Cookie("key", "value"));
        objcok.Add(new Uri("http://www.yanghua.net/SystemForm/Student/Search/StuInfoSearch.aspx"), new Cookie("key", "value"));
        objcok.Add(new Uri("http://www.yanghua.net/SystemForm/Student/Search/StuInfoSearch.aspx"), new Cookie("key", "value"));
        request.CookieContainer = objcok;
        // Keep the connection alive
        request.KeepAlive = true;
        // Write the POST body, encoded with the system default code page
        byte[] buffer = Encoding.Default.GetBytes(strPostdata);
        request.ContentLength = buffer.Length;
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(buffer, 0, buffer.Length);
        }
        // Get the response to the HTTP request and read it as gb2312 text
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream responseStream = response.GetResponseStream())
        using (StreamReader sReader = new StreamReader(responseStream, Encoding.GetEncoding("gb2312")))
        {
            // Read the whole response body
            content.Append(sReader.ReadToEnd());
        }
    }
    catch (Exception)
    {
        // Any failure is reported to the caller as a fixed marker string
        content = new StringBuilder("Runtime Error");
    }
    return content.ToString();
}
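Since the goal is to query many student numbers, the body presumably has to be rebuilt for each one rather than pasted in verbatim. Below is a rough sketch of how that could look. `FormBodyBuilder`, `Encode`, and `Build` are names made up for illustration, and the escaping rules (letters, digits, and `-_.*` kept literal, space written as `+`, everything else as uppercase `%XX`) are an assumption modeled on what the captured body looks like, not a verified description of what IE sends.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the original post): builds a
// form-urlencoded body from name/value pairs.
public static class FormBodyBuilder
{
    // Converts the text to bytes in the given encoding, then escapes
    // every byte that is not in the assumed "safe" set.
    public static string Encode(string value, Encoding enc)
    {
        var sb = new StringBuilder();
        foreach (byte b in enc.GetBytes(value))
        {
            char c = (char)b;
            bool literal = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '*';
            if (literal)
                sb.Append(c);
            else if (c == ' ')
                sb.Append('+');
            else
                sb.Append('%').Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    // Joins the encoded pairs as name=value&name=value, preserving order.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields, Encoding enc)
    {
        var parts = new List<string>();
        foreach (var field in fields)
            parts.Add(Encode(field.Key, enc) + "=" + Encode(field.Value, enc));
        return string.Join("&", parts);
    }
}
```

For the real form, the fields would be listed in the same order as in the captured body, with `StudentNo` swapped per query and gb2312 passed as the encoding, so that a `Button1` value of 查询 should come out as `%B2%E9%D1%AF` again.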
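However, when I pass the captured request body shown earlier as postdata, the response I get is not the one I expect.
Expected response fragment:
Actual response fragment:
I don't understand why the response says that no matching data was found (未找到相关数据).
I'm not sure whether I've described the problem clearly. If more information is needed, I'll provide it. I'm hoping someone can suggest a solution, as I've been stuck on this for two days. Many thanks!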
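Two things may be worth ruling out here. Both are guesses, since the server-side code is not visible. First, the `HttpPost` method never sets `request.ContentType`; a browser normally sends form posts as `application/x-www-form-urlencoded`, and without that header the server may not parse the body into form fields at all. Second, `__VIEWSTATE` and `__EVENTVALIDATION` are generated by ASP.NET WebForms on each page load (the `.aspx` URL and the field names suggest WebForms), so values copied from an earlier capture may not be accepted later. The sketch below covers only the second point: `HiddenFields.Extract` is a hypothetical helper that pulls the current hidden-field values out of the HTML of a fresh GET, assuming the usual WebForms markup in which `name` appears before `value` inside each `<input>` tag.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): collects hidden
// fields whose names look like __VIEWSTATE, __EVENTVALIDATION, etc.
public static class HiddenFields
{
    // Assumes name="..." precedes value="..." within the same <input> tag.
    private static readonly Regex Pattern = new Regex(
        @"<input[^>]*\bname=""(?<name>__[A-Z]+)""[^>]*\bvalue=""(?<value>[^""]*)""");

    public static Dictionary<string, string> Extract(string html)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pattern.Matches(html))
        {
            // Attribute values are HTML-escaped in the page source
            result[m.Groups["name"].Value] = WebUtility.HtmlDecode(m.Groups["value"].Value);
        }
        return result;
    }
}

// Possible use, under the assumptions above:
//   1. GET the search page with the same CookieContainer.
//   2. var hidden = HiddenFields.Extract(pageHtml);
//   3. Rebuild the POST body from hidden["__VIEWSTATE"] and the other
//      extracted values plus StudentNo, and set
//      request.ContentType = "application/x-www-form-urlencoded";
//      before writing the body.
```

Hidden fields rendered with an empty value or in a different attribute order would need a more tolerant pattern or a real HTML parser; the regular expression here is only meant to show the idea.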