Scraping a website table using a POST request
My goal is to get the PQRI table (second table of the two listed) from this Webpage using Python.
As it is an ajax table, I tried the following:
- Open the webpage in Chrome
- Open developer tools -> Network -> Fetch/XHR to get the request URL, request headers, and payload.
- Use the requests library to make a POST request:
import requests

url = "https://apps.usp.org/ajax/USPNF/columnsDB.php"
headers = {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9",
"Connection": "keep-alive",
"Content-Length": "201",
"Content-Type": "application/x-www-form-urlencoded",
"Cookie": "_fbp=fb.1.1646747716384.2068133566; tc_ptid=3U21FqQ3bklFEULP2jijnQ; tc_ptidexpiry=1709819716801; BE_CLA3=p_id%3D8A64RLL6L464RLNNA48664N2RAAAAAAAAH%26bf%3D8d70551f1d08356108a60fc4a2db91d0%26bn%3D1%26bv%3D3.44%26s_expire%3D1648554934915%26s_id%3D8A64RLL6L464RJ2L8J6664N2RAAAAAAAAH; _gid=GA1.2.1041569168.1648468535; _ga_DTGQ04CR27=GS1.1.1648468535.10.0.1648468535.0; USPSESSID=u6i1i80ot1uk49mnauim3o7l37; _ga=GA1.2.1946138806.1646747717; BIGipServerprod_apps.usp.org_http_pool=1271466250.20480.0000",
"Host": "apps.usp.org",
"Origin": "https://apps.usp.org",
"Referer": "https://apps.usp.org/app/USPNF/columnsDB.html",
"sec-ch-ua": "Not A;Brand ;v=99, Chromium;v=99, Google Chrome;v=99",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "Windows",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36",
"X-Powered-By": "CPAINT v2.1.0 :: http://sf.net/projects/cpaint",
}
payload = {
"cpaint_function": "updatePQRIResults",
"cpaint_argument[]": "Acclaim%20120%20C18",
"cpaint_argument[]": 0,
"cpaint_argument[]": 0,
"cpaint_argument[]": 0,
"cpaint_argument[]": 2.8,
"cpaint_argument[]": 0,
"cpaint_response_type": "OBJECT",
}
response = requests.post(url, data=payload, headers=headers)
I can see the desired output in the developer tools, but when I make the request I only get the following response:
"<c_start></c_start><c_total></c_total>getPQRIData: No base column '0'\u003cbr\u003e\u000a"
Any idea what I need to change to get the desired output?
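You can't send that form data as a dictionary/JSON. The `cpaint_argument[]` key repeats six times, and a Python dict keeps only the last value for a duplicated key, so the request carries a single argument (0), which would explain the "No base column '0'" message. Send the body as an already URL-encoded string instead and it should work.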
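A minimal sketch of that idea, in case it helps: it builds the form body from a list of (key, value) pairs so every repeated `cpaint_argument[]` survives, then posts the resulting string. The names `FIELDS`, `build_body`, and `fetch_pqri` are made up for illustration, and it assumes the endpoint accepts the request with only a Content-Type header; if it does not, pass the cookies and other headers from the question as well. The column name is written unencoded here because `urlencode` escapes it (putting `%20` in the value and letting it be encoded again would double-encode it).

```python
from urllib.parse import quote, urlencode

URL = "https://apps.usp.org/ajax/USPNF/columnsDB.php"

# Ordered (key, value) pairs. Unlike a dict, a list keeps every repeated key.
FIELDS = [
    ("cpaint_function", "updatePQRIResults"),
    ("cpaint_argument[]", "Acclaim 120 C18"),  # raw value; urlencode escapes it
    ("cpaint_argument[]", 0),
    ("cpaint_argument[]", 0),
    ("cpaint_argument[]", 0),
    ("cpaint_argument[]", 2.8),
    ("cpaint_argument[]", 0),
    ("cpaint_response_type", "OBJECT"),
]


def build_body(fields):
    """Return the form body as one URL-encoded string (spaces become %20)."""
    return urlencode(fields, quote_via=quote)


def fetch_pqri():
    """POST the pre-encoded body and return the raw response text."""
    import requests  # imported here so build_body works without requests

    response = requests.post(
        URL,
        data=build_body(FIELDS),  # a str is sent as-is, not re-encoded
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        timeout=30,
    )
    response.raise_for_status()
    return response.text


if __name__ == "__main__":
    print(fetch_pqri())
```

requests also accepts the list of tuples directly (`data=FIELDS`), which avoids building the string by hand; the explicit string is shown only because it mirrors what the browser sends.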