I am trying to extract the graph from this link. I need to write a loop that extracts the data behind graphs like this one for a set of specific criteria. Using Developer Tools >> Network, I found the URL for the data underlying the graph. The data seems to be stored in XML format.
I have tried different approaches, but I keep getting a 403 error. It doesn't matter whether I try to extract just the plot data or make a GET request for the whole web page. I think the problem is that Cloudflare kicks in. Any idea how I might get around this? Any help is very much appreciated.
import urllib.request
url = 'https://hansard.parliament.uk/timeline/query?searchTerm=immigration&startDate=27%2F04%2F2017&endDate=27%2F04%2F2022&house=0&contributionType=&isDebatesSearch=False&memberId='
headers = {'user-agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36'}
req = urllib.request.Request(url, headers=headers)
webpage = urllib.request.urlopen(req).read()
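For reference, here is a rough sketch of the loop I have in mind. It only shows how the query URL could be built for each set of criteria and how the HTTP status could be reported; it does not get around the 403. The names `build_url` and `fetch` and the second search term are hypothetical, and the parameter names are simply copied from the URL above.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode

BASE_URL = 'https://hansard.parliament.uk/timeline/query'
HEADERS = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36'}


def build_url(search_term, start_date, end_date):
    """Build the timeline query URL (hypothetical helper).

    Parameter names are taken from the URL seen in the Network tab;
    urlencode percent-encodes the slashes in the dates.
    """
    params = {
        'searchTerm': search_term,
        'startDate': start_date,
        'endDate': end_date,
        'house': 0,
        'contributionType': '',
        'isDebatesSearch': 'False',
        'memberId': '',
    }
    return BASE_URL + '?' + urlencode(params)


def fetch(url):
    """Return the response body, or None if the server answers with an error."""
    req = urllib.request.Request(url, headers=HEADERS)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # This is where the 403 currently shows up.
        print(f'{url} -> HTTP {err.code}')
        return None


# 'housing' is only a placeholder for a second criterion.
search_terms = ['immigration', 'housing']
urls = [build_url(term, '27/04/2017', '27/04/2022') for term in search_terms]
```

Calling `fetch(u)` for each `u` in `urls` would then be the actual loop; at the moment every call fails with the 403 described above.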