Requests: proxies not working in the people_also_ask module

Posted on 2025-01-29 00:31:56


I am scraping search results from Google using the people_also_ask module. The module itself doesn't have a way to use proxies, so I added them manually inside the module. When Google blocked me, I printed the response status, and it said that my own IP address was banned from sending requests. The code I added to the people_also_ask module to use proxies is:

    proxies = {
        'http': "http://username:password@ip:port"
    }
    response = SESSION.get(URL, params=params, headers=HEADERS, proxies=proxies)

I know it is an illegal activity, but I mainly want to understand why this happens, for educational purposes. I think the code that extracts the data is irrelevant, so I am only including the simple code that sends requests through the people_also_ask module:

import people_also_ask as paa

queries = ["how to boil eggs", "how to make cake", "price of poco f1", "price of wooden table", "best soap in us", "how much tesla worth"]
for query in queries:
    questions = paa.get_related_questions(query, 40)

Note: The changes were made in the first function, search(), in google.py of the people_also_ask module.

Note: I can search from my browser without any problem. Why does Google let me search from the browser but block the script?


撩心不撩汉 2025-02-05 00:31:56


The answer is quite simple. Although it is a proxy service, it doesn't guarantee 100% anonymity. When you send the HTTP GET request via the proxy server, the request sent by your program to the proxy server is:

GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0

Now, when the proxy server sends this request to the actual destination, it sends:

GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Via: 1.1 naxserver (squid/3.1.8)
X-Forwarded-For: 122.126.64.43
Cache-Control: max-age=18000
Connection: keep-alive

As you can see, the proxy puts your IP (in my case, 122.126.64.43) in the X-Forwarded-For HTTP header, so the website knows that the request was sent on behalf of 122.126.64.43.
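To make this concrete, here is a minimal sketch of how a destination server could read the original address out of the forwarded request. The function name forwarded_client_ips and the received dictionary are hypothetical and only mirror the headers shown above; a real server would read them from its own request object.

```python
def forwarded_client_ips(headers):
    """Return the addresses listed in X-Forwarded-For, left to right."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-forwarded-for", "")
    # The header may hold a comma-separated chain: client, proxy1, proxy2, ...
    return [part.strip() for part in raw.split(",") if part.strip()]


# Headers as the destination would receive them in the example above
# (illustrative copy, not captured from a live request).
received = {
    "Host": "www.whatsmybrowser.org",
    "User-Agent": "python-requests/2.10.0",
    "Via": "1.1 naxserver (squid/3.1.8)",
    "X-Forwarded-For": "122.126.64.43",
}

print(forwarded_client_ips(received))  # ['122.126.64.43']
```

If the list is not empty, the proxy is not hiding the client, whatever address the TCP connection itself comes from.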

Read more about this header at: https://www.rfc-editor.org/rfc/rfc7239

If you want to host your own squid proxy server and want to disable setting X-Forwarded-For header, read: http://www.squid-cache.org/Doc/config/forwarded_for/

I don't deserve any credit for this answer; I copied it from the following post I found: Python Requests module - proxy not working
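One more thing that may be worth checking, separate from the header issue: requests picks a proxy by the scheme of the URL being fetched, and the proxies dictionary in the question only has an 'http' key. If the URL used in google.py is an https:// address (an assumption, check the module), that request would not go through the proxy at all. A minimal sketch, where build_proxies is a hypothetical helper and the host and port are placeholder values:

```python
from urllib.parse import quote


def build_proxies(username, password, host, port):
    """Build a requests-style proxies mapping that covers both URL schemes."""
    # Percent-encode the credentials so characters such as '@' or ':' in a
    # password do not break the proxy URL.
    credentials = f"{quote(username, safe='')}:{quote(password, safe='')}"
    proxy_url = f"http://{credentials}@{host}:{port}"
    # requests looks up the proxy by the scheme of the target URL,
    # so an https:// target needs an "https" entry as well.
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values for illustration only.
proxies = build_proxies("username", "password", "203.0.113.10", 3128)
print(proxies["https"])  # http://username:password@203.0.113.10:3128
```

The resulting dictionary can be passed as proxies=proxies to the SESSION.get(...) call from the question. Whether the proxy then hides the client still depends on the headers it adds, as described above.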
