Checking proxies with Python

Posted 2025-02-11 05:07:43


I want to check whether a proxy is alive using Python. To do that, my approach was to use requests like:

import requests

proxies = []

for i in proxies:
    prox = {"prox": f"http://{i}"}
    r = requests.get("http://google.es", proxies=prox, timeout=5)

    latency = r.elapsed 
    latency = int(latency.total_seconds() * 1000)
    print(r.status_code) 
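As an aside on the requests API: the `proxies` mapping is matched against the request URL's scheme, so only keys such as `"http"` and `"https"` take effect; an unrecognized key like `"prox"` is silently ignored and the request is made directly, without any proxy. A minimal sketch of a scheme-keyed check (the helper names are my own, not from the question):

```python
import requests

def build_proxies(proxy_addr):
    """Build a proxies mapping keyed by URL scheme.

    requests matches these keys against the request URL's scheme; a key
    that is not a scheme (e.g. "prox") is silently ignored, so no proxy
    would be used at all.
    """
    return {
        "http": f"http://{proxy_addr}",   # used for http:// URLs
        "https": f"http://{proxy_addr}",  # used for https:// URLs
    }

def check_proxy(proxy_addr, url="http://google.es", timeout=5):
    """Return (status_code, latency_ms), or None if the request fails."""
    try:
        r = requests.get(url, proxies=build_proxies(proxy_addr), timeout=timeout)
        return r.status_code, int(r.elapsed.total_seconds() * 1000)
    except requests.RequestException:
        return None
```

Note that `r.elapsed` measures only the time until the response headers arrived, so it is a rough latency estimate rather than the full transfer time.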

So I was happy with this solution, but then I found that a different approach gave a different result. With this requests code I obtain status_code = 200 and latencies < 100 ms for a list of 10 proxies. However, if I use ProxyChecker, which is based on pycurl:

from proxy_checker import ProxyChecker

checker = ProxyChecker()  # instantiate the checker (omitted in the original snippet)

for i in proxies:
    print(i)
    a = checker.check_proxy(i)
    print(a)

for the same list only 3 out of 10 are working. My question is: why is there this difference? What is wrong with my requests approach? Or why does requests report that it reached Google through these proxies?

EDIT
Do not use ProxyChecker version 0.6. It is outdated and does not work 50% of the time. Use this version instead: https://github.com/Scolymus/proxy-checker-python


Answer by 宁静的妩媚, 2025-02-18 05:07:43


As I was told, the reason is the following:

proxy-checker sends a custom header with a specific value (lel); requests does not send this header by default.
Here is the proxy-checker code:

def check_proxy(self, proxy):
    # Excerpt from proxy_checker: `curl` is a pycurl.Curl() created elsewhere
    # in the class, and `self.check_url` is the URL used for the test request.
    if not proxy:
        return None
    try:
        #if self.headers:
        #    curl.setopt(pycurl.HTTPHEADER, self.headers.items())
        curl.setopt(pycurl.PROXY, proxy)
        curl.setopt(pycurl.USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13')
        curl.setopt(pycurl.URL, self.check_url)
        curl.perform()

        if curl.getinfo(pycurl.HTTP_CODE) == 200:
            return True
        else:
            return False
    except pycurl.error:
        return False
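For completeness, the commented-out header lines above hint at a pitfall: `pycurl.HTTPHEADER` expects a list of `"Name: value"` strings, not the tuples produced by `dict.items()`. A rough, self-contained sketch of such a pycurl check (the function names and the timeout are my additions; the default `check_url` is the URL used in the question):

```python
import io

def format_headers(headers):
    """Convert a dict to the list of "Name: value" strings that
    pycurl.HTTPHEADER expects (dict.items() tuples are not accepted)."""
    return [f"{k}: {v}" for k, v in headers.items()]

def check_proxy_pycurl(proxy, check_url="http://google.es", headers=None):
    """Return the HTTP status code seen through `proxy`, or None on error."""
    import pycurl  # imported here so format_headers works without pycurl installed
    curl = pycurl.Curl()
    buf = io.BytesIO()
    try:
        if headers:
            curl.setopt(pycurl.HTTPHEADER, format_headers(headers))
        curl.setopt(pycurl.PROXY, proxy)
        curl.setopt(pycurl.URL, check_url)
        curl.setopt(pycurl.WRITEDATA, buf)  # collect the body instead of printing it
        curl.setopt(pycurl.TIMEOUT, 5)
        curl.perform()
        return curl.getinfo(pycurl.HTTP_CODE)
    except pycurl.error:
        return None
    finally:
        curl.close()
```

For example, `check_proxy_pycurl("1.2.3.4:8080", headers={"lel": "lel"})` would repeat the proxy-checker test with the custom header from the answer.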

You can test the same proxies with the same custom header by adding it to your headers dictionary:

    ...
    headers = {'lel': 'lel'}
    r = requests.get("http://google.es", headers=headers, proxies=prox, timeout=5)
    ...