Referer missing in HTTP header of Selenium request

Posted on 2025-01-22 19:18:22, 2123 characters, 0 views, 0 comments

I'm writing some tests with Selenium and noticed that the Referer header is missing from the requests. I wrote the following minimal example to test this with https://httpbin.org/headers:

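import selenium.webdriver
import selenium.webdriver.support.ui  # needed for WebDriverWait below

# Run Firefox without a visible window.
options = selenium.webdriver.FirefoxOptions()
options.add_argument('--headless')

# Disable the built-in JSON viewer so the response is rendered as plain text.
profile = selenium.webdriver.FirefoxProfile()
profile.set_preference('devtools.jsonview.enabled', False)

driver = selenium.webdriver.Firefox(firefox_options=options, firefox_profile=profile)
wait = selenium.webdriver.support.ui.WebDriverWait(driver, 10)

# Load a first page so that the next navigation has a referrer.
driver.get('http://www.python.org')
assert 'Python' in driver.title

# Navigate from script, wait for the new URL, then dump the echoed headers.
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
wait.until(lambda driver: driver.current_url == url)
print(driver.page_source)

driver.close()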

Which prints:

<html><head><link rel="alternate stylesheet" type="text/css" href="resource://content-accessible/plaintext.css" title="Wrap Long Lines"></head><body><pre>{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", 
    "Accept-Encoding": "gzip, deflate, br", 
    "Accept-Language": "en-US,en;q=0.5", 
    "Connection": "close", 
    "Host": "httpbin.org", 
    "Upgrade-Insecure-Requests": "1", 
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
  }
}
</pre></body></html>

So there is no Referer. However, if I browse to any page and manually execute

window.location.href = "https://httpbin.org/headers"

in the Firefox console, Referer does appear as expected.


As pointed out in the comments below, when using

driver.get("javascript: window.location.href = '{}'".format(url))

instead of

driver.execute_script("window.location.href = '{}';".format(url))

the request does include Referer. Also, when using Chrome instead of Firefox, both methods include Referer.

So the main question still stands: Why is Referer missing in the request when sent with Firefox as described above?

Comments (1)

苏佲洛 2025-01-29 19:18:22

Referer as per the MDN documentation

The Referer request header contains the address of the previous web page from which a link to the currently requested page was followed. The Referer header allows servers to identify where people are visiting them from and may use that data for analytics, logging, or optimized caching, for example.

Important: Although this header has many innocent uses it can have undesirable consequences for user security and privacy.

Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer


However:

A Referer header is not sent by browsers if:

  • The referring resource is a local "file" or "data" URI.
  • An unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).

Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
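The two conditions quoted above can be written down as a small check. This is only a sketch of those two rules, not of everything a browser does (Referrer-Policy, for example, is ignored here), and the function name `referer_expected` is made up for illustration:

```python
from urllib.parse import urlsplit


def referer_expected(referring_url: str, target_url: str) -> bool:
    """Return False when one of the two quoted MDN rules suppresses Referer."""
    referrer = urlsplit(referring_url)
    target = urlsplit(target_url)

    # Rule 1: the referring resource is a local "file" or "data" URI.
    if referrer.scheme in ("file", "data"):
        return False

    # Rule 2: the referring page was received over HTTPS,
    # but the new request goes out over unsecured HTTP.
    if referrer.scheme == "https" and target.scheme == "http":
        return False

    return True


# The navigation from the question: HTTPS page to HTTPS page.
print(referer_expected("https://www.python.org/", "https://httpbin.org/headers"))
# A downgrade from HTTPS to HTTP.
print(referer_expected("https://www.python.org/", "http://httpbin.org/headers"))
# A local file as the referring resource.
print(referer_expected("file:///tmp/index.html", "https://httpbin.org/headers"))
```

For the navigation in the question (from https://www.python.org/ to https://httpbin.org/headers), neither rule applies, so these two conditions alone do not explain the missing header.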


Privacy and security concerns

There are some privacy and security risks associated with the Referer HTTP header:

The Referer header contains the address of the previous web page from which a link to the currently requested page was followed, which can be further used for analytics, logging, or optimized caching.

Source: https://developer.mozilla.org/en-US/docs/Web/Security/Referer_header:_privacy_and_security_concerns#The_referrer_problem


Addressing the security concerns

From the Referer header perspective, the majority of the security risks can be mitigated with the following steps:

  • Referrer-Policy: Using the Referrer-Policy header on your server to control what information is sent through the Referer header. Again, a directive of no-referrer would omit the Referer header entirely.
  • The referrerpolicy attribute on HTML elements that are in danger of leaking such information (such as <img> and <a>). This can for example be set to no-referrer to stop the Referer header being sent altogether.
  • The rel attribute set to noreferrer on HTML elements that are in danger of leaking such information (such as <img> and <a>).
  • The Exit Page Redirect technique: The only method that should currently work without flaw is to have an exit page that you don't mind appearing in the Referer header. Many websites implement this method, including Google and Facebook. If implemented correctly, the referrer data shows only the website that the user came from instead of private information. Instead of appearing as http://example.com/user/foobar, the new referrer data will appear as http://example.com/exit?url=http%3A%2F%2Fexample.com. The method works by having all external links on your website go to an intermediary page that then redirects to the final page. To link to the website example.com, URL-encode the full URL and add it to the url parameter of the exit page.
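The URL encoding step of the exit page technique might look roughly like this in Python. The endpoint `http://example.com/exit` is taken from the example URL above; the helper names `exit_link` and `destination_from_exit_link` are hypothetical:

```python
from urllib.parse import parse_qs, quote, urlsplit

# Exit-page endpoint from the example above; adjust for a real site.
EXIT_PAGE = "http://example.com/exit"


def exit_link(destination: str) -> str:
    """Wrap an external URL so the link points at the exit page instead."""
    # safe="" makes quote() percent-encode ":" and "/" as well.
    return f"{EXIT_PAGE}?url={quote(destination, safe='')}"


def destination_from_exit_link(link: str) -> str:
    """Recover the original destination, as the exit page would before redirecting."""
    return parse_qs(urlsplit(link).query)["url"][0]


link = exit_link("http://example.com")
print(link)  # http://example.com/exit?url=http%3A%2F%2Fexample.com
print(destination_from_exit_link(link))  # http://example.com
```

A real exit page would also need to validate the `url` parameter before redirecting, since redirecting to arbitrary destinations creates an open redirect.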

Sources:


This use case

I have executed your code with both the GeckoDriver/Firefox and the ChromeDriver/Chrome combination:

Code Block:

from selenium.webdriver.support.ui import WebDriverWait

# driver is created as in the question (Firefox) or as a Chrome driver.
driver.get('http://www.python.org')
assert 'Python' in driver.title

url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
WebDriverWait(driver, 10).until(lambda driver: driver.current_url == url)
print(driver.page_source)

Observation:

  • Using GeckoDriver/Firefox, the Referer: "https://www.python.org/" header was missing, as follows:

        {
          "headers": {
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", 
            "Accept-Encoding": "gzip, deflate, br", 
            "Accept-Language": "en-US,en;q=0.5", 
            "Host": "httpbin.org", 
            "Upgrade-Insecure-Requests": "1", 
            "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
          }
        }
    
  • Using ChromeDriver/Chrome, the Referer: "https://www.python.org/" header was present, as follows:

        {
          "headers": {
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", 
            "Accept-Encoding": "gzip, deflate, br", 
            "Accept-Language": "en-US,en;q=0.9", 
            "Host": "httpbin.org", 
            "Referer": "https://www.python.org/", 
            "Upgrade-Insecure-Requests": "1", 
            "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36"
          }
        }
    

Conclusion:

It seems to be an issue with GeckoDriver/Firefox in handling the Referer header.
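Instead of comparing the printed output by eye, the check could be scripted. The following is a minimal sketch that assumes the page source wraps the httpbin JSON in a `<pre>` element, as in the outputs shown above; the helper names `httpbin_headers` and `has_referer` are made up for this example:

```python
import html
import json
import re


def httpbin_headers(page_source: str) -> dict:
    """Extract the "headers" object from a rendered httpbin.org/headers page."""
    # The browser wraps the JSON body in <pre>...</pre>; fall back to the
    # raw text if no such element is found.
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, re.DOTALL)
    body = match.group(1) if match else page_source
    # page_source is serialized HTML, so undo entity escaping such as &amp;.
    return json.loads(html.unescape(body))["headers"]


def has_referer(page_source: str) -> bool:
    """Return True if the echoed request headers contain Referer."""
    return any(name.lower() == "referer" for name in httpbin_headers(page_source))


without_referer = '<html><body><pre>{"headers": {"Host": "httpbin.org"}}</pre></body></html>'
with_referer = (
    '<html><body><pre>{"headers": {"Host": "httpbin.org", '
    '"Referer": "https://www.python.org/"}}</pre></body></html>'
)
print(has_referer(without_referer))  # False
print(has_referer(with_referer))     # True
```

In a Selenium test, `has_referer(driver.page_source)` could then be asserted after the wait on `driver.current_url`.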


Outro

Referrer Policy
