
Is server-side user agent detection/sniffing bad?

Posted on 2024-12-28 06:06:59

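Client-side user agent detection is known to be bad practice and is discouraged in favor of feature detection. But is it also bad to react differently based on the User-Agent field of the incoming HTTP request?

An example would be sending smaller or larger images depending on whether the incoming user agent is mobile or desktop.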


樱花细雨 2025-01-04 06:06:59

I think it depends on what your motivation is. For example, in the mobile web sector what you are attempting to do is provide the user with something that looks sensible on their platform. Why be concerned about what user-agent the user is reporting, when it is purely for their own benefit? If they go to the effort of tricking you with a different user-agent, then they are the only person that suffers. The main trouble, of course, is false positives; the header is not entirely reliable.

I follow the argument that you should not rely on it as such, but mobile developers are under attack from generic broad statements like this. Yes, there are good alternatives, but across every browser you can imagine, this information can actually be useful at some point as the certainty of those alternatives begins to degrade.

What you certainly don't ever do with any plain-text header is use it to facilitate access control.

User agent detection is considered bad when there are better alternatives, but there is certainly no harm in including it in a detection process which degrades gracefully in certainty.

The issue I have with the whole process is that we are caught up in providing the user something sensible, but never seem to think it's acceptable to ask when you are uncertain. If you are uncertain about the user-agent, why not ask once and store? You can use the user-agent as a guideline.
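The "ask once and store, use the user-agent as a guideline" idea could be sketched roughly as follows. This is a minimal illustration, not a recommended detector: the `choose_view` function, the regular expressions, and the `stored_choice` argument (standing in for a cookie or session value) are all assumptions made for the example.

```python
import re

# Hypothetical hint patterns; real User-Agent strings vary far more than this.
MOBILE_HINT = re.compile(r"Mobi|Android|iPhone", re.IGNORECASE)
DESKTOP_HINT = re.compile(r"Windows NT|Macintosh|X11", re.IGNORECASE)

VALID_VIEWS = {"mobile", "desktop"}


def choose_view(user_agent, stored_choice=None):
    """Return (view, should_ask).

    A choice the user made earlier always wins. Otherwise the User-Agent
    is used only as a guideline, and when it tells us nothing we fall
    back to a default view and flag that the user should be asked once.
    """
    if stored_choice in VALID_VIEWS:
        return stored_choice, False
    ua = user_agent or ""
    if MOBILE_HINT.search(ua):
        return "mobile", False
    if DESKTOP_HINT.search(ua):
        return "desktop", False
    # Uncertain: serve a default, but ask the user and store the answer.
    return "desktop", True
```

The caller would persist the user's answer (for example in a cookie) and pass it back as `stored_choice` on later requests, so the question is asked at most once.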

So to conclude my thoughts, essentially the user-agent header is unreliable, so it is bad to rely on it. This doesn't mean you can't extract a degree of valuable information from it where more reliable options leave you in an uncertain state. In general it's wrong to conclude that it is bad. It's simply what you do with this information that makes it bad or not.

Update

After seeing your updates to the question, I have the following comments to contribute.
Do I want to be sniffing image requests and providing the client with an image based on user agent?

If this is the only variable then maybe it could work, but it's rarely the case that the only thing you are varying is the images. I don't want to detect per request because I want to serve the client a coherent solution. This means I serve them a page that causes them to request the correct resources. This page yields a single coherent solution for all of the integrated resources. All variations in this document work together for a particular view.

I respect that the chance of the user-agent string changing mid-view is so slim it doesn't seem worth worrying about. However, adopting this principle also reduces the number of times you need to perform browser/platform detection, which can only be beneficial. It also allows you to switch views on the client much more easily. If the client says, "Actually, you got the view wrong, I am a tablet, not a phone," how do you go about correcting that? You serve the user a better page; otherwise you would need to be spoofing headers for your image requests... terrible idea. Don't use the user-agent string to serve generic resources like images.
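One way to picture "serve a page that requests the correct resources" is to decide the view once and bake it into the resource URLs, so the image handler never looks at the User-Agent. The sketch below assumes a hypothetical `/images/<variant>/<name>` URL layout and made-up view names; neither comes from the answer itself.

```python
# Hypothetical mapping from a view (decided once, per page) to an image variant.
IMAGE_VARIANTS = {"mobile": "small", "tablet": "medium", "desktop": "large"}


def image_url(name, view):
    """Build an image URL whose path already encodes the chosen variant."""
    variant = IMAGE_VARIANTS.get(view, "large")
    return f"/images/{variant}/{name}"


def render_image_tags(view, image_names):
    """Render <img> tags for one coherent view of the page."""
    return "\n".join(
        f'<img src="{image_url(name, view)}" alt="">' for name in image_names
    )
```

Correcting a wrong guess then only means re-rendering the page with a different `view`; the image URLs follow automatically and no request headers need to change.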

Potential improvements

Platform identification is a very active area of modern developments in the web. As computing becomes more ubiquitous and platforms vary much more widely, our need to understand the platforms we are serving increases. I think the general solution to this problem under the current conditions is going to fall on fingerprinting and statistical analysis.

Consider this application - akinator.com - and notice how the statistical analysis from a huge set of sparse data is annoyingly accurate. In a limited environment (the set of browser configurations), you can imagine that we could ask the client's browser some questions. We then perform a statistical analysis on the responses in some n-dimensional feature space. Using the user-agent as a dimension of this space is going to be useful and self-limiting, depending on the results that you find. If it's largely inaccurate then it will show a large spread, and the amount of worth you derive from it will be self-limiting.

Of course your ability to derive any value from this statistical model requires you to be able to obtain some verified truths. This could be, for example, running a JavaScript test-suite to detect client side js capabilities, or indeed, in uncertainty, you can actually ask the user to tell you what their platform is.
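To make the "self-limiting" point concrete, here is a toy sketch that combines several signals using the odds form of Bayes' rule. It is only one possible reading of the idea: it assumes the signals are independent and that each has a single symmetric accuracy, which you would have to estimate from verified truths. The function names and the numbers are invented for illustration.

```python
def likelihood_ratio(says_mobile, accuracy):
    """Likelihood ratio for 'mobile' from one signal with a symmetric accuracy.

    accuracy must lie strictly between 0 and 1. An accuracy of 0.5 gives a
    ratio of 1.0, so an uninformative signal changes nothing.
    """
    ratio = accuracy / (1.0 - accuracy)
    return ratio if says_mobile else 1.0 / ratio


def posterior_mobile(prior, ratios):
    """Combine a prior probability with independent likelihood ratios."""
    odds = prior / (1.0 - prior)
    for ratio in ratios:
        odds *= ratio
    return odds / (1.0 + odds)
```

For example, if the user-agent says "mobile" but has been measured to be right only half the time, `posterior_mobile(0.5, [likelihood_ratio(True, 0.5)])` stays at the prior: the less accurate the dimension, the less it can move the result.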

-------
For further reading I'd refer you to this article by Mozilla

https://developer.mozilla.org/en-US/docs/Web/HTTP/Browser_detection_using_the_user_agent

"Today, looking for these strings are the only way to know that the device runs on a mobile device (resp. a tablet) before serving the HTML."


完美的未来在梦里 2025-01-04 06:06:59

It depends. Using the user agent as the sole signal to branch the logic of your server-level code is dubious at best and insecure at worst, but it works for defining the rote capabilities of particular classes of browser and serving content to match their needs when the vanilla agent is supplied.

The scenario you've sketched out is a perfect illustration of this. Attempting to detect mobile browsers and downscale the content you send to them at the server level is entirely appropriate, because you're trying to adapt the user experience to fit their needs better (for example, by providing smaller images and better content flow to fit within the constraints of a smaller screen) while balancing them with the needs of your server (sending smaller images, thus generating less load and less bandwidth over the line). This strategy just needs refinement.

There are a few design principles you should always follow here to ensure your practice of user agent detection isn't seen as dubious by your users:

Always provide the ability to view the full version of your site and plan your load profile accordingly. Otherwise, you will have people attempt to circumvent this by changing their agent.
Always clearly define the modifications of your site content when you create a modal view. This will clear up any FUD surrounding the changes you may or may not have made.
Always provide paths to the alternate versions of your site. For example, use something like http://mobile.example.org for migrating people to the mobile version, making the design-level assumption that when this path is requested, it's been explicitly asked for by your audience.
Reward users for providing their correct agent credentials to you, by offering a better experience for them in terms of content and performance. Users will be happier when you've anticipated their needs and given them snappier performance on the version of the site they're browsing.
Avoid abuse and manual redirection patterns. For example, don't block them with a big horking flyout advertisement for your mobile app when you detect they're running iOS. (Admittedly, this is a pet peeve of mine.)
Never restrict access to areas of the site on a user agent basis (opting instead to sternly warn users about what won't work if they go off the rails and drafting your support policy around it). For example, many of us fondly remember changing our agents for sites "that work best in Internet Explorer," disallowing all other browsers. You shouldn't become one more example of this bad practice if it can be avoided.
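A few of the principles above (an explicitly requested path wins, the user agent only sets a default, and nobody is ever blocked) could be sketched like this. The `plan_response` function, the hint patterns, and the warning text are hypothetical; only the `mobile.example.org` host comes from the list above.

```python
import re

# Hypothetical patterns, for illustration only.
MOBILE_HINT = re.compile(r"Mobi|Android|iPhone", re.IGNORECASE)
UNTESTED_HINT = re.compile(r"MSIE [1-8]\.", re.IGNORECASE)


def plan_response(host, user_agent):
    """Pick a site version without ever denying access."""
    ua = user_agent or ""
    if host == "mobile.example.org":
        # The audience explicitly asked for this version.
        version, explicit = "mobile", True
    elif MOBILE_HINT.search(ua):
        version, explicit = "mobile", False
    else:
        version, explicit = "full", False
    warning = None
    if UNTESTED_HINT.search(ua):
        # Warn instead of restricting access.
        warning = "Some features may not work in this browser."
    return {
        "version": version,
        "explicit": explicit,
        # Always offer a way to the other version.
        "alternate": "full" if version == "mobile" else "mobile",
        "warning": warning,
    }
```

Note that there is deliberately no "blocked" outcome: the worst case for an unrecognized or unsupported agent is a warning plus a link to the alternate version.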

In short: providing the correct user agent is a decision by the user. You can use it to define a default experience for users who choose to run their clients plain vanilla, or who don't know any better. The goal here is to reward your users for not providing a false user agent, by giving them the options they need and the experience they desire while balancing their needs with your own. Anything beyond that will cause them to balk, and as such, should be considered extremely dubious.

You can certainly try to detect the browser by other means, but this is still an area of open research. Browser capabilities and fingerprints change as they compete on features, and attempting to play catch-up to optimize performance is often, currently, intractable.

I concur with the other answer on the use of statistical analysis, so don't get me wrong here. But, as someone who actively works in this area, I can tell you there's no magic bullet that will give you perfect classification certainty. Heuristics, however, can and will help you balance load more effectively, and to that end, browser interrogation strategies can be of use to you once you've clearly defined an acceptable error rate.


橘虞初梦 2025-01-04 06:06:59

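In the "standard browser" scenario it isn't bad, but it is unreliable, because many browsers give users some configuration option/plugin/whatever to modify the user agent.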

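In that case I would implement something similar to what Facebook does: they decide based on the UA (and possibly other things, also known as "fingerprint analysis") whether to redirect to the mobile version (i.e. http://m.facebook.com/...) or not (i.e. http://www.facebook.com...). At the same time, they provide a URL parameter, m2w, to override this redirect mechanism.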
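A redirect with an override parameter along these lines might look like the sketch below. It is not Facebook's implementation: the `redirect_location` function, the `m.example.org` host, the hint pattern, and the assumption that the mere presence of `m2w` disables the redirect are all made up for illustration.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical hint pattern; treat it as a guideline, not a classifier.
MOBILE_HINT = re.compile(r"Mobi|Android|iPhone", re.IGNORECASE)


def redirect_location(url, user_agent, mobile_host="m.example.org"):
    """Return the mobile URL to redirect to, or None to serve the request as is."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    if "m2w" in query:
        # Assumed semantics: the user explicitly asked for the regular site.
        return None
    if parts.hostname == mobile_host:
        # Already on the mobile site.
        return None
    if MOBILE_HINT.search(user_agent or ""):
        return parts._replace(netloc=mobile_host).geturl()
    return None
```

In practice you would probably also remember the override (for example in a cookie) so the user does not have to carry the parameter on every link.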

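Depending on the mobile carrier, there may even be a content-aware proxy/cache in between that scales/recompresses images on the fly and shows up on your end as a "normal" browser...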

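Consider scenarios beyond browsers as well... for example, if you are serving some specific protocol (such as WebDAV), this may be the only option for implementing certain "platform-specific" behavior (e.g. differences between OS X and Windows).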
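For the WebDAV case, a server might branch on the client family roughly as below. The substrings are assumptions based on User-Agent values commonly reported for the built-in Windows and OS X WebDAV clients; check them against your own access logs before relying on them. The `webdav_platform` function is hypothetical.

```python
# Assumed identifying substrings for OS-level WebDAV clients (verify in your logs).
CLIENT_HINTS = (
    ("Microsoft-WebDAV-MiniRedir", "windows"),
    ("WebDAVFS", "osx"),
)


def webdav_platform(user_agent):
    """Guess the client platform from a WebDAV request's User-Agent header."""
    ua = user_agent or ""
    for token, platform in CLIENT_HINTS:
        if token in ua:
            return platform
    # Unknown clients should get the default, standards-following behavior.
    return "unknown"
```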