Publicly available web proxy forward cache logs/data sets
I'm looking to do some analysis on HTTP requests that occur between clients and web servers. Are there any recent (from within the last 4 years) publicly available data sets of web proxy forward cache logs, such as those recorded by a Squid proxy? I'm most interested in forward cache HTTP log data, i.e. logs from a cache that sits between many clients and many servers. I have a secondary interest in reverse proxy data, such as logs from a proxy that serves HTTP responses on behalf of a single server, though a proxy log that spans many clients and many servers would be preferable.
I'm after basically as much data as I can get, and the more clients represented in the data the better. I imagine universities or large corporations might have such logs, but I haven't been able to find any that are publicly available (hence this question).
Thanks.
Comments (1)
It used to be quite common, e.g., the NLANR traces, the DEC traces, etc. However, in the last few years no one seems willing to share traces, perhaps because of privacy concerns (even with anonymisation of the client IP, cookies and URL).
See http://www.web-caching.com/traces-logs.html for some older ones.