Determining another site's traffic measurements?
I have a conceptual question.
I am wondering how companies such as Alexa Internet determine the overall traffic of a given site (not my own), as well as the traffic for each unique page. I would appreciate a technical response: if you were to design this feature (I am sure it is complicated, but hypothetically...), how would you go about it?
Thanks in advance.
Comments (2)
One way is to be hooked into one or more core routers. From there you could perform deep packet inspection to see where traffic is going, which pages are visited, and so on.
Another way is to have people install a browser toolbar that records where they go and submits that information back to you. I think this is how Alexa works.
A third way is to have website owners install a bit of JavaScript that performs analytics and submits the data back to you. This is how Google does it.
A fourth way is to buy the data from companies that do one of the above.
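To make the toolbar approach a bit more concrete, here is a minimal sketch of the server side, assuming the toolbar reports each page load as a (user_id, url) pair. The record format and the function name aggregate_visits are made up for illustration; this is not how Alexa or anyone else is known to implement it.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def aggregate_visits(records):
    """Aggregate hypothetical (user_id, url) reports from a toolbar.

    Returns (unique_visitors_by_site, page_views_by_page).
    """
    visitors = defaultdict(set)  # host -> set of user ids seen on that host
    page_views = Counter()       # (host, path) -> number of reported views
    for user_id, url in records:
        parts = urlsplit(url)
        host = parts.netloc.lower()
        if not host:
            continue  # skip reports that are not absolute URLs
        path = parts.path or "/"
        visitors[host].add(user_id)
        page_views[(host, path)] += 1
    unique_visitors = {host: len(users) for host, users in visitors.items()}
    return unique_visitors, page_views


# Made-up sample reports from three panel users.
reports = [
    ("u1", "http://example.com/"),
    ("u1", "http://example.com/about"),
    ("u2", "http://example.com/about"),
    ("u3", "http://example.org/"),
]
unique_visitors, page_views = aggregate_visits(reports)
```

Here unique_visitors gives the per-site figure and page_views the per-page figure for the panel only; scaling those up to the whole Internet is a separate estimation step.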
Alexa estimates website traffic by extrapolating from the browsing sessions of the subset of the Internet population that uses the Alexa toolbar or browser extensions. This isn't a truly random sample, so the accuracy of such data has been questioned:
http://en.wikipedia.org/wiki/Alexa_Internet#Accuracy_of_ranking_by_the_Alexa_Toolbar
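As a rough illustration of what extrapolating from a panel means, and why a non-random sample matters, here is a small sketch. The function names, the grouping into strata, and all the numbers are assumptions for the example; Alexa's actual method is not public in this detail.

```python
def extrapolate_visitors(panel_visitors, panel_size, population):
    """Scale a panel count up to the full population (naive estimate)."""
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    return panel_visitors * population / panel_size


def extrapolate_weighted(strata):
    """Sum per-group estimates.

    strata is a list of (panel_visitors, panel_size, population) tuples,
    one per user group, so over-represented groups do not dominate.
    """
    return sum(extrapolate_visitors(v, n, p) for v, n, p in strata)


# Made-up numbers: 50 of 10,000 panel users visited the site,
# out of an assumed population of 1,000,000.
naive = extrapolate_visitors(50, 10_000, 1_000_000)

# Same panel split into two groups. The first group makes up 80% of the
# panel but only 40% of the assumed population.
weighted = extrapolate_weighted([
    (45, 8_000, 400_000),
    (5, 2_000, 600_000),
])
```

With these invented figures the naive estimate comes out higher than the weighted one, because the group that installs the toolbar more often also visits the site more often. That is the kind of skew the accuracy concerns are about.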
Installing the Alexa toolbar modifies the browser's user-agent string, so you can estimate the percentage of visitors to your site who contribute data to Alexa by scanning your server logs for requests with the corresponding user-agent strings.
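A minimal sketch of that log scan, assuming your server writes Apache/nginx "combined" format logs (user-agent as the last quoted field, client address as the first field). The marker string "Alexa Toolbar" is an assumption; check your own logs for the exact token the toolbar adds. Distinct client addresses are used here as a rough stand-in for visitors.

```python
import re

# The last quoted field of a "combined" format log line is the User-Agent.
_UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def toolbar_visitor_share(log_lines, marker="Alexa Toolbar"):
    """Fraction of distinct client addresses whose user-agent contains marker."""
    all_clients = set()
    toolbar_clients = set()
    needle = marker.lower()
    for line in log_lines:
        match = _UA_FIELD.search(line)
        if match is None:
            continue  # line does not end with a quoted field; skip it
        client = line.split(" ", 1)[0]
        all_clients.add(client)
        if needle in match.group(1).lower():
            toolbar_clients.add(client)
    if not all_clients:
        return 0.0
    return len(toolbar_clients) / len(all_clients)


# Made-up log lines: two requests from one toolbar user, one from another client.
sample_log = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Alexa Toolbar)"',
    '203.0.113.5 - - [10/Oct/2010:13:55:40 +0000] "GET /about HTTP/1.1" 200 512 "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Alexa Toolbar)"',
    '203.0.113.9 - - [10/Oct/2010:13:56:01 +0000] "GET / HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]
share = toolbar_visitor_share(sample_log)
```

In practice you would pass an open log file instead of a list, for example toolbar_visitor_share(open("access.log")). Counting by client address undercounts users behind shared addresses, so treat the result as a rough estimate.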