What is the best way for a web metrics application to calculate the time a visitor spends on a site?

Posted 2024-11-19 17:49:58 · 287 words · 6 views · 0 comments

I am developing an internal web analytics system like Google Analytics, and I am not very clear about the concept of page stay time. The typical explanation of this measure on the web is:

  1. user accessed page A at timestamp: t1
  2. user accessed page B at timestamp: t2, (t2 > t1)

Then the page stay time for A is t2 - t1, and for B it is 0.

My question is: in this scenario, when calculating the page stay time for B, do we need to check whether the user reached page B by clicking from page A, i.e., whether B's referrer is A?

Comments (1)

南风几经秋 2024-11-26 17:49:58

There are two techniques to measure Time on Page, and its aggregated counterpart Time on Site, distinguished by the markers used to record time-event pairs:

  • timestamp

  • ping-based

Google Analytics, for instance, uses the former: GA records a timestamp for each pageview, event, and transaction that occurs in the user's session.

So, exactly as you indicated in your question, Google Analytics calculates Time on Site by summing the timestamp deltas across that user's entire session history. Nothing is recorded after the last page in the user's session, so there is no later timestamp to subtract from and the final time delta is not calculated.
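The delta calculation described above can be sketched as follows. This is a minimal illustration, not Google Analytics' actual implementation; the function names, the `(session_id, page, timestamp)` tuple layout, and the sample hits are all assumptions made for the example. The sketch orders hits by timestamp within a session and does not consult the referrer.

```python
from collections import defaultdict


def time_on_page(hits):
    """hits: iterable of (session_id, page, timestamp_seconds) tuples (hypothetical layout).

    Returns {session_id: [(page, seconds_or_None), ...]} in visit order.
    """
    by_session = defaultdict(list)
    for session_id, page, ts in hits:
        by_session[session_id].append((ts, page))

    result = {}
    for session_id, views in by_session.items():
        # Order by timestamp only; this sketch never looks at the referrer.
        views.sort(key=lambda v: v[0])
        durations = []
        for (ts, page), (next_ts, _) in zip(views, views[1:]):
            durations.append((page, next_ts - ts))
        # The last page has no following hit, so its duration is unknown.
        durations.append((views[-1][1], None))
        result[session_id] = durations
    return result


def time_on_site(durations):
    """Sum the known deltas; the exit page contributes nothing."""
    return sum(d for _, d in durations if d is not None)


hits = [
    ("s1", "/a", 100),
    ("s1", "/b", 160),
    ("s1", "/c", 190),
    ("s2", "/a", 500),  # single-page session: no delta at all
]
per_session = time_on_page(hits)
```

In this sample, session `s1` is credited with the time on `/a` and `/b` only, and the single-page session `s2` is credited with nothing, which is the undercount the timestamp approach accepts.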

This introduces error into the Time on Site metric, but I still think it's the best available choice of measurement technique. The technique is simple to explain, and therefore it is simple to understand precisely where the error occurs and in which direction it influences the reported metric. In other words, you know that Time on Site is always undercounted.

Second, this error can be estimated (i.e., estimate the true Time on Site) because you have reliable Time on Page for every other page in the user's visit. Even better, from your population of site visitors, you have data on the mean Time on Page for the particular page that the user visited last in their session.
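One way to apply this correction is sketched below, under the assumption that the mean Time on Page measured when a page was not the exit page is a fair stand-in for the time spent on it as an exit page (that may not hold in practice). The function names and sample figures are hypothetical.

```python
from statistics import mean


def mean_time_by_page(observed):
    """observed: iterable of (page, seconds) for pageviews with a known delta."""
    samples = {}
    for page, seconds in observed:
        samples.setdefault(page, []).append(seconds)
    return {page: mean(values) for page, values in samples.items()}


def estimated_time_on_site(measured_seconds, exit_page, page_means):
    """Add the population mean for the exit page to the measured total.

    Falls back to adding 0 when the exit page has never been measured.
    """
    return measured_seconds + page_means.get(exit_page, 0)


# Known deltas gathered from the whole visitor population (made-up numbers).
observed = [("/a", 60), ("/a", 40), ("/b", 30), ("/c", 20), ("/c", 40)]
page_means = mean_time_by_page(observed)

# A session with 90 measured seconds that ended on "/c".
estimate = estimated_time_on_site(90, "/c", page_means)
```

The estimate is only as good as the assumption behind it, so it seems safer to report it alongside the raw, undercounted figure rather than in place of it.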

The other group of techniques for measuring Time on Page is ping-based. Here, JavaScript in the page repeatedly calls, at a predetermined time interval, a function that pings the server. The JavaScript snippet on the page keeps calling this pinging function for as long as the page is open in the client browser.

Perhaps the key advantage of these techniques is that they address the problem of not counting the time the user spent on the page on which they ended their session. I suppose the primary disadvantage of ping-based techniques is a higher implementation cost. The accuracy of this technique depends, of course, on how often the page pings: the average measurement error is roughly half the ping interval. If your ping interval is 10 seconds, you can resolve Time on Page to within 5 seconds on average. But any server activity has an associated resource cost, so this parameter (the ping interval) needs to be optimized with care. That's what I mean by "higher implementation cost".
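The trade-off between resolution and server load can be checked with a small simulation. It assumes a ping fires at every multiple of the interval while the page is open and that the server credits only the time covered by the pings it received; both function names are hypothetical.

```python
from statistics import mean


def pings_received(true_seconds, interval_seconds):
    """Pings the server has seen if one fires every interval while the page is open."""
    return int(true_seconds // interval_seconds)


def measured_time(ping_count, interval_seconds):
    """Time on page credited from pings alone (a lower bound on the true time)."""
    return ping_count * interval_seconds


interval = 10
errors = [
    t - measured_time(pings_received(t, interval), interval)
    for t in range(0, 600)  # true dwell times from 0 to 599 seconds
]
mean_error = mean(errors)          # average undercount, close to interval / 2
max_error = max(errors)            # always less than one full interval
pings_per_hour = 3600 // interval  # server requests for one page left open for an hour
```

Halving the interval halves the average undercount but doubles the number of requests per open page, which is the cost the paragraph above refers to.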

A recent blog post by Brian Cray discusses such a solution and provides a javascript snippet for this purpose. In addition, Episodes is a javascript library for accurate measurement of javascript (rather than DOM) events. This might be of use to your analytics project.

So which of these two techniques is better? I suspect a clever combination of the two would give you the highest resolution with the lowest page weight and server load. The only analytics app I am aware of that implements such a hybrid system is W3Counter. [Note: I have no affiliation or agreement of any kind with this project.]

I have not used W3Counter, but based on this feature alone, I believe it's worth consideration. (I do not, however, like the name "W3Counter", which makes me think it's a validation checker.)
