Pageview discrepancy between BigQuery and Google Analytics 4 (GA4): why?

Posted on 2025-02-12 20:22:15

I have a large table of Google Analytics 4 (GA4) events in BigQuery for a bunch of websites I look after. The table has the following schema:

field name                    type
event_date                    date
event_timestamp               integer
event_name                    string
event_key                     string
event_string_value            string
event_int_value               integer
event_float_value             float
event_double_value            float
user_pseudo_id                string
user_first_touch_timestamp    integer
device_category               string
device_model_name             string
device_host_name              string
device_web_hostname           string
geo_country                   string
geo_city                      string
traffic_source_name           string

I query the table to get the total number of pageviews for a specific site using the following query:

with date_range as (
select
    '20220601' as start_date,
    '20220630' as end_date)

select
    count(distinct case when event_name = 'page_view' then concat(user_pseudo_id, cast(event_timestamp as string)) end) as pageviews

from
    `project_name.dataset_name.table_name`,
    date_range
where
    event_date between parse_date('%Y%m%d', date_range.start_date) and parse_date('%Y%m%d', date_range.end_date)
    and device_web_hostname in ("www.website_name.com")

What is a mystery to me is that when I do this for some sites, the page_view count differs from the figure in the GA4 interface by several hundred pageviews. The BigQuery figure is higher. What is interesting is that:

  • If I try other events, such as sessions, there are no issues
  • As stated, it is only for some sites and not all

I know enough to know:

  • These numbers are never going to agree, but they shouldn't be out by several hundred either
  • GA4 has the unprocessed data, so the way I am querying the data is different to how it is being processed in the GA4 interface

I have tried:

  • Looking at the GA4 documentation to see how pageviews are used/processed; I can't see anything that enlightens me
  • Debugging each site to make sure tags are firing correctly; they are

I've hit a bit of a wall with this and I'd be grateful if anyone has any insight to point me in another possible direction. Thanks in advance!

Comments (1)

像极了他 2025-02-19 20:22:15

The issue lies in the following part of the code:

select
    count(distinct case when event_name = 'page_view' then concat(user_pseudo_id, cast(event_timestamp as string)) end) as pageviews

You are counting distinct values of the concatenation of user_pseudo_id and event_timestamp, which is not unique. You also need to include the session ID in that key to identify a unique hit.
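
As a rough sketch of what that could look like: the schema in the question has no session_id column, so the query below assumes the table stores one row per event parameter and that the GA4 session ID is held in event_int_value on rows where event_key = 'ga_session_id'. Both points are assumptions, not something the question confirms, and the table name is the same placeholder used in the question.

```sql
-- Sketch only. Assumptions (not confirmed by the question):
--   * the table has one row per event parameter, and
--   * the session ID is in event_int_value where event_key = 'ga_session_id'.
with date_range as (
select
    '20220601' as start_date,
    '20220630' as end_date)

select
    -- Key each hit on user, session ID and timestamp; the '-' separators
    -- keep adjacent values from running together in the concatenated key.
    count(distinct concat(
        user_pseudo_id, '-',
        cast(event_int_value as string), '-',
        cast(event_timestamp as string))) as pageviews

from
    `project_name.dataset_name.table_name`,
    date_range
where
    event_date between parse_date('%Y%m%d', date_range.start_date) and parse_date('%Y%m%d', date_range.end_date)
    and device_web_hostname in ("www.website_name.com")
    and event_name = 'page_view'
    -- Only the rows carrying the session ID parameter (assumed layout)
    and event_key = 'ga_session_id'
```

If the table is laid out differently, the filter on event_key and the column holding the session ID would need to change accordingly. Rows with a missing session ID produce a NULL key and are not counted.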
