Google Analytics API - Does the choice of metric affect which dimension values are returned?
Good morning. I've seen this behavior in the Google Analytics API, which as a SQL guy I find bizarre. I'd like to get a list of all values for adContent, so I query ga:adContent and (because I must also select a metric, for no well-defined reason) ga:organicSearches. It's in the same group (Campaign), so maybe it'll perform better back on the server.
I get one row: adContent is "(not set)", organicSearches is 516,674. Huh, I guess adContent isn't being used. But the marketing department swears that it is, and produces some convincing screenshots.
Later on, I arbitrarily change the metric to ga:transactions. In the universe I woke up in, this should have absolutely no impact on anything, except the actual value returned in that column. Instead, I get zillions of rows, with plausible values for ga:adContent. The value for ga:transactions is sometimes zero, so it's not the case that GA was filtering for "metric > 0".
There are no filters in my query. I did not change the date range between these two variants. Can anyone tell me what's going on? I expect the above queries to translate to something like this, which should return exactly the same number of rows:
SELECT adContent, SUM(organicSearches)
FROM Campaign
WHERE Date BETWEEN X AND Y
GROUP BY adContent

SELECT adContent, SUM(transactions)
FROM Campaign INNER JOIN ECommerce ON <something>
WHERE Date BETWEEN X AND Y
GROUP BY adContent
I realize that GA probably isn't using an ordinary RDBMS on the back end, but surely 1 + 1 still equals 2 in any database!
Comments (1)
By definition, ga:organicSearches will almost never have any matches for ga:adContent (edge cases aside). ga:adContent is for the content of an advertisement, whereas ga:organicSearches is for organic search result visits within a session (like if you use Google multiple times within the same session to try to find something specific on a site). Don't use it for anything besides trying to measure that particular phenomenon.
Try not to use an SQL mindset here; Google Analytics doesn't use SQL on the backend, so the notions you have of traditional relationships aren't applicable. IIRC, they use a few things, amongst them a BigTable variant, which is a NoSQL-type database.
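To make the "no SQL join" point concrete, here is a toy model in Python. It is purely an illustrative assumption, not a description of how Google Analytics is actually implemented: each hypothetical session stores only the metrics that apply to it, and a report skips sessions that never recorded the requested metric. The session data, the `SESSIONS` name, and the `report` helper are all made up for this sketch.

```python
from collections import defaultdict

# Hypothetical sessions (invented data). Each one stores only the metrics
# that apply to it; ad-driven sessions carry no organicSearches entry at all.
SESSIONS = [
    {"adContent": "(not set)", "metrics": {"organicSearches": 3, "transactions": 0}},
    {"adContent": "(not set)", "metrics": {"organicSearches": 1, "transactions": 1}},
    {"adContent": "banner_a", "metrics": {"transactions": 2}},
    {"adContent": "banner_b", "metrics": {"transactions": 0}},
]


def report(sessions, dimension, metric):
    """Sum `metric` per `dimension` value, skipping sessions that never recorded it."""
    totals = defaultdict(int)
    for session in sessions:
        if metric in session["metrics"]:
            totals[session[dimension]] += session["metrics"][metric]
    return dict(totals)


organic = report(SESSIONS, "adContent", "organicSearches")
transactions = report(SESSIONS, "adContent", "transactions")
```

Under these assumptions, `organic` contains a single "(not set)" row, while `transactions` contains one row per ad content value, including a row whose total is zero. That mirrors the behavior described in the question: a zero is still a recorded value, whereas a metric that was never recorded for a dimension value produces no row.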
From a Google Paper on BigTable from 2006:
If you want the lowest common denominator for a metric for a list of all dimensions, use ga:pageviews.
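As a rough sketch of that suggestion, the snippet below builds a request URL for the legacy Core Reporting API v3 that pairs ga:adContent with ga:pageviews. The view ID and dates are placeholders, `build_query_url` is a hypothetical helper name, and authentication (an OAuth 2.0 access token) is deliberately left out, so treat this as a starting point rather than a complete client.

```python
from urllib.parse import urlencode

# Core Reporting API v3 query endpoint.
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_query_url(view_id, start_date, end_date,
                    dimension="ga:adContent", metric="ga:pageviews"):
    """Return a v3 query URL listing `dimension` values with a broadly recorded metric."""
    params = {
        "ids": view_id,            # e.g. "ga:12345678" (placeholder view ID)
        "start-date": start_date,  # YYYY-MM-DD
        "end-date": end_date,      # YYYY-MM-DD
        "dimensions": dimension,
        "metrics": metric,
    }
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("ga:12345678", "2014-01-01", "2014-01-31")
```

Swapping `metric` for "ga:organicSearches" or "ga:transactions" reproduces the two variants from the question, which makes it easy to compare the row counts side by side.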