App Engine data modeling question

Posted on 2024-09-14 10:21:51

I'm fairly new to designing data models for efficient querying on
GAE, but I have extensive experience with RDBMSs.

Here's the problem:
I have roughly a million terms (strings) and need to query and compare
the associated numerical values as time series with weekly data points.
Think of it as a graph with time on the X axis and a linear Y axis
showing the numerical measures.

So far I have the discrete data points per term and day in the
datastore, and I'm looking for a way to aggregate the data by week
and store it so that I can query the datastore efficiently.
I was thinking of precalculating a number of time series of different
lengths (4 weeks, 5 weeks, 6 weeks, etc.) per term and storing each entry
as {term, start_week, [time series]}.
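
To make the record shape concrete, here is a minimal sketch in plain Python (no datastore calls) of how daily points could be rolled up into one {term, start_week, [time series]} entry. The names `week_start`, `weekly_totals` and `series_record` are hypothetical, summing the daily values is only an assumption about how a week should be aggregated, and weeks without data are left as `None`.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def weekly_totals(daily_points):
    """Sum (date, value) pairs per week (assumption: a week is the sum of its days)."""
    totals = defaultdict(float)
    for day, value in daily_points:
        totals[week_start(day)] += value
    return dict(totals)


def series_record(term, daily_points, start_week, length):
    """Build one {term, start_week, series} entry covering `length` weeks."""
    totals = weekly_totals(daily_points)
    series = [
        totals.get(start_week + timedelta(weeks=i))  # None when a week has no data
        for i in range(length)
    ]
    return {"term": term, "start_week": start_week, "series": series}


points = [
    (date(2024, 1, 1), 1.0),  # Monday
    (date(2024, 1, 3), 2.0),  # same week
    (date(2024, 1, 8), 5.0),  # following Monday
]
record = series_record("example-term", points, date(2024, 1, 1), 3)
```

Storing one such record per length (4, 5, 6 weeks and so on) repeats the same weekly values many times, which is the trade-off this idea makes for faster reads.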

With an RDBMS I could easily group by week and create the data series programmatically, either as a stored procedure or in the application back end. Due to GAE constraints and the nature of BigTable as a highly distributed system, this is not an option.

Any ideas are highly appreciated!

Comments (1)

赠我空喜 2024-09-21 10:21:51

The approach you're heading towards seems reasonable, but it all depends on the sort of queries you need to execute. Assuming you need to look up time series by name (string) and week, and you generally want to fetch between, say, 1 and 100 consecutive weeks' worth of data, I would suggest the following:

  • Have one entity for each week's worth of data for each term, as you suggest.
  • Instead of storing the data 'loose' and aggregating it periodically, store new points directly in this form. Whenever you receive a new data point, if it's the first point of the week, create a new entity. If it's not, retrieve the existing entity for that week and append your data point to it.
  • When you want to plot data, query for the term and time period you need, and fetch the results in time order.
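
The three steps above could be sketched roughly as follows. This is plain Python with a dictionary standing in for the datastore, so `add_point`, `fetch_weeks` and the entity layout are hypothetical names for illustration, not App Engine API calls. In a real datastore the read-then-append step would presumably need to run inside a transaction so that concurrent writes to the same week do not overwrite each other.

```python
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def add_point(store, term, day, value):
    """Append a point to the (term, week) entity, creating it on the first point."""
    key = (term, week_start(day))
    entity = store.get(key)
    if entity is None:
        # First point of the week: create a new entity.
        entity = {"term": term, "week": key[1], "points": []}
        store[key] = entity
    # Otherwise reuse the existing entity for that week.
    entity["points"].append((day, value))
    return entity


def fetch_weeks(store, term, first_week, last_week):
    """Return the entities for `term` in the inclusive week range, in time order."""
    matches = [
        entity
        for (stored_term, week), entity in store.items()
        if stored_term == term and first_week <= week <= last_week
    ]
    return sorted(matches, key=lambda entity: entity["week"])


store = {}  # hypothetical in-memory stand-in for the datastore
add_point(store, "example-term", date(2024, 1, 8), 5.0)
add_point(store, "example-term", date(2024, 1, 1), 1.0)
add_point(store, "example-term", date(2024, 1, 3), 2.0)
add_point(store, "other-term", date(2024, 1, 2), 9.0)

weeks = fetch_weeks(store, "example-term", date(2024, 1, 1), date(2024, 1, 8))
```

Keying each entity by term and week start means a single week can be fetched directly by key, and a range of weeks becomes one ordered query on the week field.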