Is there built-in support for "as-of" (a.k.a. "date-effective") data retrieval in Cassandra?
I want to define some data in a row, which is master/detail in nature. For illustration purposes, let's say it's a football league 'league-A' with a collection of teams. Teams can be added or removed over time, or their attributes may change, or the league's attributes may change.
In HBase I can define a table structure like this:
Column Family: League
    Column: Name
    Column: Sponsor
Column Family: Team
    Column: team-A
    Column: team-B etc
Now if I add the following data (sorry for the non-standard notation, and I've simplified the timestamps to show just the date part):
Key    Timestamp    League { Name      Sponsor }
LG-A   2011-01-01            League-A  Big-Co

Key    Timestamp    Team
LG-A   2011-01-01   "The Blues": blues-data
LG-A   2011-01-01   "The Greens": greens-data
LG-A   2011-03-01   "The Reds": reds-data
LG-A   2011-03-10   "The Greens": greens-data2
I want to query the data for LG-A specifying time 2011-03-10 and get the result:
Key    Timestamp    Team
LG-A   2011-01-01   "The Blues": blues-data
LG-A   2011-03-01   "The Reds": reds-data
LG-A   2011-03-10   "The Greens": greens-data2
Similarly, when I query for LG-A specifying time 2011-02-01, I get the result:
Key    Timestamp    Team
LG-A   2011-01-01   "The Blues": blues-data
LG-A   2011-01-01   "The Greens": greens-data
This is done in HBase by setting the timestamp when putting the data, and then setting the time range for the Get operation.
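To make the semantics I'm after concrete, here is a minimal sketch in plain Python. It does not use the HBase or Cassandra APIs; the `team_versions` dictionary and the `as_of` function are hypothetical names that only model "for each column, return the latest version written at or before the given time".

```python
from datetime import date

# Hypothetical in-memory model of the Team column family for row LG-A:
# column name -> list of (timestamp, value) versions.
team_versions = {
    "The Blues": [(date(2011, 1, 1), "blues-data")],
    "The Greens": [
        (date(2011, 1, 1), "greens-data"),
        (date(2011, 3, 10), "greens-data2"),
    ],
    "The Reds": [(date(2011, 3, 1), "reds-data")],
}


def as_of(versions_by_column, when):
    """Return {column: (timestamp, value)} with the newest version at or before `when`."""
    result = {}
    for column, versions in versions_by_column.items():
        # Keep only versions that were already written at the requested time.
        eligible = [(ts, value) for ts, value in versions if ts <= when]
        if eligible:
            # The version with the greatest timestamp is the one in effect.
            result[column] = max(eligible, key=lambda version: version[0])
    return result
```

Calling `as_of(team_versions, date(2011, 2, 1))` yields one entry each for "The Blues" and "The Greens" and nothing for "The Reds", which matches the second result table above.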
Can this be done in Cassandra easily? So far I have only come across suggestions to write column names using timestamp values to store time-series data, but this does not seem to give the features illustrated above (e.g. I only want one entry for "The Greens" in my result, the one that's effective at the timestamp specified).
Comments (1)
I'm reasonably sure this results in a full sequential scan.
You can manually do a sequential scan in Cassandra, or you can create a column with your timestamp in it and index that. See http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes for an example (with non-timestamp data, but the principle is the same).
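Whichever way the candidate rows are fetched, the client still has to keep only the newest entry per team. A rough sketch of that step in plain Python follows. It makes no Cassandra driver calls: the `records` layout (one record per version, carrying its own `effective` timestamp column) and the `scan_as_of` function are assumptions for illustration, and the loop merely stands in for a sequential scan or an index lookup.

```python
from datetime import date

# Hypothetical flat layout: one record per version, with the timestamp
# stored as an ordinary column ("effective") that could be indexed.
records = [
    {"league": "LG-A", "team": "The Blues", "effective": date(2011, 1, 1), "data": "blues-data"},
    {"league": "LG-A", "team": "The Greens", "effective": date(2011, 1, 1), "data": "greens-data"},
    {"league": "LG-A", "team": "The Reds", "effective": date(2011, 3, 1), "data": "reds-data"},
    {"league": "LG-A", "team": "The Greens", "effective": date(2011, 3, 10), "data": "greens-data2"},
]


def scan_as_of(rows, league, when):
    """Scan all rows and keep, per team, the newest one effective at or before `when`."""
    latest = {}
    for row in rows:  # stands in for a sequential scan or an index lookup
        if row["league"] != league or row["effective"] > when:
            continue  # wrong league, or not yet effective at `when`
        current = latest.get(row["team"])
        if current is None or row["effective"] > current["effective"]:
            latest[row["team"]] = row
    # Order by effective date, as in the result tables in the question.
    return sorted(latest.values(), key=lambda row: row["effective"])
```

With the question's data, `scan_as_of(records, "LG-A", date(2011, 3, 10))` keeps a single "The Greens" record, the one dated 2011-03-10.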