Querying a Cassandra column family for rows that have not been updated in X days

Posted 2024-08-27 23:23:25

I'm moving an existing MySQL-based application over to Cassandra. So far, finding the equivalent Cassandra data model has been quite easy, but I've stumbled on the following problem, for which I'd appreciate some input:

Consider a MySQL table holding millions of entities:

CREATE TABLE entities (
  id INT AUTO_INCREMENT NOT NULL,
  entity_information VARCHAR(...),
  entity_last_updated DATETIME,
  PRIMARY KEY (id),
  KEY (entity_last_updated)
);

Every five minutes the table is queried for entities that need to be updated:

 SELECT id FROM entities 
  WHERE entity_last_updated IS NULL 
     OR entity_last_updated < DATE_ADD(NOW(), INTERVAL -7*24 HOUR)
  ORDER BY entity_last_updated ASC;

The entities returned by this query are then updated using the following statement:

 UPDATE entities 
    SET entity_information = ?, 
        entity_last_updated = NOW()
  WHERE id = ?;
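
To make the access pattern concrete, here is a minimal sketch of one polling cycle. It is only an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, stores timestamps as text, and the names find_stale_ids, refresh, and make_demo_db are hypothetical helpers, not part of my application.

```python
import sqlite3
from datetime import datetime, timedelta

# Entities older than this are considered due for an update (7 days, as in the query above).
STALE_AFTER = timedelta(days=7)


def _fmt(ts):
    # Fixed-width text timestamps compare correctly as strings.
    return ts.isoformat(sep=" ", timespec="seconds")


def find_stale_ids(conn, now):
    """Return ids that were never updated or were last updated more than 7 days ago."""
    cutoff = _fmt(now - STALE_AFTER)
    rows = conn.execute(
        "SELECT id FROM entities "
        "WHERE entity_last_updated IS NULL OR entity_last_updated < ? "
        "ORDER BY entity_last_updated ASC",
        (cutoff,),
    )
    return [row[0] for row in rows]


def refresh(conn, entity_id, information, now):
    """Store new information and stamp the entity as updated at `now`."""
    conn.execute(
        "UPDATE entities SET entity_information = ?, entity_last_updated = ? "
        "WHERE id = ?",
        (information, _fmt(now), entity_id),
    )


def make_demo_db(now):
    """Assumption: an in-memory SQLite table that mirrors the MySQL schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entities ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " entity_information TEXT,"
        " entity_last_updated TEXT)"
    )
    conn.execute("CREATE INDEX idx_last_updated ON entities (entity_last_updated)")
    conn.executemany(
        "INSERT INTO entities (entity_information, entity_last_updated) VALUES (?, ?)",
        [
            ("never updated", None),
            ("stale", _fmt(now - timedelta(days=8))),
            ("fresh", _fmt(now - timedelta(days=1))),
        ],
    )
    return conn
```

The point of the sketch is that the read side is a range scan over the last-updated timestamp, and every write moves the entity to the far end of that range. That combination is what I need the Cassandra model to support.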

What would be the corresponding Cassandra data model that lets me store the given information and efficiently query for entities that need to be updated (that is, entities that have not been updated in the last seven days)?
