Cassandra design for product ID management

Posted on 2024-11-04 07:00:39

I'm new to Cassandra and want to start with a simple test.

Our data model in a traditional RDBMS is as follows:

Table Company (Id, Name)

Table Product (Id, Name, Company ID), where Company ID is an FK reference to the Company table

Table ProductInstance (Id, ProductID), where ProductID is an FK reference to the Product table

Table ProductInstanceRating (Id, ProductInstanceID, Comment), where ProductInstanceID is an FK reference to the ProductInstance table

Any suggestions on how this should be designed in Cassandra?

Update:

I tried looking at this from the querying perspective.

Data to be captured:

1) Product Reference is a composite key consisting of Product Name, Product Lot Number, and Customer ID:
Product Name: 12456
Product Lot Number: PQ23
Customer ID: 879456

The unique Product Reference will then be something like 12456|PQ23|879456

2) Each instance of the aforementioned Product gets a unique hash number, e.g. 784A, 876T, etc. This number is unique within a particular Product Reference.

The Product Instance reference will then be something like 12456|PQ23|879456|784A

3) Each Product Instance number can receive more than one rating

In that case, the queries will be something like:

Query1) To add/insert a rating for a Product Instance, fetch the row for the Product, i.e. 12456|PQ23|879456

Query2) Fetch the ProductInstance ID within this row (maybe the name of a column family)

Query3) Add the rating information as a column:value pair

Should the design be something like this?

12456|PQ23|879456 {
      784A{timestamp1:{rating:valueA
                      person name:valueX}
           timestamp2:{rating:valueB
                      person name:valueY}}

      876T{timestamp1:{rating:valueC
                      person name:valueX}
           timestamp2:{rating:valueB
                      person name:valueY}}
}
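
To make the nested layout above concrete, here is a minimal in-memory sketch in Python. It only mimics the shape (row key, then instance ID, then timestamp, then rating columns) and does not talk to Cassandra; `ratings_cf`, `product_ref` and `add_rating` are hypothetical names made up for illustration, and the integer timestamps and ratings are sample values.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for one column family:
# row key -> instance id -> timestamp -> rating columns.
ratings_cf = defaultdict(lambda: defaultdict(dict))


def product_ref(name, lot, customer_id):
    """Build the composite row key, e.g. '12456|PQ23|879456'."""
    return "|".join([name, lot, customer_id])


def add_rating(row_key, instance_id, timestamp, rating, person):
    """Queries 1-3: find the product row, then the instance, then add the columns."""
    ratings_cf[row_key][instance_id][timestamp] = {
        "rating": rating,
        "person name": person,
    }


key = product_ref("12456", "PQ23", "879456")
add_rating(key, "784A", 1, 4, "valueX")
add_rating(key, "784A", 2, 5, "valueY")
add_rating(key, "876T", 1, 3, "valueX")
```

Because every rating lives under the product's row key, adding a rating is a single keyed write in this sketch, which is the access pattern Query1 to Query3 describe.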

Thereafter, we would want to:

Query4) Fetch all products that have instances
Query5) Fetch all product instances that have ratings
Query6) Fetch the highest rating for a product
Query7) Fetch the average rating for a product

Is there a better and more efficient way to implement this?
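
For reference, Query4 to Query7 can be sketched against the same nested layout with plain Python dictionaries. This is an illustration of what the queries have to compute, not Cassandra code; the `rows` data and all function names are hypothetical, and the numeric ratings are sample values.

```python
# Hypothetical sample data: row key -> instance id -> timestamp -> columns.
rows = {
    "12456|PQ23|879456": {
        "784A": {1: {"rating": 4}, 2: {"rating": 5}},
        "876T": {1: {"rating": 3}},
    },
    "99999|ZZ01|111111": {},            # a product with no instances yet
    "55555|AB12|222222": {"001X": {}},  # an instance with no ratings yet
}


def products_with_instances(rows):
    """Query4: row keys that contain at least one instance."""
    return [key for key, instances in rows.items() if instances]


def instances_with_ratings(rows):
    """Query5: full instance references that have at least one rating."""
    return [
        f"{key}|{instance_id}"
        for key, instances in rows.items()
        for instance_id, ratings in instances.items()
        if ratings
    ]


def product_ratings(rows, key):
    """All rating values stored under one product row."""
    return [
        columns["rating"]
        for ratings in rows[key].values()
        for columns in ratings.values()
    ]


def highest_rating(rows, key):
    """Query6: highest rating for a product, or None if it has no ratings."""
    values = product_ratings(rows, key)
    return max(values) if values else None


def average_rating(rows, key):
    """Query7: average rating for a product, or None if it has no ratings."""
    values = product_ratings(rows, key)
    return sum(values) / len(values) if values else None
```

Note that Query4 and Query5 walk every row in this sketch, while Query6 and Query7 only touch one row key. That difference is worth keeping in mind when deciding which lookups need their own index.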

Comments (1)

悲念泪 2024-11-11 07:00:39

In Cassandra, you need to think about which queries you wish to run (unlike an RDBMS, where you can make a fairly abstract model of the domain and then devise SQL queries afterwards), because you can only do row-key lookups or use fairly limited secondary indexes. Any other indexes need to be explicitly created within your column families.

For this reason it is common to denormalise. However, maintaining the data may then be more difficult, as there are no cascading deletes etc. in Cassandra.
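
To illustrate the maintenance cost, here is a small Python sketch of a delete that the application has to cascade by hand. The two dictionaries are hypothetical stand-ins for column families, and `delete_product` is an invented helper, not part of any Cassandra client.

```python
# Hypothetical stand-ins for two column families.
products = {"P1": {"name": "Widget", "instances": {"784A", "876T"}}}
instances = {
    "P1|784A": {"ratings": [4, 5]},
    "P1|876T": {"ratings": [3]},
}


def delete_product(product_id):
    """Remove a product and, manually, every instance row that refers to it.

    Returns the number of instance rows that were removed.
    """
    product = products.pop(product_id, None)
    if product is None:
        return 0
    removed = 0
    for instance_id in product["instances"]:
        # Nothing does this for us: each dependent row is deleted explicitly.
        if instances.pop(f"{product_id}|{instance_id}", None) is not None:
            removed += 1
    return removed
```

If the application skipped the loop, the instance rows would simply stay behind as orphans, which is the kind of bookkeeping an RDBMS foreign key with cascading deletes would otherwise handle.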

One candidate structure might be to have a ProductInstance column family to store the ratings, a Product column family as an index to find the instances of a given product, and a Company column family as an index to find the products of a particular company. Alternatively, you could just denormalise and put the Company as a column in the Product column family.
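
As a rough sketch of that candidate structure, the three column families can be modelled as Python dictionaries, with the two index families updated by the application on every write. All names here (`company_index`, `product_index`, `instance_ratings`, `add_rating`, `ratings_for_company`) and the sample IDs are hypothetical and only show the shape of the lookups.

```python
# Hypothetical stand-ins for the three column families.
company_index = {}     # company id  -> {product id: product name}
product_index = {}     # product id  -> {instance id: ""}
instance_ratings = {}  # instance id -> {timestamp: rating}


def add_rating(company_id, product_id, product_name, instance_id, timestamp, rating):
    """Store one rating and keep both index column families up to date."""
    company_index.setdefault(company_id, {})[product_id] = product_name
    product_index.setdefault(product_id, {})[instance_id] = ""
    instance_ratings.setdefault(instance_id, {})[timestamp] = rating


def ratings_for_company(company_id):
    """Follow the indexes: company -> products -> instances -> ratings."""
    result = []
    for product_id in company_index.get(company_id, {}):
        for instance_id in product_index.get(product_id, {}):
            result.extend(instance_ratings.get(instance_id, {}).values())
    return result


add_rating("C1", "P1", "Widget", "784A", 1, 4)
add_rating("C1", "P1", "Widget", "784A", 2, 5)
add_rating("C1", "P2", "Gadget", "900B", 1, 3)
add_rating("C2", "P3", "Other", "111C", 1, 1)
```

Every lookup in `ratings_for_company` is a keyed read, at the price of writing to three places in `add_rating`. The denormalised alternative would drop `company_index` and store the company directly with each product.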

To repeat, the 'correct' structure for you can only be determined when you know what queries you wish to support.
