如何建模CQL表，以便可以通过zip_code或zip_code和hash查询它？

发布于 2025-02-06 21:15:18 字数 285 浏览 1 评论 0原文

大家好，我有一个包含哈希作为主键的Cassandra表，另一列包含列表。我想添加另一列名为ZipCode，以便我可以根据Zipcode或ZipCode查询Cassandra，并可以查询

Hash |列表| Zipcode

select * from table where zip_code = '12345';
select * from table where zip_code = '12345' && hash='abcd';

有什么办法可以做到这一点？

原文

Hi all I have a cassandra Table containing Hash as Primary key and another column containing List. I want to add another column named Zipcode such that I can query cassandra based on either zipcode or zipcode and hash

Hash | List | zipcode

select * from table where zip_code = '12345';
select * from table where zip_code = '12345' && hash='abcd';

Is there any way that I could do this?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

夜唯美灬不弃 2025-02-13 21:15:18

卡桑德拉（Cassandra）的建议是，您根据访问模式设计数据表。例如，在您的情况下，您希望通过zipcode 和获得zipcode and hash 的，因此理想情况下，您可以拥有两个这样的表

CREATE TABLE keyspace.table1 (
zipcode text,
field1  text,
field2 text,
PRIMARY KEY (zipcode));

，

 CREATE TABLE keyspace.table2 (
    hashcode text
    zipcode text,
    field1  text,
    field2 text,
    PRIMARY KEY ((hashcode,zipcode)));

然后可能需要重新设计您的表根据您的数据。我建议您在Cassandra 中了解数据模型设计，然后再继续进行。

允许过滤可以使用构造，但其用法取决于您的数据大/小。如果您的数据很大，则避免使用此构造，因为它需要对数据库进行完整的扫描，这在资源和时间方面非常昂贵。

Recommendation in Cassandra is that you design your data tables based on your access patterns. For example in your case you would like to get results by zipcode and by zipcode and hash, so ideally you can have two tables like this

CREATE TABLE keyspace.table1 (
zipcode text,
field1  text,
field2 text,
PRIMARY KEY (zipcode));

and

 CREATE TABLE keyspace.table2 (
    hashcode text
    zipcode text,
    field1  text,
    field2 text,
    PRIMARY KEY ((hashcode,zipcode)));

Then you may be required to redesign your tables based on your data. I recommend you understand data model design in cassandra before proceeding further.

ALLOW FILTERING construct can be used but its usage depends on how big/small is your data. If you have a very large data then avoid using this construct as it will require complete scan of the database which is quite expensive in terms of resources and time.

回复收藏 0 原文

奢望 2025-02-13 21:15:18

可以设计一个可以满足两个应用程序查询的单个表。

在此示例架构中，该表由Hash作为聚类密钥对邮政编码进行分区：

CREATE TABLE table_by_zipcode (
    zipcode int,
    hash text,
    ...
    PRIMARY KEY(zipcode, hash)
)

使用此设计，每个邮政编码可以具有hash的一行或多个行。这是其中包含一些测试数据的表：

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice
     123 |  def |      2 |     bob
     123 |  ghi |      3 |  charli
     456 |  tuv |      5 |  banana
     456 |  xyz |      4 |   apple

该表包含两个分区zipcode = 123和zipcode = 456。第一个邮政编码有三个行（abc，def，ghi），第二行有两个行（tuv ，xyz）。

您可以仅使用分区密钥（ZipCode）查询表格，例如：

cqlsh> SELECT * FROM table_by_zipcode WHERE zipcode = 123;

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice
     123 |  def |      2 |     bob
     123 |  ghi |      3 |  charli

也可以使用分区键Zipcode和clustering key 查询表格。哈希，例如：

cqlsh> SELECT * FROM table_by_zipcode WHERE zipcode = 123 AND hash = 'abc';

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice

欢呼！

It is possible to design a single table that will satisfy both app queries.

In this example schema, the table is partitioned by zip code with hash as the clustering key:

CREATE TABLE table_by_zipcode (
    zipcode int,
    hash text,
    ...
    PRIMARY KEY(zipcode, hash)
)

With this design, each zip code can have one or more rows of hash. Here's the table with some test data in it:

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice
     123 |  def |      2 |     bob
     123 |  ghi |      3 |  charli
     456 |  tuv |      5 |  banana
     456 |  xyz |      4 |   apple

The table contains two partitions zipcode = 123 and zipcode = 456. The first zip code has three rows (abc, def, ghi) and the second has two rows (tuv, xyz).

You can query the table using just the partition key (zipcode), for example:

cqlsh> SELECT * FROM table_by_zipcode WHERE zipcode = 123;

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice
     123 |  def |      2 |     bob
     123 |  ghi |      3 |  charli

It is also possible to query the table with the partition key zipcode and clustering key hash, for example:

cqlsh> SELECT * FROM table_by_zipcode WHERE zipcode = 123 AND hash = 'abc';

 zipcode | hash | intcol | textcol
---------+------+--------+---------
     123 |  abc |      1 |   alice

Cheers!

回复收藏 0 原文

~没有更多了~