How to design an HBase schema?

Posted 2024-07-10 16:10:34

Suppose I have this RDBMS table (Entity-attribute-value model):

col1: entityID
col2: attributeName
col3: value

and I want to use HBase because of scaling issues.

I know that the only way to access an HBase table is by its primary key (cursor): you can get a cursor for a specific key and iterate over the rows one by one.

The issue is that, in my case, I want to be able to iterate on all 3 columns. For example:

for a given entityID, I want to get all its attributes and values
for a given attributeName and value, I want to get all the entityIDs
...

So one idea I had is to build one HBase table that holds the data (table DATA, with entityID as the primary index), plus 2 "index" tables: one with attributeName as the primary key, and the other with value.

Each index table would hold a list of pointers (entityIDs) into the DATA table.

Is this a reasonable approach, or is it an "abuse" of HBase concepts?

In this blog the author says:

HBase allows get operations by primary key and scans (think: cursor) over row ranges. (If you have both scale and need of secondary indexes, don’t worry - Lucene to the rescue! But that’s another post.)

Do you know how Lucene can help?

-- Yonatan

凉薄对峙 2024-07-17 16:10:34

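Secondary indexes would indeed be useful for many potential applications of HBase, and I believe the developers are in fact looking at it. See http://www.mail-archive.com/[email protected]/msg04801.html.

In the meantime, if your application's data store can be modeled as a star schema (see http://en.wikipedia.org/wiki/Star_schema), you may want to look at the solution Hypertable proposes for secondary-index-type needs: http://markmail.org/message/rphm4q6cbar2ycgp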


清醇 2024-07-17 16:10:34

I recommend having two different flat tables: one for looking up attributes+values given an entityID, and one for looking up the entityIDs given an attribute+value.

Table 1 would look like this:

entityID1 {
  attribute1: value1;
  attribute2: value2;
  ...
}

and Table 2:

attribute1_value1 {
  entityID1;
}
attribute2_value2 {
  entityID1;
}
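
To make the two-table layout concrete, here is a minimal in-memory sketch in Python. It does not use any HBase client: `SortedTable` is a hypothetical stand-in that only mimics the two access paths mentioned in the question (get by row key, and scan over a sorted key range). The names `put_fact`, `attributes_of`, `entities_with` and `entities_with_attribute`, and the `_` separator (which assumes attribute names contain no underscore), are illustrative assumptions rather than part of the answer.

```python
from bisect import bisect_left, insort


class SortedTable:
    """Toy stand-in for an HBase table: rows sorted by key, each row a dict of columns."""

    def __init__(self):
        self._keys = []  # row keys kept in sorted order
        self._rows = {}  # row key -> {column: value}

    def put(self, row_key, column, value):
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        # Point lookup by row key; returns a copy of the row (empty if missing).
        return dict(self._rows.get(row_key, {}))

    def scan_prefix(self, prefix):
        # Range scan: start at the first key >= prefix, stop when the prefix ends.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, dict(self._rows[key])
            i += 1


SEP = "_"  # assumption: attribute names never contain this character

by_entity = SortedTable()      # Table 1: row key = entityID
by_attr_value = SortedTable()  # Table 2: row key = attribute + SEP + value


def put_fact(entity_id, attribute, value):
    # Every fact is written twice, once per table.
    by_entity.put(entity_id, attribute, value)
    by_attr_value.put(attribute + SEP + value, entity_id, "")


def attributes_of(entity_id):
    return by_entity.get(entity_id)


def entities_with(attribute, value):
    # The column names of the Table 2 row are the entity IDs.
    return sorted(by_attr_value.get(attribute + SEP + value))


def entities_with_attribute(attribute):
    # Prefix scan over Table 2 covers every value of one attribute.
    found = set()
    for _, columns in by_attr_value.scan_prefix(attribute + SEP):
        found.update(columns)
    return sorted(found)


put_fact("e1", "color", "red")
put_fact("e1", "size", "L")
put_fact("e2", "color", "red")
put_fact("e3", "color", "blue")
```

With this data, `attributes_of("e1")` reads a single row of Table 1, `entities_with("color", "red")` reads a single row of Table 2, and `entities_with_attribute("color")` is a range scan over Table 2. The sketch ignores two things a real design would have to handle: the two writes in `put_fact` are separate operations, and changing an attribute's value would leave a stale row in Table 2 unless the old entry is deleted.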
