Get all families in HBase

Posted on 2024-11-04 13:06:51

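I have an HBase table:

row: word, family: date

I want to get a scanner for all the words at date "d". How can I do that? I don't want to specify the row value.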

花桑 2024-11-11 13:06:58

Your question doesn't say where you are trying to get a scanner from, so I'm going to treat it as if it's from the HBase shell. I've used the Thrift library to interact with HBase, and the shell commands translate to it fairly directly. I assume they will also translate well to any other interface you are getting a scanner from.

To get all the rows for a specific column family, you would use the following command:

scan 'table_name', {COLUMNS => 'col_family:'}

For your case (minus the table name, because I don't know it), it would look something like:

scan 'yourTable', {COLUMNS => 'd:'}

That will return all rows that have data in the column family d.
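To make the effect of a family-restricted scan concrete, here is a minimal plain-Java model of the table layout. It is only a sketch: it uses nested TreeMaps rather than the HBase client API, and the rows, the second family e, and the qualifier count are made-up sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FamilyScanSketch {
    // row key -> column family -> qualifier -> value, all kept in sorted order
    static final TreeMap<String, TreeMap<String, TreeMap<String, String>>> TABLE = new TreeMap<>();

    // Store one cell, creating the row and family maps on first use.
    static void put(String row, String family, String qualifier, String value) {
        TABLE.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(family, f -> new TreeMap<>())
             .put(qualifier, value);
    }

    // Visit every row in key order and keep those with at least one cell in the family.
    static List<String> rowsWithFamily(String family) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, TreeMap<String, String>>> e : TABLE.entrySet()) {
            if (e.getValue().containsKey(family)) {
                rows.add(e.getKey());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        put("apple", "d", "count", "3"); // hypothetical sample cells
        put("apple", "e", "count", "1");
        put("pear", "e", "count", "7");
        put("word", "d", "count", "2");
        System.out.println(rowsWithFamily("d")); // [apple, word]
    }
}
```

No row key is specified anywhere: every row is visited, and only the words that have something stored under d come back.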

If you also want to specify which row key to start at, it will look something like:

scan 'yourTable', {COLUMNS => 'd:', STARTROW => 'word'}

That command will start at the row key word and return all rows from that point on. If you want to limit it to just the row key word, you also have to add a STOPROW. The STOPROW is not included in the results, so you can't do scan 'yourTable', {COLUMNS => 'd:', STARTROW => 'word', STOPROW => 'word'}, as that will return nothing.

Specifying a STOPROW takes some knowledge of the row key values. I don't know your values, so it's hard to give a good example. What I often do is take my start row and replace its last character with the next character in the ASCII set. In your example I'd try:

scan 'yourTable', {COLUMNS => 'd:', STARTROW => 'word', STOPROW => 'wore'}

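I'm not going to promise this will work all the time, but it is likely to work in most cases. Perhaps all cases, I just haven't worked it out. :) Keep in mind that row keys are sorted byte by byte, so this range also returns any other row key that begins with word, such as words.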
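The half-open range behaviour can be checked without a cluster. The sketch below is a plain-Java simulation, not HBase code: a TreeMap stands in for the sorted row keys, the sample keys are invented, and String ordering is assumed to match HBase's byte ordering (true for ASCII keys). The helpers stopRowForPrefix and stopRowForExact are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class StopRowSketch {
    // Stand-in for a table: TreeMap keeps keys sorted, like HBase row keys.
    static final TreeMap<String, String> TABLE = new TreeMap<>();
    static {
        for (String k : new String[] {"wood", "word", "wordy", "words", "wore", "work"}) {
            TABLE.put(k, "value-of-" + k);
        }
    }

    // Half-open range scan [startRow, stopRow), mirroring STARTROW/STOPROW.
    static List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(TABLE.subMap(startRow, true, stopRow, false).keySet());
    }

    // Stop row covering every key that begins with prefix: bump the last character.
    // Assumes a non-empty prefix whose last character is not the maximum value.
    static String stopRowForPrefix(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // Stop row matching exactly one key: the smallest key that sorts after it.
    static String stopRowForExact(String row) {
        return row + '\u0000';
    }

    public static void main(String[] args) {
        System.out.println(scan("word", "word"));                   // []
        System.out.println(scan("word", stopRowForPrefix("word"))); // [word, words, wordy]
        System.out.println(scan("word", stopRowForExact("word")));  // [word]
    }
}
```

In this model, bumping the last character ('word' to 'wore') behaves as a prefix scan, while appending a zero byte to the start row limits the range to the single key.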

Hopefully that helps.

A good resource for HBase shell commands is http://wiki.apache.org/hadoop/Hbase/Shell.

一场春暖 2024-11-11 13:06:58

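I assume you are talking about using the scan command of the Java API.

If I understand your structure correctly, you currently can't retrieve words by date without doing a full table scan. You can call setFilter on the scan, but it still has to go to every row to check.

You didn't specify, but I'm guessing each word may appear on many dates (if you mean you have one family per date, note that having more than 2-3 families is not recommended).

If you want a relatively efficient way of storing this, I'd suggest changing the structure to use the key Word0xDate, storing the date in the timestamp, and then storing some 1-byte value as the data (so that the row exists).
Storage-wise it will be the same as your current solution (plus 2 bytes, which you can offset by shortening the family and qualifier names), and you will be able to scan for a timestamp or a range of timestamps (setTimestamp and setTimeRange respectively), which is more efficient because HBase will skip files that store irrelevant timestamps.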
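As a rough illustration of that layout, the sketch below builds such a key and the matching timestamp range in plain Java. Several details are assumptions, not part of the answer: "Word0xDate" is read as the word, one 0x00 separator byte, and the date in ISO text form; timestamps are taken as midnight UTC in epoch milliseconds; and the sample date is made up. The two-element range has the half-open [min, max) shape that Scan.setTimeRange takes.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class WordDateKeySketch {
    // Row key = word bytes + one 0x00 separator byte + date text (assumed layout).
    static byte[] rowKey(String word, LocalDate date) {
        byte[] w = word.getBytes(StandardCharsets.UTF_8);
        byte[] d = date.toString().getBytes(StandardCharsets.UTF_8); // ISO form, e.g. 2011-05-01
        byte[] key = new byte[w.length + 1 + d.length];
        System.arraycopy(w, 0, key, 0, w.length);
        // key[w.length] is left at 0x00 and acts as the separator
        System.arraycopy(d, 0, key, w.length + 1, d.length);
        return key;
    }

    // Cell timestamp for a date: midnight UTC in epoch milliseconds (assumed convention).
    static long timestampFor(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Half-open range [min, max) covering exactly one day.
    static long[] dayRange(LocalDate date) {
        return new long[] { timestampFor(date), timestampFor(date.plusDays(1)) };
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2011, 5, 1); // hypothetical sample date
        System.out.println(rowKey("word", d).length); // 15 (4 + 1 + 10)
        long[] r = dayRange(d);
        System.out.println(r[1] - r[0]); // 86400000
    }
}
```

With cells written at timestampFor(date), a scan limited to dayRange(date) would only need to consider data from that day.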

白云悠悠 2024-11-11 13:06:58

Try this:

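     import org.apache.hadoop.hbase.client.*;
     import org.apache.hadoop.hbase.util.Bytes;

     // conf is an existing Configuration; table, family, and qualifier names are placeholders
     HTable t = new HTable(conf, "YourTable");
     ResultScanner scanner = t.getScanner(new Scan());
     for (Result rr = scanner.next(); rr != null; rr = scanner.next())
     {
           // getValue takes byte[] arguments and returns null when the cell is missing
           byte[] value = rr.getValue(Bytes.toBytes("YourFamily"), Bytes.toBytes("YourQualifier"));
           // Bytes.equals compares contents; equals() on a byte[] only compares references
           if (Bytes.equals(value, Bytes.toBytes("d")))
           {
                Get g = new Get(rr.getRow()); // row key of the current result
                Result row = t.get(g);
                System.out.println("" + row.toString()); // print all data from this row
           }
     }
     scanner.close(); // release the scanner and the table when done
     t.close();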
