The HBase shell will include whatever you have in ~/.irbrc, so you can put something like this in there (I'm no Ruby expert, improvements are welcome):
hbase(main):002:0> scan
ERROR: wrong number of arguments (0 for 1)
Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications. Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,
or COLUMNS. If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.
Some examples:
hbase> scan '.META.'
hbase> scan '.META.', {COLUMNS => 'info:regioninfo'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Use the FILTER param of scan, as shown in the usage help:
hbase(main):002:0> scan
ERROR: wrong number of arguments (0 for 1)
Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications. Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,
or COLUMNS. If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.
Some examples:
hbase> scan '.META.'
hbase> scan '.META.', {COLUMNS => 'info:regioninfo'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Scan scan = new Scan();
FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
//in case you have multiple SingleColumnValueFilters,
you would want the row to pass MUST_PASS_ALL conditions
or MUST_PASS_ONE condition.
SingleColumnValueFilter filter_by_name = new SingleColumnValueFilter(
Bytes.toBytes("SOME COLUMN FAMILY" ),
Bytes.toBytes("SOME COLUMN NAME"),
CompareOp.EQUAL,
Bytes.toBytes("SOME VALUE"));
filter_by_name.setFilterIfMissing(true);
//if you don't want the rows that have the column missing.
Remember that adding the column filter doesn't mean that the
rows that don't have the column will not be put into the
result set. They will be, if you don't include this statement.
list.addFilter(filter_by_name);
scan.setFilter(list);
Scan scan = new Scan();
FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
//in case you have multiple SingleColumnValueFilters,
you would want the row to pass MUST_PASS_ALL conditions
or MUST_PASS_ONE condition.
SingleColumnValueFilter filter_by_name = new SingleColumnValueFilter(
Bytes.toBytes("SOME COLUMN FAMILY" ),
Bytes.toBytes("SOME COLUMN NAME"),
CompareOp.EQUAL,
Bytes.toBytes("SOME VALUE"));
filter_by_name.setFilterIfMissing(true);
//if you don't want the rows that have the column missing.
Remember that adding the column filter doesn't mean that the
rows that don't have the column will not be put into the
result set. They will be, if you don't include this statement.
list.addFilter(filter_by_name);
scan.setFilter(list);
发布评论
评论(7)
试试这个。这有点丑陋,但对我有用。
HBase shell 将包含 ~/.irbrc 中的所有内容,因此您可以在其中放入类似的内容(我不是 Ruby 专家,欢迎改进):
然后您可以在 shell 中说:
Try this. It's kind of ugly, but it works for me.
The HBase shell will include whatever you have in ~/.irbrc, so you can put something like this in there (I'm no Ruby expert, improvements are welcome):
and then you can just say in the shell:
更多信息可以在此处找到。请注意,附加的
Filter Language.docx
文件中包含多个示例。More information can be found here. Note that multiple examples reside in the attached
Filter Language.docx
file.使用
scan
的FILTER参数,如使用帮助所示:Use the FILTER param of
scan
, as shown in the usage help:其中一个过滤器是Valuefilter,它可用于过滤所有列值。
hbase(主):067:0>扫描“虚拟表”,{FILTER => “ValueFilter(=,'二进制:2016-01-26')”}
binary 是过滤器中使用的比较器之一。您可以根据您想要执行的操作在过滤器中使用不同的比较器。
您可以参考以下网址:http://www.hadooptpoint.com/filters-in-hbase-shell/。
它提供了有关如何在 HBase Shell 中使用不同过滤器的很好的示例。
One of the filter is Valuefilter which can be used to filter all column values.
hbase(main):067:0> scan 'dummytable', {FILTER => "ValueFilter(=,'binary:2016-01-26')"}
binary is one of the comparators used within the filter. You can use different comparators within the filter based on what you want to do.
You can refer following url: http://www.hadooptpoint.com/filters-in-hbase-shell/.
It provides good examples on how to use different filters in HBase Shell.
在查询末尾添加 setFilterIfMissing(true)
Add setFilterIfMissing(true) at the end of query
您可以在
hbase
shell 中使用以下命令来过滤列数据:You can use the below command in
hbase
shell to filter the column data: