如何获得雪花中包含行的数字

发布于 2025-02-11 15:38:39 字数 699 浏览 2 评论 0原文

我尝试过REGEXP，REGEXP_LIKE和LIKE，但不起作用

从B 中select *

regexp_like（col1，'\ d'），
其中regexp_like（col1，'[0-9]）
... .etc

我们有此表

col1
avr100000
adfdsgwr
20170910020359.761
企业
adf56ds76gwr
+093000
080000
adfdsgwr

  col1
  avr100000
  20170910020359.761
  adf56ds76gwr
  0+093000
  1080000

0 谢谢

原文

I have tried regexp, regexp_like and like but didn't work

select * from b

where regexp_like(col1, '\d')
where regexp_like(col1, '[0-9]')
....etc

we have this table

Col1
avr100000
adfdsgwr
20170910020359.761
Enterprise
adf56ds76gwr
0+093000
080000
adfdsgwr

output should be these 5 rows

  col1
  avr100000
  20170910020359.761
  adf56ds76gwr
  0+093000
  1080000

Thanks

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

千秋岁 2025-02-18 15:38:39

您可以在WHERE子句中使用REGEXP_INSTR，以查看是否在字符串中的任何位置找到一个数字：

create temp table b(col1 string);

insert into b (col1) values ('avr100000'), ('adfdsgwr'), 
('20170910020359.761'),
('Enterprise'),
('adf56ds76gwr'),
('0+093000'),
('080000'),
('adfdsgwr')
;

select col1 from b where regexp_instr(col1, '\\d') > 0;

我正在将答案更新到请注意，请注意Regexp_instr将执行约3.88比使用REGEXP_COUNT的时间快。

原因是REGEXP_INSTR将停止并报告其遇到的第一个数字的位置。相比之下，Regexp_count将继续检查字符串，直到达到其末端为止。如果我们只想知道字符串中是否存在一个数字，我们就可以在遇到第一个数字后立即停止。

如果这是一个小数据集，则无关紧要。对于大型数据集，快3.8倍的速度有很大的不同。这是一个小型测试安全带，显示了性能差异：

create or replace transient table RANDOM_STRINGS as
select RANDSTR(50, random()) as RANDSTR from table(generator (rowcount => 10000000));

alter session set use_cached_result = false;

-- Run these statements multiple times on an X-Small warehouse to test performance
-- Run both to warm the cache, then note the times after the initial runs to warm the cache

-- Average over 10 times with warm cache: 3.315s
select count(*) as ROWS_WITH_NUMBERS 
from RANDOM_STRINGS where regexp_count(randstr, '\\d') > 0;


-- Average over 10 times with warm cache: 0.8686s
select count(*) as ROWS_WITH_NUMBERS 
from RANDOM_STRINGS where regexp_instr(randstr, '\\d') > 0;

You can use regexp_instr in the where clause to see if it finds a digit anywhere in the string:

create temp table b(col1 string);

insert into b (col1) values ('avr100000'), ('adfdsgwr'), 
('20170910020359.761'),
('Enterprise'),
('adf56ds76gwr'),
('0+093000'),
('080000'),
('adfdsgwr')
;

select col1 from b where regexp_instr(col1, '\\d') > 0;

I'm updating my answer to note that regexp_instr is going to perform about 3.8 times faster than using regexp_count for this requirement.

The reason is that regexp_instr will stop and report the location of the first digit it encounters. In contrast, regexp_count will continue examining the string until it reaches its end. If we only want to know if a digit exists in a string, we can stop as soon as we encounter the first one.

If it is a small data set, this won't matter much. For large data sets, that 3.8 times faster makes a big difference. Here is a mini test harness that shows the performance difference:

create or replace transient table RANDOM_STRINGS as
select RANDSTR(50, random()) as RANDSTR from table(generator (rowcount => 10000000));

alter session set use_cached_result = false;

-- Run these statements multiple times on an X-Small warehouse to test performance
-- Run both to warm the cache, then note the times after the initial runs to warm the cache

-- Average over 10 times with warm cache: 3.315s
select count(*) as ROWS_WITH_NUMBERS 
from RANDOM_STRINGS where regexp_count(randstr, '\\d') > 0;


-- Average over 10 times with warm cache: 0.8686s
select count(*) as ROWS_WITH_NUMBERS 
from RANDOM_STRINGS where regexp_instr(randstr, '\\d') > 0;

回复收藏 0 原文

转瞬即逝 2025-02-18 15:38:39

一种方法是计算有多少个alpha令牌：

select column1 as input
    ,regexp_count(column1, '[A-Za-z]') as alpha_count
from values
    ('0-100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adfdsgwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')

输入	alpha_count
0-100000	0
ADFDSGWR	8
20170910020359.761	0
ENTERPRISE	10
ADFDSGWR	8
0+093000 0 1-080000	0
1-080000	0
ADFDSGWR	8

，从而排除那些不在的地方：

select column1 as input
from values
    ('0-100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adfdsgwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')
where regexp_count(column1, '[A-Za-z]') = 0

：INPUT 0：INPUT 0 ：

IT IT IT IT INPUT
0 INPUT 0 -100000
20170910020359.761
0+093000
1-080000

One method is to count how many alpha tokens there are:

select column1 as input
    ,regexp_count(column1, '[A-Za-z]') as alpha_count
from values
    ('0-100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adfdsgwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')

INPUT	ALPHA_COUNT
0-100000	0
adfdsgwr	8
20170910020359.761	0
Enterprise	10
adfdsgwr	8
0+093000	0
1-080000	0
adfdsgwr	8

and thus exclude those where it is not zero:

select column1 as input
from values
    ('0-100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adfdsgwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')
where regexp_count(column1, '[A-Za-z]') = 0

gives:

INPUT
0-100000
20170910020359.761
0+093000
1-080000

回复收藏 0 原文

红玫瑰 2025-02-18 15:38:39

您需要做的就是找到一个数值的1个或更多实例：

select
    column1 as input
from
values
    ('avr100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adf56ds76gwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')
where
    regexp_count (column1, '\\d') > 0;

结果：

avr100000
20170910020359.800
adf56ds76gwr
0+093000
1-080000

All you need to do is find 1 or more instances of a numeric value:

select
    column1 as input
from
values
    ('avr100000'),
    ('adfdsgwr'),
    ('20170910020359.761'),
    ('Enterprise'),
    ('adf56ds76gwr'),
    ('0+093000'),
    ('1-080000'),
    ('adfdsgwr')
where
    regexp_count (column1, '\\d') > 0;

Results:

avr100000
20170910020359.800
adf56ds76gwr
0+093000
1-080000

回复收藏 0 原文

~没有更多了~

关于作者

烟酒忠诚

暂无简介

文章

26 人气

关注发私信

友情链接

文江博客

如何获得雪花中包含行的数字

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

如何获得雪花中包含行的数字

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。