从整理字符串中删除非数字,只保留数字

发布于 2025-02-10 12:56:50 字数 837 浏览 1 评论 0原文

需要语法或方法来从整理的VARCHAR列中删除任何非数字值,并替换为零填充的10位帐户号结果。该列有字母和特殊字符,但我需要删除这些字母,只保留数字。

我正在使用 lpad 函数函数code> ')将结果剩下的零

00
000123456M
councel_number_column
' 0
10
9999999

Need syntax or method to remove any non-numeric values from a collated varchar column and replace with zero padded 10-digit account number result. The column has letters and special characters, but I need to remove those and only keep the numbers.

I am using LPAD function LPAD(Account_Number_Column,10,'0') to left pad the result with zeros

Account_Number_Column
00#9999999
000123456M
N/A
Expected_Result
0009999999
0000123456
0000000000

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

妄司 2025-02-17 12:56:50

这应该做到这一点:

select lpad(regexp_replace(Account_Number_Column,'[^\\d]*'),10,0);

当然,您可能还需要将其他元素纳入数据质量(例如确保您的非填充结果是< = 10个字符等)。在您的用例中构成可接受的结果。

如果您有整理,我建议将其删除以作为RegexP函数,然后重新应用结果字符串。

这是一个示例,如果您有“ en-ci”的整理,则可以将其剥离以作为功能并重新申请结果:

select collate(lpad(regexp_replace(collate(Account_Number,''),'[^\\d]*'),10,'0'),'en-ci');

This should do it:

select lpad(regexp_replace(Account_Number_Column,'[^\\d]*'),10,0);

Of course, you may want to also incorporate other elements around data quality (like ensuring your non-padded result is <= 10 characters, etc). Whatever constitutes an acceptable result in your use case.

If you have collation, I would suggest removing it for the REGEXP function, and reapply to the resulting string.

Here's an example where if you had a collation of 'en-ci' you could strip it for the function and reapply for the result:

select collate(lpad(regexp_replace(collate(Account_Number,''),'[^\\d]*'),10,'0'),'en-ci');
氛圍 2025-02-17 12:56:50

一种方法是使用 regexp_substr_all 数字,然后将该数组与 array_to_to_string ,然后可以是LPAD至10

SELECT Account_Number_Column
    ,ARRAY_TO_STRING(REGEXP_SUBSTR_ALL(Account_Number_Column, '\\d'),'') as result_a
    ,LPAD(result_a,10,'0') as answer_a
FROM VALUES
    ('00#9999999'),
    ('000123456M'),
    ('N/A')
    t(Account_Number_Column);

给出:

counce_number_column结果_a woments_a
00#999999900999999900099999999
000123456M000123456 00001234560000123456
N/A''''''''''''00000000

. - 参考/ comLation.html#satering-using-utf-8-vs-vs-locale-Collat​​ion“ rel =” nofollow noreferrer“> colation> colation off strings通常不好因此,通过colate(Account_number_column,'utf8')

SELECT collate(raw, 'sp-upper') as Account_Number_Column
    ,lpad(regexp_replace(collate(Account_Number_Column,'utf8'),'[^\\d]*'),10,'0')
FROM VALUES
    ('00#9999999'),
    ('000123456M'),
    ('N/A')
    t(raw);

但这应该是吉姆的答案:

select lpad(regexp_replace(collate(Account_Number_Column,'utf8'),'[^\\d]*'),10,'0');

One method is it use REGEXP_SUBSTR_ALL to get the array of matching digits, and then turn that array into a string with ARRAY_TO_STRING, which can then be LPAD to 10

SELECT Account_Number_Column
    ,ARRAY_TO_STRING(REGEXP_SUBSTR_ALL(Account_Number_Column, '\\d'),'') as result_a
    ,LPAD(result_a,10,'0') as answer_a
FROM VALUES
    ('00#9999999'),
    ('000123456M'),
    ('N/A')
    t(Account_Number_Column);

gives:

ACCOUNT_NUMBER_COLUMNRESULT_AANSWER_A
00#99999990099999990009999999
000123456M0001234560000123456
N/A''0000000000

Striping the COLLATION off strings is normally bad, BUT given you only want digit's is should ok in this instance to do so via collate(Account_Number_Column,'utf8')

SELECT collate(raw, 'sp-upper') as Account_Number_Column
    ,lpad(regexp_replace(collate(Account_Number_Column,'utf8'),'[^\\d]*'),10,'0')
FROM VALUES
    ('00#9999999'),
    ('000123456M'),
    ('N/A')
    t(raw);

but it should be Jim's answer:

select lpad(regexp_replace(collate(Account_Number_Column,'utf8'),'[^\\d]*'),10,'0');
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文