如何过滤掉 Teradata 文本字段中的非数字值？

发布于 2024-09-15 19:29:03 字数 724 浏览 10 评论 0原文

我有一个 Teradata 表，其中包含大约 1000 万条记录，该表将数字 id 字段存储为 varchar。我需要将该字段中的值传输到另一个表中的 bigint 列，但我不能简单地说cast(id_field as bigint)，因为我收到无效字符错误。查看这些值，我发现字符串中的任何位置都可能有一个字符，所以假设字符串是 varchar(18) 我可以像这样过滤掉无效行：

     where substr(id_field,1,1) not in (/*big,ugly array of non-numeric chars*/)
     and substr(id_field,2,1) not in (/*big,ugly array of non-numeric chars*/)

etc, etc...

然后强制转换就可以了，但这是不可行的从长远来看。它很慢，并且如果字符串有 18 个可能的字符，则会导致查询不可读。如何过滤掉在此字段中具有值且不会转换为 bigint 的行，而不单独检查非数字字符数组中的每个字符？

示例值是

   123abc464
   a2.3v65
   a_356087
   ........
   000000000
   BOB KNIGHT
   1235468099

不遵循特定模式的值，我只需要过滤掉包含任何非数字数据的值。 123456789 可以，但 123.abc_c3865 不行……

原文

oI have a teradata table with about 10 million records in it, that stores a numeric id field as a varchar. i need to transfer the values in this field to a bigint column in another table, but i can't simply say cast(id_field as bigint) because i get an invalid character error. looking through the values, i find that there could be a character at any position in the string, so let's say the string is varchar(18) i could filter out invalid rows like so :

     where substr(id_field,1,1) not in (/*big,ugly array of non-numeric chars*/)
     and substr(id_field,2,1) not in (/*big,ugly array of non-numeric chars*/)

etc, etc...

then the cast would work, but this is not feasible in the long run. it's slow and if the string has 18 possible characters, it makes the query unreadable. how can i filter out rows that have a value in this field that will not cast as a bigint without checking each character individually for an array of non-numeric characters?

example values would be

   123abc464
   a2.3v65
   a_356087
   ........
   000000000
   BOB KNIGHT
   1235468099

the values follow no specific patterns, I simply need to filter out the ones that contain ANY non-numeric data.
123456789 is okay but 123.abc_c3865 is not...

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

橪书 2024-09-22 19:29:03

从TD14 Teradata开始添加了一些功能，现在有多种方法，例如：

WHERE RTRIM(col, '0123456789') = ''

但最简单的方法是TO_NUMBER，对于坏数据返回NULL：

TO_NUMBER(col)

Starting with TD14 Teradata added some functions, now there are multiple ways, e.g.:

WHERE RTRIM(col, '0123456789') = ''

But the easiest way is TO_NUMBER, which returns NULL for bad data:

TO_NUMBER(col)

回复收藏 0 原文

少女情怀诗 2024-09-22 19:29:03

我曾经管理过的最好的方法是：

where char2hexint(upper(id_field)) = char2hexint(lower(id_field))

由于大写字符为小写字符提供了不同的十六进制值，这将确保您没有字母字符，但仍然会留下下划线、冒号等。如果这不能满足您的要求，您可能需要编写 UDF。

The best that I've ever managed is this:

where char2hexint(upper(id_field)) = char2hexint(lower(id_field))

Since upper case characters give a different hex value to lower case ones, this will ensure that you have no alphabetical characters, but will still leave you with underscores, colons and so forth. If this doesn't meet your requirements, you may need to write an UDF.

回复收藏 0 原文

风苍溪 2024-09-22 19:29:03

我们是否还可以尝试将字段中的值除以某个整数“如果除以则必须是一个数字，如果不是并抛出一些错误，则必须有一些字符......”猜测这会很快，就像数学一样涉及...

回复收藏 0 原文

神回复 2024-09-22 19:29:03

我在尝试从街道地址门牌号中排除字母字符时遇到了同样的问题。如果您不介意将所有数字连接在一起，则以下内容将有效......
它检查字符串的上部是否等于字符串的下部，如果是，则为数字，否则为空。

select cast(case when upper(substring('12E'from 1 for 1)) = lower(substring('12E'from 1 for 1)) then substring('12E'from 1 for 1) else null end ||
             case when upper(substring('12E'from 2 for 1)) = lower(substring('12E'from 2 for 1)) then substring('12E'from 2 for 1) else null end ||
             case when upper(substring('12E'from 3 for 1)) = lower(substring('12E'from 3 for 1)) then substring('12E'from 3 for 1) else null end ||
             case when upper(substring('12E'from 4 for 1)) = lower(substring('12E'from 4 for 1)) then substring('12E'from 4 for 1) else null end ||
             case when upper(substring('12E'from 5 for 1)) = lower(substring('12E'from 5 for 1)) then substring('12E'from 5 for 1) else null end ||
             case when upper(substring('12E'from 2 for 1)) = lower(substring('12E'from 2 for 1)) then substring('12E'from 2 for 1) else null end
             as integer)

I've faced the same issue to try to exclude alpha characters from street address house numbers. The following will work if you don't mind concatanating all the numeric numbers together......
It checks if the upper of a string equals the lower of the string, if so it's a number, if not it becomes null.

select cast(case when upper(substring('12E'from 1 for 1)) = lower(substring('12E'from 1 for 1)) then substring('12E'from 1 for 1) else null end ||
             case when upper(substring('12E'from 2 for 1)) = lower(substring('12E'from 2 for 1)) then substring('12E'from 2 for 1) else null end ||
             case when upper(substring('12E'from 3 for 1)) = lower(substring('12E'from 3 for 1)) then substring('12E'from 3 for 1) else null end ||
             case when upper(substring('12E'from 4 for 1)) = lower(substring('12E'from 4 for 1)) then substring('12E'from 4 for 1) else null end ||
             case when upper(substring('12E'from 5 for 1)) = lower(substring('12E'from 5 for 1)) then substring('12E'from 5 for 1) else null end ||
             case when upper(substring('12E'from 2 for 1)) = lower(substring('12E'from 2 for 1)) then substring('12E'from 2 for 1) else null end
             as integer)

回复收藏 0 原文

阳光①夏 2024-09-22 19:29:03

尝试使用此代码段

WHERE id_Field NOT LIKE '%[^0-9]%'

Try using this code segment

WHERE id_Field NOT LIKE '%[^0-9]%'

回复收藏 0 原文

や莫失莫忘 2024-09-22 19:29:03

我发现 lins314159 答案对于解决类似问题非常有帮助。它可能是一个旧线程，但就其价值而言，我使用了：

char2hexint(upper(id_field)) = char2hexint(lower(id_field)) AND substr(id_field,1,1) IN ('1' to '9 ')

成功将剩余的 VARCHAR 结果转换为 INT

回复收藏 0 原文

流殇 2024-09-22 19:29:03

SELECT customer_id
FROM t
WHERE UPPER(customer_id)(CASESPECIFIC) <>
      LOWER(customer_id)(CASESPECIFIC);

这对于检查数字字段中的值是否为非数字非常有效。

SELECT customer_id
FROM t
WHERE UPPER(customer_id)(CASESPECIFIC) <>
      LOWER(customer_id)(CASESPECIFIC);

This works perfectly fine to check whether the values in a numeric field is non-numeric.

回复收藏 0 原文

笔落惊风雨 2024-09-22 19:29:03

SELECT id_field
WHERE oTranslate(id_field, '0123456789','')<>'';

这对我来说效果很好！它显示任何包含非数字值的 id_field

SELECT id_field
WHERE oTranslate(id_field, '0123456789','')<>'';

This works well for me! It reveals any id_field containing a non-numeric value

回复收藏 0 原文

~没有更多了~

关于作者

靖瑶

暂无简介

文章

27 人气

关注发私信

友情链接

文江博客

如何过滤掉 Teradata 文本字段中的非数字值？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（8）

关于作者

相关话题

热门标签

推荐作者

lylex099819

yg

mb_PT8LkUS5

埋情葬爱

佚名

奢望

友情链接

如何过滤掉 Teradata 文本字段中的非数字值？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（8）

关于作者

相关话题

热门标签

推荐作者

lylex099819

yg

mb_PT8LkUS5

埋情葬爱

佚名

奢望

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。