DB2/iSeries SQL 清理 CR/LF、制表符等

发布于 2024-12-06 22:54:52 字数 333 浏览 1 评论 0原文

我需要在一组超过 400k 的字符串记录中查找并清理换行符、回车符、制表符和“SUB”字符,但是这个 DB2 环境对我造成了影响。

我以为我可以做一些搜索并用 REPLACE() 和 CHR() 函数替换,但似乎 CHR() 在此系统上不可用(错误:*LIBL 类型 *N 中的 CHR 未找到)。使用 \t、\r、\n 等似乎也不起作用。字符可以位于字符串的中间或末尾。

DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC

关于我可以用这个做什么的任何提示?

I need to find and clean up line breaks, carriage returns, tabs and "SUB"-characters in a set of 400k+ string records, but this DB2 environment is taking a toll on me.

Thought I could do some search and replacing with the REPLACE() and CHR() functions, but it seems CHR() is not available on this system (Error: CHR in *LIBL type *N not found). Working with \t, \r, \n etc doesn't seem to be working either. The chars can be in the middle of strings or at the end of them.

DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC

Any hints on what I can do with this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

謸气贵蔟 2024-12-13 22:54:52

我用这个 SQL 来查找 x'25' 和 x'0D':

SELECT 
     <field>
    , LOCATE(x'0D', <field>) AS "0D" 
    , LOCATE(x'25', <field>) AS "25" 
    , length(trim(<field>)) AS "Length"
FROM <file> 
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 

我用这个 SQL 来替换它们:

UPDATE <file> 
SET <field> = REPLACE(REPLACE(<field>, x'0D', ' '), x'25', ' ')
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 

I used this SQL to find x'25' and x'0D':

SELECT 
     <field>
    , LOCATE(x'0D', <field>) AS "0D" 
    , LOCATE(x'25', <field>) AS "25" 
    , length(trim(<field>)) AS "Length"
FROM <file> 
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 

And I used this SQL to replace them:

UPDATE <file> 
SET <field> = REPLACE(REPLACE(<field>, x'0D', ' '), x'25', ' ')
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 
错爱 2024-12-13 22:54:52

如果您想清除特定字符,例如回车符 (EBCDIC x'0d') 和换行符 (EBCDIC x'25'),您应该找到 EBCDIC 中的翻译字符,然后使用 TRANSLATE() 函数将其替换为空格。

如果您只想删除无法显示的字符,请查找 x'40' 以下的任何内容。

If you want to clear up specific characters like carriage return (EBCDIC x'0d') and line feed (EBCDIC x'25') you should find the translated character in EBCDIC then use the TRANSLATE() function to replace them with space.

If you just want to remove undisplayable characters then look for anything under x'40'.

允世 2024-12-13 22:54:52

以下是将 X'41' 替换为 X'40' 的示例脚本。在我们商店中造成问题的一件事是:

UPDATE [yourfile] SET [yourfield] = TRANSLATE([yourfield], X'40', 
X'41') WHERE [yourfield] like '%' concat X'41' concat '%'    

如果您需要替换多个字符,请将“to”和“from”十六进制字符串扩展到 TRANSLATE 函数中所需的值。

Here is an sample script that replaces X'41' by X'40'. Something that was creating issues at our shop:

UPDATE [yourfile] SET [yourfield] = TRANSLATE([yourfield], X'40', 
X'41') WHERE [yourfield] like '%' concat X'41' concat '%'    

If you need to replace more than one character, extend the "to" and "from" hexadecimal strings to the values you need in the TRANSLATE function.

天暗了我发光 2024-12-13 22:54:52

尝试翻译或替换。

暴力方法涉及使用 POSITION 查找错误字符,然后在其前后使用 SUBSTR。连接两个子字符串(减去不需要的字符)以重新形成列。

字符编码几乎可以肯定是 EBCDIC 字符集之一。根据表最初加载的方式,CR 可能是 x'0d',LF 可能是 x'15'、x'25'。找出答案的一个简单方法是进入绿屏并对桌子进行 DSPPFM。按 F10,然后按 F11 查看原始的十六进制(上/下)格式的表。

Try TRANSLATE or REPLACE.

The brute force method involves using POSITION to find the errant character, then SUBSTR before and after it. CONCAT the two substrings (less the undesirable character) to re-form the column.

The character encoding is almost certainly one of the EBCDIC character sets. Depending on how the table got loaded in the first place, the CR may be x'0d' and the LF x'15', x'25'. An easy way to find out is to get to a green screen and do a DSPPFM against the table. Press F10 then F11 to view the table is raw, hexadecimal (over/under) format.

掩饰不了的爱 2024-12-13 22:54:52

有关可用功能的详细信息,请参阅
DB2 for i5/OS SQL 参考< /a>.

For details on the available functions see the
DB2 for i5/OS SQL Reference.

卖梦商人 2024-12-13 22:54:52

也许TRANSLATE() 函数可以满足您的需求。

    TRANSLATE( data, tochars, fromchars )

...其中 fromchars 是您不需要的字符集,tochars 是您希望替换的相应字符。您可能必须以十六进制格式写出此内容,如 x'nnnnnn...' 并且您需要知道您正在使用的字符集。
在表上使用 DSPFFD 命令应显示字段的 CCSID。

Perhaps the TRANSLATE() function will serve your needs.

    TRANSLATE( data, tochars, fromchars )

...where fromchars is the set of characters you don't want, and tochars is the corresponding characters you want them replaced with. You may have to write this out in hex format, as x'nnnnnn...' and you will need to know what character set you are working with.
Using the DSPFFD command on your table should show the CCSID of your fields.

本王不退位尔等都是臣 2024-12-13 22:54:52

我们费尽心思从平面文件中替换新的行字符和回车符。

最后我们使用下面的sql来解决这个问题。

REPLACE(REPLACE(COLUMN_NAME, CHR(13), ''), CHR(10), '')

尝试一下

CR = CHR(13)
LF = CHR(10) 

we struggled a lot to replace the new line char and carriage return from flat file.

Finally we used below sql to sort the issue.

REPLACE(REPLACE(COLUMN_NAME, CHR(13), ''), CHR(10), '')

Try it out

CR = CHR(13)
LF = CHR(10) 
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文