DB2/iSeries SQL 清理 CR/LF、制表符等
我需要在一组超过 400k 的字符串记录中查找并清理换行符、回车符、制表符和“SUB”字符,但是这个 DB2 环境对我造成了影响。
我以为我可以做一些搜索并用 REPLACE() 和 CHR() 函数替换,但似乎 CHR() 在此系统上不可用(错误:*LIBL 类型 *N 中的 CHR 未找到)。使用 \t、\r、\n 等似乎也不起作用。字符可以位于字符串的中间或末尾。
DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC
关于我可以用这个做什么的任何提示?
I need to find and clean up line breaks, carriage returns, tabs and "SUB"-characters in a set of 400k+ string records, but this DB2 environment is taking a toll on me.
Thought I could do some search and replacing with the REPLACE() and CHR() functions, but it seems CHR() is not available on this system (Error: CHR in *LIBL type *N not found). Working with \t, \r, \n etc doesn't seem to be working either. The chars can be in the middle of strings or at the end of them.
DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC
Any hints on what I can do with this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
我用这个 SQL 来查找 x'25' 和 x'0D':
我用这个 SQL 来替换它们:
I used this SQL to find x'25' and x'0D':
And I used this SQL to replace them:
如果您想清除特定字符,例如回车符 (EBCDIC x'0d') 和换行符 (EBCDIC x'25'),您应该找到 EBCDIC 中的翻译字符,然后使用 TRANSLATE() 函数将其替换为空格。
如果您只想删除无法显示的字符,请查找 x'40' 以下的任何内容。
If you want to clear up specific characters like carriage return (EBCDIC x'0d') and line feed (EBCDIC x'25') you should find the translated character in EBCDIC then use the TRANSLATE() function to replace them with space.
If you just want to remove undisplayable characters then look for anything under x'40'.
以下是将
X'41'
替换为X'40'
的示例脚本。在我们商店中造成问题的一件事是:如果您需要替换多个字符,请将“to”和“from”十六进制字符串扩展到
TRANSLATE
函数中所需的值。Here is an sample script that replaces
X'41'
byX'40'
. Something that was creating issues at our shop:If you need to replace more than one character, extend the "to" and "from" hexadecimal strings to the values you need in the
TRANSLATE
function.尝试翻译或替换。
暴力方法涉及使用 POSITION 查找错误字符,然后在其前后使用 SUBSTR。连接两个子字符串(减去不需要的字符)以重新形成列。
字符编码几乎可以肯定是 EBCDIC 字符集之一。根据表最初加载的方式,CR 可能是 x'0d',LF 可能是 x'15'、x'25'。找出答案的一个简单方法是进入绿屏并对桌子进行 DSPPFM。按 F10,然后按 F11 查看原始的十六进制(上/下)格式的表。
Try TRANSLATE or REPLACE.
The brute force method involves using POSITION to find the errant character, then SUBSTR before and after it. CONCAT the two substrings (less the undesirable character) to re-form the column.
The character encoding is almost certainly one of the EBCDIC character sets. Depending on how the table got loaded in the first place, the CR may be x'0d' and the LF x'15', x'25'. An easy way to find out is to get to a green screen and do a DSPPFM against the table. Press F10 then F11 to view the table is raw, hexadecimal (over/under) format.
有关可用功能的详细信息,请参阅
DB2 for i5/OS SQL 参考< /a>.
For details on the available functions see the
DB2 for i5/OS SQL Reference.
也许
TRANSLATE()
函数可以满足您的需求。...其中 fromchars 是您不需要的字符集,tochars 是您希望替换的相应字符。您可能必须以十六进制格式写出此内容,如
x'nnnnnn...'
并且您需要知道您正在使用的字符集。在表上使用
DSPFFD
命令应显示字段的 CCSID。Perhaps the
TRANSLATE()
function will serve your needs....where fromchars is the set of characters you don't want, and tochars is the corresponding characters you want them replaced with. You may have to write this out in hex format, as
x'nnnnnn...'
and you will need to know what character set you are working with.Using the
DSPFFD
command on your table should show the CCSID of your fields.我们费尽心思从平面文件中替换新的行字符和回车符。
最后我们使用下面的sql来解决这个问题。
尝试一下
we struggled a lot to replace the new line char and carriage return from flat file.
Finally we used below sql to sort the issue.
Try it out