DB2 区分大小写

发布于 2024-09-26 20:20:18 字数 1316 浏览 4 评论 0原文

我很难使我的 DB2 (AS/400) 查询不区分大小写。

例如:

SELECT *
FROM NameTable
WHERE LastName = 'smith'

不会返回任何结果,但以下返回 1000 个结果:

SELECT *
FROM NameTable
WHERE LastName = 'Smith'

我读过将 SortSequence/SortType 放入连接字符串中,但没有运气......有人有这方面的经验吗?

编辑:

这是存储过程:

BEGIN
DECLARE CR CURSOR FOR
SELECT  T . ID ,
    T . LASTNAME ,
    T . FIRSTNAME ,
    T . MIDDLENAME ,
    T . STREETNAME || ' ' || T . ADDRESS2 || ' ' || T . CITY || ' ' || T . STATE || ' ' || T . ZIPCODE AS ADDRESS ,
    T . GENDER ,
    T . DOB ,
    T . SSN ,
    T . OTHERINFO ,
    T . APPLICATION
FROM
    ( SELECT R . * , ROW_NUMBER ( ) OVER ( ) AS ROW_NUM
    FROM CPSAB32.VW_MYVIEW
    WHERE R . LASTNAME = IFNULL ( @LASTNAME , LASTNAME )
    AND R . FIRSTNAME = IFNULL ( @FIRSTNAME , FIRSTNAME )
    AND R . MIDDLENAME = IFNULL ( @MIDDLENAME , MIDDLENAME )
    AND R . DOB = IFNULL ( @DOB , DOB )
    AND R . STREETNAME = IFNULL ( @STREETNAME , STREETNAME )
    AND R . CITY = IFNULL ( @CITY , CITY )
    AND R . STATE = IFNULL ( @STATE , STATE )
    AND R . ZIPCODE = IFNULL ( @ZIPCODE , ZIPCODE )
    AND R . SSN = IFNULL ( @SSN , SSN )
    FETCH FIRST 500 ROWS ONLY )
AS T
WHERE ROW_NUM <= @MAXRECORDS
OPTIMIZE FOR 500 ROW ;

OPEN CR ;
RETURN ;

I'm having great difficultly making my DB2 (AS/400) queries case insensitive.

For example:

SELECT *
FROM NameTable
WHERE LastName = 'smith'

Will return no results, but the following returns 1000's of results:

SELECT *
FROM NameTable
WHERE LastName = 'Smith'

I've read of putting SortSequence/SortType into your connection string but have had no luck... anyone have exepierence with this?

Edit:

Here's the stored procedure:

BEGIN
DECLARE CR CURSOR FOR
SELECT  T . ID ,
    T . LASTNAME ,
    T . FIRSTNAME ,
    T . MIDDLENAME ,
    T . STREETNAME || ' ' || T . ADDRESS2 || ' ' || T . CITY || ' ' || T . STATE || ' ' || T . ZIPCODE AS ADDRESS ,
    T . GENDER ,
    T . DOB ,
    T . SSN ,
    T . OTHERINFO ,
    T . APPLICATION
FROM
    ( SELECT R . * , ROW_NUMBER ( ) OVER ( ) AS ROW_NUM
    FROM CPSAB32.VW_MYVIEW
    WHERE R . LASTNAME = IFNULL ( @LASTNAME , LASTNAME )
    AND R . FIRSTNAME = IFNULL ( @FIRSTNAME , FIRSTNAME )
    AND R . MIDDLENAME = IFNULL ( @MIDDLENAME , MIDDLENAME )
    AND R . DOB = IFNULL ( @DOB , DOB )
    AND R . STREETNAME = IFNULL ( @STREETNAME , STREETNAME )
    AND R . CITY = IFNULL ( @CITY , CITY )
    AND R . STATE = IFNULL ( @STATE , STATE )
    AND R . ZIPCODE = IFNULL ( @ZIPCODE , ZIPCODE )
    AND R . SSN = IFNULL ( @SSN , SSN )
    FETCH FIRST 500 ROWS ONLY )
AS T
WHERE ROW_NUM <= @MAXRECORDS
OPTIMIZE FOR 500 ROW ;

OPEN CR ;
RETURN ;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

清君侧 2024-10-03 20:20:18

为什么不这样做:

WHERE lower(LastName) = 'smith'

如果您担心性能(即查询不使用索引),请记住 DB2 有函数索引,您可以阅读 此处。因此本质上,您可以在 upper(LastName) 上创建索引。

编辑
要执行我在评论中讨论的调试技术,您可以执行以下操作:

create table log (msg varchar(100, dt date);

然后在 SP 中,您可以将消息插入到此表中以进行调试:

insert into log (msg, dt) select 'inside the SP', current_date from sysibm.sysdummy1;

然后在 SP 运行后,您可以从此日志表中选择以查看哪些内容发生了。

Why not do this:

WHERE lower(LastName) = 'smith'

If you're worried about performance (i.e. the query not using an index), keep in mind that DB2 has function indexes, which you can read about here. So essentially, you can create an index on upper(LastName).

EDIT
To do the debugging technique I discussed in the comments, you could do something like this:

create table log (msg varchar(100, dt date);

Then in your SP, you can insert messages to this table for debugging purposes:

insert into log (msg, dt) select 'inside the SP', current_date from sysibm.sysdummy1;

Then after the SP runs, you can select from this log table to see what happened.

吾家有女初长成 2024-10-03 20:20:18

如果您希望在过程中不区分大小写,请尝试在其中使用此选项:

SET OPTION SRTSEQ = *LANGIDSHR ;

您还应该创建一个索引来支持它的性能。当您将 *LANGIDSHR 作为连接属性时创建索引,然后共享权重索引应该可用于以后的作业。 (有多种方法可以使适当的设置生效。)

*LANGIDSHR 与您的作业的语言 ID 相关。字符集中可能被视为“等于”的字符(例如“A”和“a”或“ü”和“u”)应被赋予相同的权重(共享),以便一起选择。

If you want case-insensitive in your procedure, try using this option in it:

SET OPTION SRTSEQ = *LANGIDSHR ;

You should also create an index to support it for performance. Create the index when you have *LANGIDSHR as a connection attribute, and the shared-weight index should then be available to later jobs. (There are various ways to get the appropriate setting into effect.)

*LANGIDSHR relates to the language-ID for your jobs. Characters in the character set that might be considered as "equals", such as 'A' and 'a' or 'ü' and 'u', should be given equal weights (shared) and so select together.

素衣风尘叹 2024-10-03 20:20:18

当我想要不区分大小写的搜索时,我做了类似的事情。我使用了UPPER(mtfield) = 'SEARCHSTRING'。我知道这有效。

I did something similar when I wanted a case insensitive search. I used UPPER(mtfield) = 'SEARCHSTRING'. I know this works.

丑丑阿 2024-10-03 20:20:18

请参阅:https://stackoverflow.com/a/47181640/5507619

数据库设置

您可以设置一个数据库配置设置在 数据库创建。但它是基于 unicode 的。

CREATE DATABASE yourDB USING COLLATE UCA500R1_S1

默认的 Unicode 排序算法由 UCA500R1 关键字实现,没有任何属性。由于默认 UCA 无法同时包含 Unicode 支持的每种语言的整理顺序,因此可以指定可选属性来自定义 UCA 排序。属性由下划线 (_) 字符分隔。 UCA500R1 关键字和任何属性形成 UCA 排序规则名称。

强度属性确定在整理或比较文本字符串时是否考虑重音或大小写。在没有大小写或重音的书写系统中,强度属性控制着类似的重要功能。
可能的值为:一级 (1)、二级 (2)、三级 (3)、四级 (4) 和同一性 (I)。忽略:

  • 重音和大小写,使用主要强度级别
  • 仅限案例,使用次要强度级别
  • 无论重音还是大小写,都使用第三强度级别

几乎所有字符都可以通过前三个强度级别来区分,因此在大多数语言环境中,默认强度属性设置为第三级。但是,如果将“替代”属性(如下所述)设置为“移位”,则可以使用四进制强度级别来打破空白字符、标点符号和符号之间的联系,否则这些字符将被忽略。身份强度级别用于区分相似的字符,例如数学粗体小 A 字符 (U+1D41A) 和数学斜体小 A 字符 (U+1D44E)。

将“强度”属性设置为更高级别会减慢文本字符串比较速度并增加排序键的长度。
示例:

  • UCA500R1_S1 将整理“role”=“Role”=“rôle”
  • UCA500R1_S2 将整理“role”=“Role”< “角色”
  • UCA500R1_S3 将整理“角色”< “角色”< “角色”

这对我有用。正如您所看到的,..._S2 也忽略大小写。

使用 较新的标准版本,它应该如下所示:

CREATE DATABASE yourDB USING COLLATE CLDR181_S1

排序规则关键字
UCA400R1 = Unicode 标准 4.0 = CLDR 版本 1.2
UCA500R1 = Unicode 标准 5.0 = CLDR 版本 1.5.1
CLDR181 = Unicode 标准 5.2 = CLDR 版本 1.8.1

如果您的数据库已创建,则应该有一种方法更改设置

CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );

我确实在执行此操作时遇到问题,但据我所知它应该有效。

生成的表行

其他选项例如 生成大写行

CREATE TABLE t (
   id          INTEGER  NOT NULL  PRIMARY KEY,
   str         VARCHAR(500),
   ucase_str   VARCHAR(500)  GENERATED ALWAYS AS ( UPPER(str) )
)@

INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )@

SELECT * FROM t@

ID          STR                                  UCASE_STR
----------- ------------------------------------ ------------------------------------
          1 Some String                          SOME STRING

  1 record(s) selected.

See: https://stackoverflow.com/a/47181640/5507619

Database setting

There is a database config setting you can set at database creation. It's based on unicode, though.

CREATE DATABASE yourDB USING COLLATE UCA500R1_S1

The default Unicode Collation Algorithm is implemented by the UCA500R1 keyword without any attributes. Since the default UCA cannot simultaneously encompass the collating sequence of every language supported by Unicode, optional attributes can be specified to customize the UCA ordering. The attributes are separated by the underscore (_) character. The UCA500R1 keyword and any attributes form a UCA collation name.

The Strength attribute determines whether accent or case is taken into account when collating or comparing text strings. In writing systems without case or accent, the Strength attribute controls similarly important features.
The possible values are: primary (1), secondary (2), tertiary (3), quaternary (4), and identity (I). To ignore:

  • accent and case, use the primary strength level
  • case only, use the secondary strength level
  • neither accent nor case, use the tertiary strength level

Almost all characters can be distinguished by the first three strength levels, therefore in most locales the default Strength attribute is set at the tertiary level. However if the Alternate attribute (described below) is set to shifted, then the quaternary strength level can be used to break ties among white space characters, punctuation marks, and symbols that would otherwise be ignored. The identity strength level is used to distinguish among similar characters, such as the MATHEMATICAL BOLD SMALL A character (U+1D41A) and the MATHEMATICAL ITALIC SMALL A character (U+1D44E).

Setting the Strength attribute to higher level will slow down text string comparisons and increase the length of the sort keys.
Examples:

  • UCA500R1_S1 will collate "role" = "Role" = "rôle"
  • UCA500R1_S2 will collate "role" = "Role" < "rôle"
  • UCA500R1_S3 will collate "role" < "Role" < "rôle"

This worked for me. As you can see, ..._S2 ignores case, too.

Using a newer standard version, it should look like this:

CREATE DATABASE yourDB USING COLLATE CLDR181_S1

Collation keywords:
UCA400R1 = Unicode Standard 4.0 = CLDR version 1.2
UCA500R1 = Unicode Standard 5.0 = CLDR version 1.5.1
CLDR181 = Unicode Standard 5.2 = CLDR version 1.8.1

If your database is already created, there is supposed to be a way to change the setting.

CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );

I do have problems executing this, but for all I know it is supposed to work.

Generated table row

Other options are e.g. generating a upper case row:

CREATE TABLE t (
   id          INTEGER  NOT NULL  PRIMARY KEY,
   str         VARCHAR(500),
   ucase_str   VARCHAR(500)  GENERATED ALWAYS AS ( UPPER(str) )
)@

INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )@

SELECT * FROM t@

ID          STR                                  UCASE_STR
----------- ------------------------------------ ------------------------------------
          1 Some String                          SOME STRING

  1 record(s) selected.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文