在 Android SQLite 中使用 COLLATE - LIKE 语句中忽略区域设置

发布于 2024-09-14 03:45:42 字数 465 浏览 8 评论 0 原文

在 Android 中创建 SQLite 数据库时,我设置了数据库区域设置 - db.setLocale(new Locale("cz_CZ"))。这是捷克的语言环境。

SELECT 语句起作用并考虑区域设置,例如:

SELECT * from table WHERE name='sctzy' COLLATE LOCALIZED 

将找到条目“ščťžý”。

但使用 LIKE 将失败:

SELECT * from table WHERE name LIKE '%sctzy%' COLLATE LOCALIZED 

不会返回任何行。

顺便提一句。 Android 中没有 java.text.Normalized 类。我想我可以用规范化的文本创建第二列,去掉特殊字符,这将用于搜索 - 但我缺少一个如何规范化字符串的类或方法。

When creating my SQLite database in Android I set the database locale - db.setLocale(new Locale("cz_CZ")). This is a Czech locale.

A SELECT statement works and takes the locale into account, for example:

SELECT * from table WHERE name='sctzy' COLLATE LOCALIZED 

Will find the entry 'ščťžý'.

But using LIKE will fail:

SELECT * from table WHERE name LIKE '%sctzy%' COLLATE LOCALIZED 

No row is returned.

BTW. There is no java.text.Normalized class in Android. I thought I could make a second column with a normalized text, stripped of special characters, which would be used for searching - but I am missing a class or way how to normalize the String.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

童话里做英雄 2024-09-21 03:45:42

您是否看过LIKE 的 SQLite 文档?它提供了有关非 ASCII 字符和错误的信息。也许 Android 安装了旧版本的 SQLite,这是一个问题。

不幸的是,我认为第二个标准化列可能是您的最佳选择。

Have you had a look at the SQLite documentation for LIKE? It has come information about non ASCII characters and a bug. Maybe Android has an older version of SQLite installed where this is a problem.

I think the second normalised column might be your best option unfortunately.

羁拥 2024-09-21 03:45:42

就在今天,我的任务与你完全相同。在我的情况下,创建额外的阴影列不是一种情况,因为我必须搜索多个列。于是我就想到了这样的解决方案,并在实际项目中进行了测试。就我而言,我仅处理小写字母,但您也可以使用大写字母扩展该功能。

db.setLocale(Locale("cz", "CZ"))
val query = "SELECT * FROM table WHERE name GLOB ${getExpr(str)} ORDER BY name COLLATE LOCALIZED ASC"

private fun getExpr(input: String) : String{
    var expr = ""
    for(lettter in input){
        expr += when(lettter){
            's','š' -> "[sš]"
            'a','á' -> "[aá]"
            'e','ě','é' -> "[eěé]"
            'i','í' -> "[ií]"
            'z','ž' -> "[zž]"
            'c','č' -> "[cč]"
            'y','ý' -> "[yý]"
            'r','ř' -> "[rř]"
            'u','ů','ú' -> "[uůú]"
            'o','ó' -> "[oó]"
            'n','ň' -> "[nň]"
            'd','ď' -> "[dď]"
            't','ť' -> "[tť]"
            else -> lettter
        }
     }
     return "'*${expr}*'"
}

Just today i had exactly the same task as you had. And in my situation making additional shadow columns is not a case because i'm having to search more than one column. So i came to the solution like this, which is tested in real project. In my case i'm handling only lower case letters but you can extend the function with upper case letters as well.

db.setLocale(Locale("cz", "CZ"))
val query = "SELECT * FROM table WHERE name GLOB ${getExpr(str)} ORDER BY name COLLATE LOCALIZED ASC"

private fun getExpr(input: String) : String{
    var expr = ""
    for(lettter in input){
        expr += when(lettter){
            's','š' -> "[sš]"
            'a','á' -> "[aá]"
            'e','ě','é' -> "[eěé]"
            'i','í' -> "[ií]"
            'z','ž' -> "[zž]"
            'c','č' -> "[cč]"
            'y','ý' -> "[yý]"
            'r','ř' -> "[rř]"
            'u','ů','ú' -> "[uůú]"
            'o','ó' -> "[oó]"
            'n','ň' -> "[nň]"
            'd','ď' -> "[dď]"
            't','ť' -> "[tť]"
            else -> lettter
        }
     }
     return "'*${expr}*'"
}
不奢求什么 2024-09-21 03:45:42

在 Android sqlite 中,LIKEGLOB 会忽略 COLLATE LOCALIZEDCOLLATE UNICODE (它们仅适用于 排序依据)。但是,正如 @asat 在他的答案中所解释的那样,您可以将 GLOB 与以下模式一起使用:将每个字母替换为该字母的所有可用替代项。在 Java 中:

public static String addTildeOptions(String searchText) {
    return searchText.toLowerCase()
                     .replaceAll("[aáàäâã]", "\\[aáàäâã\\]")
                     .replaceAll("[eéèëê]", "\\[eéèëê\\]")
                     .replaceAll("[iíìî]", "\\[iíìî\\]")
                     .replaceAll("[oóòöôõ]", "\\[oóòöôõ\\]")
                     .replaceAll("[uúùüû]", "\\[uúùüû\\]")
                     .replace("*", "[*]")
                     .replace("?", "[?]");
}

然后(当然,不是字面意思):

SELECT * from table WHERE lower(column) GLOB "*addTildeOptions(searchText)*"

这样,例如在西班牙语中,搜索 masmás 的用户将获得搜索转换转换成m[aáàäâã]s,返回两个结果。

值得注意的是,GLOB 忽略了 COLLATE NOCASE,这就是我在函数和查询中将所有内容都转换为小写的原因。另请注意,sqlite 中的 lower() 函数不适用于非 ASCII 字符 - 但同样,这些字符可能是您已经替换的字符!

该函数还将 GLOB 通配符 *? 替换为“转义”版本。

In Android sqlite, LIKE and GLOB ignore both COLLATE LOCALIZED and COLLATE UNICODE (they only work for ORDER BY). However, as @asat explains in his answer, you can use GLOB with a pattern that will replace each letter with all the available alternatives of that letter. In Java:

public static String addTildeOptions(String searchText) {
    return searchText.toLowerCase()
                     .replaceAll("[aáàäâã]", "\\[aáàäâã\\]")
                     .replaceAll("[eéèëê]", "\\[eéèëê\\]")
                     .replaceAll("[iíìî]", "\\[iíìî\\]")
                     .replaceAll("[oóòöôõ]", "\\[oóòöôõ\\]")
                     .replaceAll("[uúùüû]", "\\[uúùüû\\]")
                     .replace("*", "[*]")
                     .replace("?", "[?]");
}

And then (not literally like this, of course):

SELECT * from table WHERE lower(column) GLOB "*addTildeOptions(searchText)*"

This way, for example in Spanish, a user searching for either mas or más will get the search converted into m[aáàäâã]s, returning both results.

It is important to notice that GLOB ignores COLLATE NOCASE, that's why I converted everything to lower case both in the function and in the query. Notice also that the lower() function in sqlite doesn't work on non-ASCII characters - but again those are probably the ones that you are already replacing!

The function also replaces both GLOB wildcards, * and ?, with "escaped" versions.

慢慢从新开始 2024-09-21 03:45:42

创建第二个标准化列可用于绕过限制(如其他答案中简要提到的)。

这意味着在实践中,您必须创建第一个列的另一个(影子)列,其中存储固定大小写的相同数据(例如所有大写字符)。可以使用相同大小写的搜索值对此新列进行不区分大小写的查询(包括类似查询)。

如果第一列“a”包含

AAA
aaa
Bbb
äää
ééé

第二列 a_shadow 将包含相同的行

AAA
AAA
BBB
阿阿阿
ÉÉÉ

和您的原始查询(示例)“select a from mytable where a='äää'”
将替换为“从 mytable 中选择一个,其中 A='ää'”

您的代码需要更新,以在添加主要内容时填充转换后的影子内容。
如果列是在创建后添加的,或者您无法更改代码,则可能需要使用更新查询来转换现有值。示例:

UPDATE mytable SET a_shadow=UPPER(a);

Creating a second normalised column can be used to go around limitations (as mentioned briefly in other answers).

This means in practice that you have to create another (shadow) column of your first where the same data in a fixed case (e.g. all upper chars) is stored. Case insensitive queries (including like queries) can be made on this new column with search values in the same case.

If the first column "a" contains

AAA
aaa
Bbb
äää
ééé

The second column a_shadow would contain for the same rows

AAA
AAA
BBB
ÄÄÄ
ÉÉÉ

and your original query (example) "select a from mytable where a='äää'"
would be replaced with "select a from mytable where A='ÄÄÄ'"

Your code needs to be updated to fill the converted shadow content when adding the primary content.
If the column is added after creation or you cannot change the code existing values may need to be converted using an update query. Example:

UPDATE mytable SET a_shadow=UPPER(a);

情痴 2024-09-21 03:45:42

可能很耗时,但您可以像这里一样使用 java.text.Normalizer

转换符号、英文字母的重音字母

由于不属于Android的java子集,你可以尝试在java的代码中查找,例如Normalizer.java
此处找到Javadoc:

并复制项目中需要的代码。

希望它有效!

Might be time consuming, but you can use the java.text.Normalizer like here

Converting Symbols, Accent Letters to English Alphabet

As is not part of the java subset that Android, you may try to look for it at the code of java, such as Normalizer.java
With the Javadoc found here:

And copy the part of the code needed inside your project.

Hope it works!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文