How do I sort a SQLite query while ignoring articles ("the", "a", etc.)?

Posted 2024-09-19 00:37:17

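I'm using C# to display a list of movie titles that I pull from a SQLite database. Currently, I'm using a custom ListBox class that has a function to sort the text while stripping the word "The" from the beginning of every item. However, that doesn't seem like the simplest way to do it, since it queries the SQLite database and then sorts afterwards. I'd prefer to cut it down to one step, ideally sorting straight from the database in my "SELECT" query.

I've done some searching on this and found some suggestions, including creating an extra sort column in the database. While this is certainly a possibility, I'm wondering whether there are any simpler options that don't require inserting almost identical duplicate information (especially if the database grows larger). I'm pretty new to SQLite, but I've read something about creating a collation function that can be used to define custom orderings. However, I'm not sure whether this is the right use for it, and I can't seem to find any help on implementing it in C#.

I was hoping someone might be able to share some guidance. If an extra sort column is the best way to go, then that is what I'll do.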

忆梦 2024-09-26 00:37:33

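Create a virtual column (the result of a function that can be implemented in C#) and sort on that virtual column. The function could move "The" to the end, as in "Three Musketeers, The", or discard "The", whatever you want it to do.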
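
As a rough sketch of that idea, the snippet below registers an application-defined function with SQLite and sorts on its result, so nothing extra is stored in the table. It uses Python's built-in sqlite3 module only as a convenient harness; the `movies` table, the `sort_title` function name and the article list are assumptions for illustration, and a C# program would register an equivalent function through whichever SQLite provider it uses.

```python
import sqlite3

ARTICLES = ("the ", "an ", "a ")  # assumed list of leading articles to move


def sort_title(title):
    """Move a leading article to the end of the title, leaving other titles as-is."""
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            cut = len(article)
            return f"{title[cut:]}, {title[:cut].strip()}"
    return title


def titles_in_order(conn):
    # sort_title() is evaluated per row at query time; the table keeps only the title.
    rows = conn.execute(
        "SELECT title FROM movies ORDER BY sort_title(title) COLLATE NOCASE"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.create_function("sort_title", 1, sort_title)  # expose the function to SQL
conn.execute("CREATE TABLE movies (title TEXT)")  # hypothetical schema
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("The Three Musketeers",), ("Zodiac",), ("A Beautiful Mind",), ("Alien",)],
)
print(titles_in_order(conn))
```

Because the function runs for every row, SQLite cannot use an ordinary index on `title` for this ordering; that is the trade-off against a stored sort column.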

稀香 2024-09-26 00:37:32

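You could try building a full-text search index on the title (SQLite provides this through its FTS extension). Then you'll be able to do fast searches on any word in the title without a lot of extra work. For example, a user query for "good bad ugly" might produce "The Good, the Bad and the Ugly" as one of its first results. The extra cost of all this is generally about a quarter of the length of the text itself, but it might be more for your dataset, as titles aren't full English text. You also need to spend the time building those extra indices (you obviously don't want to build them on your main dataset on a live system), but that shouldn't be too big a problem.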
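
A minimal sketch of such an index follows, assuming the SQLite build in use was compiled with the FTS5 module (many are, but not all). The `title_index` table and both helper names are hypothetical, and Python's sqlite3 module is used only to run the SQL.

```python
import sqlite3


def build_title_index(titles):
    """Create an in-memory FTS5 table of titles (raises OperationalError without FTS5)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE title_index USING fts5(title)")
    conn.executemany(
        "INSERT INTO title_index (title) VALUES (?)", [(t,) for t in titles]
    )
    return conn


def search_titles(conn, words):
    """Return titles containing every word in `words`, best match first."""
    # Quote each word so it is treated as a plain term; FTS5 ANDs the terms together.
    query = " ".join('"' + w.replace('"', '""') + '"' for w in words.split())
    rows = conn.execute(
        "SELECT title FROM title_index WHERE title_index MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

Calling `search_titles(conn, "good bad ugly")` on an index that contains "The Good, the Bad and the Ugly" finds that title regardless of the articles in it. This helps with searching rather than with sort order, so it complements the ORDER BY approaches instead of replacing them.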

最美的太阳 2024-09-26 00:37:31

Within C# Code

If you wanted to do this within C#, use LINQ to do the ordering for you. I've posted a full sample on PasteBin. This will allow you to:

avoid duplicating data in your database
take advantage of DB indexes as you normally would, no matter which RDBMS
put in noise words in a config file, thereby reducing downtime/rebuild/redeploy when modifying the list
ensure a solution is more readable in your client code

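// Requires: using System; using System.Linq;
DropDownList1.DataSource = myBooks.OrderBy(n => ReplaceNoise(n.Title));

public string ReplaceNoise(string input)
{
     // Leading articles to ignore when sorting.
     string[] noise = new string[] { "the", "an", "a" };

     //surely this could be LINQ'd
     foreach (string n in noise)
     {
         // Require a trailing space so only a whole leading word matches;
         // "There Will Be Blood" and "Avatar" are left unchanged.
         if (input.StartsWith(n + " ", StringComparison.OrdinalIgnoreCase))
         {
             return input.Substring(n.Length).Trim();
         }
     }
     return input;
}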

Within your SQLite statement

How about simply replacing the noise words with blanks in the ORDER BY? It's an ugly first step, but strongly consider a new column to store this value for sorting purposes. Note that SQLite's REPLACE is case-sensitive and replaces every occurrence, not just a leading one, so match the capitalization you store and include the trailing space.

ORDER BY REPLACE(REPLACE([title],'The ',''),'A ','')

Admittedly, this gets ugly when you end up with this:

REPLACE(REPLACE(REPLACE(REPLACE([title],'The ',''),'A ',''),'of ',''),'by ','')
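
To see what the nested REPLACE calls actually produce as sort keys, here is a small sketch run through Python's sqlite3 module. The `movies` table, the sample rows and the `sort_keys` helper are assumptions for illustration. It compares patterns without trailing spaces against patterns that include capitalization and a trailing space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")  # hypothetical schema
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("The Matrix",), ("A Bug's Life",), ("Batman",), ("the Abyss",)],
)


def sort_keys(expression):
    """Return {title: sort key} for the given SQL expression over movies.title."""
    rows = conn.execute(f"SELECT title, {expression} FROM movies")
    return dict(rows)


# Bare 'the' and 'a': case-sensitive, and letters inside words are removed too.
loose = sort_keys("REPLACE(REPLACE(title, 'the', ''), 'a', '')")
# 'The ' and 'A ': only these exact, space-terminated spellings are removed.
strict = sort_keys("REPLACE(REPLACE(title, 'The ', ''), 'A ', '')")

for title in loose:
    print(f"{title!r}: loose={loose[title]!r} strict={strict[title]!r}")
```

With the bare patterns, "Batman" becomes "Btmn" while "The Matrix" keeps its capitalized article, which is why this approach needs care. Neither variant handles a lowercase "the Abyss" cleanly, which is another argument for a dedicated sort column or a CASE expression.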

没有伤那来痛 2024-09-26 00:37:28

You could store each title in 2 parts: title and prefix.

With SQLite you can combine 2 string values via the || operator, also known as the concatenation operator.

Here's an example:

SELECT prefix || ' ' || title FROM movies ORDER BY title

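You can also use ltrim in case prefix is empty, so you don't have a space at the front. If prefix can be NULL rather than an empty string, wrap it in coalesce, because concatenating NULL with || yields NULL:

SELECT ltrim(coalesce(prefix, '') || ' ' || title) FROM movies ORDER BY title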

Another alternative is to store the prefix at the end of the title. For example, at a lot of movie stores you will see something like:

Three Musketeers, The
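
Here is a small sketch of the two-column layout, run through Python's sqlite3 module. The sample rows and both helper names are assumptions, and `coalesce` is added so that a NULL prefix does not turn the whole concatenation into NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (prefix TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("The", "Three Musketeers"),
        (None, "Alien"),       # no article, stored as NULL
        ("", "Zodiac"),        # no article, stored as an empty string
        ("A", "Beautiful Mind"),
    ],
)


def display_titles():
    """Full titles for display, ordered by the article-free title column."""
    rows = conn.execute(
        "SELECT ltrim(coalesce(prefix, '') || ' ' || title) "
        "FROM movies ORDER BY title"
    )
    return [row[0] for row in rows]


def store_style_titles():
    """Titles with the article moved to the end, as in 'Three Musketeers, The'."""
    rows = conn.execute(
        "SELECT CASE WHEN coalesce(prefix, '') = '' THEN title "
        "ELSE title || ', ' || prefix END "
        "FROM movies ORDER BY title"
    )
    return [row[0] for row in rows]


print(display_titles())
print(store_style_titles())
```

Since the ordering uses the plain `title` column, an ordinary index on that column can serve the sort, which is the main advantage of splitting the article out.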

黑凤梨 2024-09-26 00:37:26

Here is the solution:

ORDER BY (CASE 
    WHEN sortTitle LIKE 'the %' THEN substr(sortTitle,5) 
    WHEN sortTitle LIKE 'a %' THEN substr(sortTitle,3) 
    WHEN sortTitle LIKE 'an %' THEN substr(sortTitle,4) 
    ELSE sortTitle END)
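
To try the expression out, the sketch below runs it against a few sample rows with Python's sqlite3 module. The `movies` table name and the sample data are assumptions; only the `sortTitle` column and the CASE expression come from the answer. It relies on SQLite's LIKE being case-insensitive for ASCII letters by default, and on substr positions being 1-based.

```python
import sqlite3

ORDER_BY_IGNORING_ARTICLES = """
    ORDER BY (CASE
        WHEN sortTitle LIKE 'the %' THEN substr(sortTitle, 5)
        WHEN sortTitle LIKE 'a %'   THEN substr(sortTitle, 3)
        WHEN sortTitle LIKE 'an %'  THEN substr(sortTitle, 4)
        ELSE sortTitle END)
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (sortTitle TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [
        ("The Matrix",),
        ("an Education",),
        ("A Bug's Life",),
        ("Theatre of Blood",),  # starts with "The" but not with "The "
        ("Casablanca",),
    ],
)


def sorted_titles():
    """Return all titles ordered with a leading the/a/an ignored."""
    rows = conn.execute("SELECT sortTitle FROM movies" + ORDER_BY_IGNORING_ARTICLES)
    return [row[0] for row in rows]


print(sorted_titles())
```

The space inside each LIKE pattern is what keeps "Theatre of Blood" from being treated as if it began with an article.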
