Django: comparison on extra fields
Short question: Is there a way in Django to find the next row, based on the alphabetical order of some field, in a case-insensitive way?
Long question: I have some words in the database, and a detail view for them. I would like to be able to browse the words in alphabetical order. So I need to find out the id of the previous and next word in alphabetical order. Right now what I do is the following (original is the field that stores the name of the word):
class Word(models.Model):
    original = models.CharField(max_length=50)
    ...

    def neighbours(self):
        """
        Returns the words adjacent to a given word, in alphabetical order
        """
        previous_words = Word.objects.filter(
            original__lt=self.original).order_by('-original')
        next_words = Word.objects.filter(
            original__gt=self.original).order_by('original')
        previous = previous_words[0] if len(previous_words) else None
        next = next_words[0] if len(next_words) else None
        return previous, next
The problem is that this does a case-sensitive comparison, so Foo appears before bar, which is not what I want. To avoid this problem in another view, where I list all words, I use a custom model manager that adds an extra field, like this:
class CaseInsensitiveManager(models.Manager):
    def get_query_set(self):
        """
        Also adds an extra 'lower' field which is useful for ordering
        """
        return super(CaseInsensitiveManager, self).get_query_set().extra(
            select={'lower': 'lower(original)'})
and in the definition of Word I add
    objects = models.Manager()
    alpha = CaseInsensitiveManager()
In this way I can do queries like
Word.alpha.all().order_by('lower')
and get all words in alphabetical order, regardless of the case. But I cannot do
class Word(models.Model):
    original = models.CharField(max_length=50)
    ...
    objects = models.Manager()
    alpha = CaseInsensitiveManager()

    def neighbours(self):
        previous_words = Word.objects.filter(
            lower__lt=self.lower()).order_by('-lower')
        next_words = Word.objects.filter(
            lower__gt=self.lower()).order_by('lower')
        previous = previous_words[0] if len(previous_words) else None
        next = next_words[0] if len(next_words) else None
        return previous, next
Indeed Django will not accept field lookups based on extra fields. So, what am I supposed to do (short of writing custom SQL)?
Bonus questions: I see at least two more problems in what I am doing. First, I'm not sure about performance. I assume that no queries at all are performed when I define previous_words and next_words, and that the only lookup in the database happens when I define previous and next, yielding a query which is more or less
SELECT Word.original, ..., lower(Word.original) AS lower
FROM Word
WHERE lower < 'foo'
ORDER BY lower DESC
LIMIT 1
Is this right? Or am I doing something which will slow down the database too much? I don't know enough details about the inner workings of the Django ORM.
The second problem is that I actually have to cope with words in different languages. Given that I know the language of each word, is there a way to get them in alphabetical order even if they contain non-ASCII characters? For instance, I'd want to have méchant, moche in this order, but I get moche, méchant.
The database should be able to do this sorting for you, and it should be able to do so without the "lower" function.
Really what you need to fix is the database collation and encoding.
For example, if you are using MySQL you could use the character set utf8 and the collation utf8_general_ci.
If that collation doesn't work for you, you can try other collations depending on your needs and database. But using an extra field and a function in the query is an ugly workaround that is going to slow the app down.
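To illustrate the idea, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL or the Django ORM. It assumes SQLite's built-in NOCASE collation as a stand-in for a case-insensitive collation such as utf8_general_ci, and the table name word and the helper neighbours are made up for the example. Once the column carries the collation, plain < and > comparisons and ORDER BY already ignore case, so no lower() call is needed.

```python
import sqlite3

# Assumption: SQLite's built-in NOCASE collation stands in for a
# case-insensitive collation such as MySQL's utf8_general_ci.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word (id INTEGER PRIMARY KEY, original TEXT COLLATE NOCASE)"
)
conn.executemany(
    "INSERT INTO word (original) VALUES (?)",
    [("bar",), ("Foo",), ("apple",), ("Zed",)],
)


def neighbours(conn, original):
    """Return (previous, next) words around `original`, ignoring case."""
    # Comparisons and ordering use the column's NOCASE collation.
    previous = conn.execute(
        "SELECT original FROM word WHERE original < ? "
        "ORDER BY original DESC LIMIT 1",
        (original,),
    ).fetchone()
    following = conn.execute(
        "SELECT original FROM word WHERE original > ? "
        "ORDER BY original ASC LIMIT 1",
        (original,),
    ).fetchone()
    return (
        previous[0] if previous else None,
        following[0] if following else None,
    )
```

With this data, neighbours(conn, "Foo") returns ("bar", "Zed"). Each lookup is a single LIMIT 1 query. Note that NOCASE only folds ASCII letters, so it does not help with accented characters.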
There are many collation options available in MySQL and PostgreSQL too:
http://dev.mysql.com/doc/refman/5.5/en/charset-mysql.html
http://stackoverflow.com/questions/1423378/postgresql-utf8-character-comparison
But this is definitely a good chance to optimise at the db level.
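As a rough sketch of what an accent-aware collation does, the example below registers a custom collation with Python's sqlite3 module. The collation name ACCENT_FOLD and the helper fold are hypothetical names chosen for this example, and stripping accents is only an approximation: a real language-specific collation in MySQL or PostgreSQL applies proper per-language rules.

```python
import sqlite3
import unicodedata


def fold(text):
    """Strip combining accents and case, e.g. 'Méchant' -> 'mechant'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def accent_insensitive(a, b):
    """Collation callback: negative, zero or positive, like C's strcmp."""
    # Fall back to the raw string so that distinct words never compare equal.
    key_a, key_b = (fold(a), a), (fold(b), b)
    return (key_a > key_b) - (key_a < key_b)


conn = sqlite3.connect(":memory:")
# "ACCENT_FOLD" is a hypothetical collation name chosen for this sketch.
conn.create_collation("ACCENT_FOLD", accent_insensitive)
conn.execute("CREATE TABLE word (original TEXT)")
conn.executemany(
    "INSERT INTO word (original) VALUES (?)",
    [("moche",), ("méchant",), ("Zèbre",), ("abricot",)],
)

ordered = [
    row[0]
    for row in conn.execute(
        "SELECT original FROM word ORDER BY original COLLATE ACCENT_FOLD"
    )
]
```

Here ordered places méchant before moche, whereas the default binary comparison puts moche first because é has a higher code point than o.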