Word count query in Django

Posted on 2024-10-02 20:41:50

Given a model with both Boolean and TextField fields, I want to run a query that finds records that match some criteria AND have more than "n" words in the TextField. Is this possible? E.g.:

from django.db import models

class Item(models.Model):
    ...
    notes = models.TextField(blank=True)
    has_media = models.BooleanField(default=False)
    completed = models.BooleanField(default=False)
    ...

This is easy:

items = Item.objects.filter(completed=True, has_media=True)

but how can I filter for a subset of those records where the "notes" field has more than, say, 25 words?

Comments (3)

牵强ㄟ 2024-10-09 20:41:50

Try this:

Item.objects.extra(where=["LENGTH(notes) - LENGTH(REPLACE(notes, ' ', ''))+1 > %s"], params=[25])

This code uses Django's extra() queryset method to add a custom WHERE clause. The calculation in the WHERE clause counts the occurrences of the space character, assuming that words are separated by exactly one space. Adding one to the result accounts for the first word.
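To see where the space-counting estimate holds and where it drifts, the same expression can be run on its own. This is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The helper name sql_word_count is hypothetical, and other databases may treat LENGTH differently (MySQL, for example, counts bytes rather than characters).

```python
import sqlite3
from contextlib import closing

# Same arithmetic as the WHERE clause above: number of spaces, plus one.
WORD_COUNT_SQL = "SELECT LENGTH(?) - LENGTH(REPLACE(?, ' ', '')) + 1"


def sql_word_count(text):
    """Return the space-based word estimate SQLite computes for one string."""
    with closing(sqlite3.connect(":memory:")) as conn:
        return conn.execute(WORD_COUNT_SQL, (text, text)).fetchone()[0]


if __name__ == "__main__":
    for sample in ["one two three", "one  two", ""]:
        # Compare the SQL estimate with Python's whitespace-based split().
        print(repr(sample), sql_word_count(sample), len(sample.split()))
```

With single spaces the estimate matches split(). A doubled space is counted as an extra word, and an empty string is reported as one word, which matters for a field declared with blank=True.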

Of course, this calculation is only an approximation to the real word count, so if it has to be precise, I'd do the word count in Python.
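Doing the count in Python could look like the sketch below. The helper names has_more_words_than and filter_by_word_count are hypothetical, and "word" is assumed to mean a run of non-whitespace characters, which is what str.split() yields. The SimpleNamespace objects only stand in for model instances so that the snippet runs without a database.

```python
from types import SimpleNamespace


def has_more_words_than(text, n):
    """True if text contains more than n whitespace-separated words."""
    return len(text.split()) > n


def filter_by_word_count(items, n, field="notes"):
    """Keep the objects whose text attribute has more than n words."""
    return [item for item in items if has_more_words_than(getattr(item, field), n)]


# Stand-ins for Item instances; with Django the iterable would be a queryset,
# e.g. Item.objects.filter(completed=True, has_media=True).
sample_items = [
    SimpleNamespace(notes="just a few words"),
    SimpleNamespace(notes="  spaced   out\nacross lines, but still five words "),
    SimpleNamespace(notes=""),
]
long_items = filter_by_word_count(sample_items, 4)
```

The trade-off is that every candidate row is loaded into memory before it is checked, so it helps to narrow the queryset with the Boolean filters first.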

何时共饮酒 2024-10-09 20:41:50

I don't know what SQL would need to be run for the DB to do the work, which is really what we want, but you can work around it by denormalizing.

Add an extra field named wordcount or something similar, then override the save method so that it counts the words in notes before saving the model.

Then it is trivial to filter on that field, and the denormalized count stays in sync as long as every write goes through save(). Bulk operations such as QuerySet.update() do not call save(), so they would leave it stale.

There might be a better way, but if all else fails, this is what I would do.

最佳男配角 2024-10-09 20:41:50

I am here, 12.5 years later, to give my glorious answer! Jokes aside, I think people might still search for this and want a more modern solution.

You can define a custom function and use it with annotate like this:

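from django.db.models import Func, IntegerField

class WordCount(Func):
    # Counts the spaces in the text and adds one for the first word, so the
    # result approximates the word count when words are separated by single spaces.
    # CHAR_LENGTH is available on MySQL and PostgreSQL; SQLite spells it LENGTH.
    function = 'CHAR_LENGTH'
    name = 'word_count'
    template = "(%(function)s(%(expressions)s) - CHAR_LENGTH(REPLACE(%(expressions)s, ' ', '')) + 1)"
    output_field = IntegerField()

# Posts, likes and abstract are this answer's own example model and fields.
Posts.objects.filter(likes__gte=200).annotate(text_word_count=WordCount("abstract")).filter(text_word_count__gt=25)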