Counting records that belong to a category in MySQL

Posted on 2024-09-17 20:48:55

I've been battling with some SQL and can't seem to get my head around it.

I have two tables, one with the list of categories and another with all my articles.

What I'm trying to do is find how many articles there are in each category.

Here is the SQL I have so far:

SELECT DISTINCT COUNT( po.post_Cat_ID ) AS Occurances, ca.cat_Title
FROM Posts po, Categories ca
WHERE ca.cat_ID = LEFT( po.post_Cat_ID, 2 )

I use LEFT to pick out only the main categories, because my category IDs are structured like this, for example:

Science = 01
Medicine = 0101
Sport = 02

A post about, say, aspirin would therefore have a cat_ID of 0101 (LEFT then trims 0101, 0102, 0103, etc. to just 01). Basically, I'm not interested in the subcategories.

Thanks in advance

Result

SELECT DISTINCT COUNT( po.post_Cat_ID ) AS Occurances, ca.cat_Title
FROM Posts po, Categories ca
WHERE ca.cat_ID = LEFT( po.post_Cat_ID, 2 )
GROUP BY LEFT( po.post_Cat_ID, 2 )

P.S. Thanks @nullpointer, it works for the moment; I'll look into restructuring.
For other readers, here's the link again:

http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

云雾 2024-09-24 20:48:55

Add a column to Categories that gives the main category each category belongs to (main categories point to themselves). So:

cat_id | main_cat_id | title
-------+-------------+---------
01     | 01          | Science
0101   | 01          | Medicine
02     | 02          | Sport

Select from this on cat_id = main_cat_id to find main categories; join back onto itself on left.cat_id = right.main_cat_id to find the child categories, then onto posts on cat_id = cat_id. Group by left.cat_id and project over cat_id and count(*).

I tried this in PostgreSQL 8.4, and I don't see why it wouldn't work in MySQL, as the query is pretty basic. My tables:

create table categories(
  cat_id varchar(40) primary key,
  main_cat_id varchar(40) not null references categories,
  title varchar(40) not null
);

create table posts (
  post_id integer primary key,
  cat_id varchar(40) not null references categories,
  title varchar(40) not null
);

My query (grouping by title rather than ID):

select m.title, count(*)
from categories m, categories c, posts p
where m.cat_id = c.main_cat_id
  and c.cat_id = p.cat_id
group by m.title;

UPDATE: I also had a shot at making this work with a string operation, as the OP tried. The query (in standard-compliant SQL as accepted by PostgreSQL, rather than MySQL's dialect) is:

select m.title, count(*)
from categories m, posts p
where m.cat_id = substring(p.cat_id from 1 for 2)
group by m.title;

This works fine. I can't offer a meaningful speed comparison, but the query plan for this did look a bit simpler than the one for the two-way join.
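
As a rough, runnable illustration of the two queries above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database standing in for MySQL or PostgreSQL. The sample posts are made up for the example, and SQLite's substr(x, 1, 2) is used in place of LEFT(x, 2) or substring(x from 1 for 2), which SQLite does not provide.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL/PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        cat_id      TEXT PRIMARY KEY,
        main_cat_id TEXT NOT NULL REFERENCES categories (cat_id),
        title       TEXT NOT NULL
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        cat_id  TEXT NOT NULL REFERENCES categories (cat_id),
        title   TEXT NOT NULL
    );
    INSERT INTO categories VALUES
        ('01',   '01', 'Science'),
        ('0101', '01', 'Medicine'),
        ('02',   '02', 'Sport');
    -- Hypothetical sample posts, invented purely for illustration.
    INSERT INTO posts VALUES
        (1, '0101', 'Aspirin'),
        (2, '01',   'Physics'),
        (3, '02',   'Football');
""")

# Self-join version: m is a main category, c is any category that points to it.
join_counts = conn.execute("""
    SELECT m.title, COUNT(*)
    FROM categories m
    JOIN categories c ON m.cat_id = c.main_cat_id
    JOIN posts p      ON c.cat_id = p.cat_id
    GROUP BY m.title
    ORDER BY m.title
""").fetchall()

# String-prefix version: match a main category on the first two characters
# of the post's category ID (substr here, LEFT(p.cat_id, 2) in MySQL).
prefix_counts = conn.execute("""
    SELECT m.title, COUNT(*)
    FROM categories m
    JOIN posts p ON m.cat_id = substr(p.cat_id, 1, 2)
    GROUP BY m.title
    ORDER BY m.title
""").fetchall()

print(join_counts)
print(prefix_counts)
```

With this sample data both queries give the same per-main-category counts. Note that both use inner joins, so a main category with no posts at all would simply be missing from the result rather than showing a count of 0.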

撕心裂肺的伤痛 2024-09-24 20:48:55

Let me suggest restructuring the schema instead. What you want here is to represent a hierarchical structure (categories), which is not really straightforward to do with relational databases. Two common solutions are the adjacency list and the nested set.

The adjacency list is the more straightforward, tree-like structure. You'll have a categories table like:

id  | name      | parent
------------------------
1   | Science   | null
2   | Sports    | null
3   | Medicine  | 1

Unfortunately this model is hard to work with using SQL. Instead, we can use the nested set approach. Here every node has lft and rgt values, and a child's values lie between its parent's lft and rgt values. In your example you'll have:

id  | name      | lft  | rgt  
-------------------------------
1   | Science   | 1    | 4    
2   | Sports    | 5    | 6 
3   | Medicine  | 2    | 3

So in order to retrieve a count for a certain category, you can simply count the articles whose category has lft and rgt values between the lft and rgt of the category you want. For example, for Science (lft 1, rgt 4):

SELECT COUNT(*)
  FROM articles a
  JOIN categories c ON a.category_id = c.id
 WHERE c.lft BETWEEN 1 AND 4
   AND c.rgt BETWEEN 1 AND 4

This assumes your articles table looks like:

id  | ... | category_id

This is discussed in more detail at:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
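
To get a count for every top-level category in one go, rather than hard-coding 1 and 4, the nested set table can be joined to itself. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module with an in-memory database in place of MySQL, the sample articles are invented, and "top-level" is taken to mean a category that no other category encloses.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        lft  INTEGER NOT NULL,
        rgt  INTEGER NOT NULL
    );
    CREATE TABLE articles (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories (id)
    );
    INSERT INTO categories VALUES
        (1, 'Science',  1, 4),
        (2, 'Sports',   5, 6),
        (3, 'Medicine', 2, 3);
    -- Hypothetical sample articles, invented purely for illustration.
    INSERT INTO articles VALUES
        (1, 'Aspirin',  3),
        (2, 'Physics',  1),
        (3, 'Football', 2);
""")

# parent: a top-level category (no other category's lft/rgt encloses it).
# node:   the parent itself or any descendant (its lft falls inside the parent).
# The LEFT JOIN plus COUNT(a.id) keeps categories that have no articles (count 0).
top_level_counts = conn.execute("""
    SELECT parent.name, COUNT(a.id)
    FROM categories parent
    JOIN categories node ON node.lft BETWEEN parent.lft AND parent.rgt
    LEFT JOIN articles a ON a.category_id = node.id
    WHERE NOT EXISTS (
        SELECT 1
        FROM categories enclosing
        WHERE enclosing.lft < parent.lft
          AND enclosing.rgt > parent.rgt
    )
    GROUP BY parent.id, parent.name
    ORDER BY parent.name
""").fetchall()

print(top_level_counts)
```

Here the article filed under Medicine is counted under Science, because Medicine's lft and rgt sit inside Science's range.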

I'll propose another solution: use tags rather than categories. You can use multiple tags for a given article and simply get the count of all articles matching a certain tag. This will be a lot easier to work with and also give you a lot more flexibility.

To accomplish this, you'll need a many-to-many relationship between articles and tags, which is usually implemented with a junction table:

tags
id  | name

articles_tags # the junction table
article_id  | tag_id

To tag an article, you simply INSERT multiple entries into the articles_tags table with the correct article_id and tag_id. Then you can use JOINs as usual to get what you want.
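
A minimal sketch of the tag approach, again using Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The column types, the composite primary key, and the sample rows are assumptions made for the example; only the table and column names come from the outline above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE tags (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Junction table: one row per (article, tag) pair.
    CREATE TABLE articles_tags (
        article_id INTEGER NOT NULL REFERENCES articles (id),
        tag_id     INTEGER NOT NULL REFERENCES tags (id),
        PRIMARY KEY (article_id, tag_id)
    );
    -- Hypothetical sample data, invented purely for illustration.
    INSERT INTO articles VALUES (1, 'Aspirin'), (2, 'Football injuries');
    INSERT INTO tags VALUES (1, 'science'), (2, 'medicine'), (3, 'sport');
    INSERT INTO articles_tags VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# Count articles per tag; the LEFT JOIN keeps tags that no article uses yet.
tag_counts = conn.execute("""
    SELECT t.name, COUNT(j.article_id)
    FROM tags t
    LEFT JOIN articles_tags j ON j.tag_id = t.id
    GROUP BY t.id, t.name
    ORDER BY t.name
""").fetchall()

print(tag_counts)
```

The composite primary key on (article_id, tag_id) is one way to stop the same tag being attached to an article twice, which would otherwise inflate the counts.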
