SQL Server: a grouping problem that's bugging me
I've been working with SQL Server for the better part of a decade, and this grouping (or partitioning, or ranking...I'm not sure what the answer is!) one has me stumped. Feels like it should be an easy one, too. I'll generalize my problem:
Let's say I have 3 employees (don't worry about them quitting or anything...there's always 3), and I keep up with how I distribute their salaries on a monthly basis.
Month  Employee  PercentOfTotal
--------------------------------
1      Alice     25%
1      Barbara   65%
1      Claire    10%
2      Alice     25%
2      Barbara   50%
2      Claire    25%
3      Alice     25%
3      Barbara   65%
3      Claire    10%
As you can see, I've paid them the same percent in Months 1 and 3, but in Month 2, I've given Alice the same 25%, but Barbara got 50% and Claire got 25%.
What I want to know is all the distinct distributions I've ever given. In this case there would be two -- one for months 1 and 3, and one for month 2.
I'd expect the results to look something like this (NOTE: the ID, or sequencer, or whatever, doesn't matter)
ID  Employee  PercentOfTotal
--------------------------------
X   Alice     25%
X   Barbara   65%
X   Claire    10%
Y   Alice     25%
Y   Barbara   50%
Y   Claire    25%
Seems easy, right? I'm stumped! Anyone have an elegant solution? I just put together this solution while writing this question, which seems to work, but I'm wondering if there's a better way. Or maybe a different way from which I'll learn something.
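-- For each distinct percentage value, find the earliest month in which it appears
WITH temp_ids (Month)
AS
(
    SELECT DISTINCT MIN(Month)
    FROM employees_paid
    GROUP BY PercentOfTotal
)
-- Return every row belonging to one of those months
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal
FROM employees_paid EMP
JOIN temp_ids IDS ON EMP.Month = IDS.Month
GROUP BY EMP.Month, EMP.Employee, EMP.PercentOfTotal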
Thanks y'all!
-Ricky
Comments (5)
This gives you an answer in a slightly different format than you requested:
Result:
If you want to, you can use UNPIVOT to turn this result set into the form you asked for.
Result:
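The query and result sets for this answer are not preserved above. As a rough sketch of the idea only (not the answerer's original code), one option is to pivot each month into a single row with conditional aggregation, take the distinct rows, and then UNPIVOT back to one row per employee. The inlined sample data, the hard-coded employee names, the CTE names `pivoted` and `distinct_distributions`, and storing the percentages as fractions such as 0.25 are all assumptions made for illustration.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH employees_paid (Month, Employee, PercentOfTotal) AS (
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS v (Month, Employee, PercentOfTotal)
),
-- One row per month, one column per employee (names are hard-coded).
pivoted AS (
    SELECT Month,
           MAX(CASE WHEN Employee = 'Alice'   THEN PercentOfTotal END) AS Alice,
           MAX(CASE WHEN Employee = 'Barbara' THEN PercentOfTotal END) AS Barbara,
           MAX(CASE WHEN Employee = 'Claire'  THEN PercentOfTotal END) AS Claire
    FROM employees_paid
    GROUP BY Month
),
-- Collapse identical months and give each surviving distribution an ID.
distinct_distributions AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Alice, Barbara, Claire) AS ID,
           Alice, Barbara, Claire
    FROM (SELECT DISTINCT Alice, Barbara, Claire FROM pivoted) AS d
)
-- Turn the employee columns back into rows.
SELECT ID, Employee, PercentOfTotal
FROM distinct_distributions
UNPIVOT (PercentOfTotal FOR Employee IN (Alice, Barbara, Claire)) AS u
ORDER BY ID, Employee;
```

Selecting directly from `distinct_distributions` instead of applying UNPIVOT gives the wide, one-row-per-distribution form. Because the employee names are hard-coded, this only suits a fixed, known set of employees.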
What you want is for each month's distribution to act as a signature, or pattern of values, which you would then look for in other months. What is not clear is whether the employee who received each value is as important as the breakdown of percentages. For example, would Alice=65%, Barbara=25%, Claire=10% be the same as Month 3 in your example? In my example, I presumed that it would not be. Similar to Martin Smith's solution, I find the signatures by multiplying each percentage by 10. This presumes that all percentage values are less than one. If someone could have a percentage of 110%, for example, that would create problems for this solution.
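The query for this answer is not preserved above. The following is a sketch of one way a numeric per-month signature could be built, not the answerer's original code and not necessarily the same weighting scheme: each employee's percentage is converted to a whole number and multiplied by a distinct power of 1000, so the sum encodes both the values and who received them. The inlined sample data, the base of 1000, and the CTE names `ranked`, `signatures`, and `first_months` are assumptions.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH employees_paid (Month, Employee, PercentOfTotal) AS (
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS v (Month, Employee, PercentOfTotal)
),
-- Give each employee a stable position (1, 2, 3) within every month.
ranked AS (
    SELECT Month, Employee, PercentOfTotal,
           ROW_NUMBER() OVER (PARTITION BY Month ORDER BY Employee) AS Position
    FROM employees_paid
),
-- Weight each whole-number percentage by 1000^(position - 1) and add them up.
-- Assumes every percentage, once multiplied by 100, stays below 1000.
signatures AS (
    SELECT Month,
           SUM(CAST(PercentOfTotal * 100 AS int)
               * POWER(CAST(1000 AS bigint), Position - 1)) AS Signature
    FROM ranked
    GROUP BY Month
),
-- Keep the earliest month for each distinct signature.
first_months AS (
    SELECT MIN(Month) AS Month
    FROM signatures
    GROUP BY Signature
)
SELECT e.Month AS ID, e.Employee, e.PercentOfTotal
FROM employees_paid AS e
JOIN first_months AS f ON f.Month = e.Month
ORDER BY e.Month, e.Employee;
```

This relies on every month listing the same employees, as the question states. The base has to exceed the largest scaled percentage, and with many employees the sum would eventually overflow bigint, so the approach only fits small, fixed groups.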
I'm assuming performance won't be great (because of the subquery).
The > in the JOIN ensures that only one set of matches is taken, i.e. if a Month1 entry = Month3 entry, we get only the Month3-Month1 entry combination instead of Month1-Month3, Month3-Month1 and Month3-Month3.
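The query this answer describes is not preserved above. A sketch of what a self-join with a subquery might look like, under assumptions and not the answerer's original code: each month is compared with every earlier month, and a month is dropped when some earlier month matches it row for row. The inlined sample data and the aliases `later`, `earlier`, `x`, and `y` are made up for illustration.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH employees_paid (Month, Employee, PercentOfTotal) AS (
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS v (Month, Employee, PercentOfTotal)
)
SELECT e.Month AS ID, e.Employee, e.PercentOfTotal
FROM employees_paid AS e
WHERE e.Month NOT IN (
    -- Months that exactly repeat some earlier month.
    SELECT later.Month
    FROM employees_paid AS later
    JOIN employees_paid AS earlier
      ON later.Month > earlier.Month          -- compare each pair of months once
     AND later.Employee = earlier.Employee
     AND later.PercentOfTotal = earlier.PercentOfTotal
    GROUP BY later.Month, earlier.Month
    -- Every row of both months must have found a partner.
    HAVING COUNT(*) = (SELECT COUNT(*) FROM employees_paid AS x
                       WHERE x.Month = later.Month)
       AND COUNT(*) = (SELECT COUNT(*) FROM employees_paid AS y
                       WHERE y.Month = earlier.Month)
)
ORDER BY e.Month, e.Employee;
```

With the sample data, month 3 matches month 1 on all three rows and is excluded, while month 2 matches month 1 only on Alice's row and is kept.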
If I have understood you correctly then, for a general solution, I think you would need to concatenate the whole group together, e.g. to produce
Alice:0.25, Barbara:0.50, Claire:0.25
Then select the distinct groups, so something like the following would do it (rather clunkily).
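The original query for the concatenation approach is not preserved here. A minimal sketch of the same idea, assuming SQL Server 2017 or later so that `STRING_AGG` is available (older versions would need a different concatenation technique), might look like this. The inlined sample data and the CTE names `signatures` and `first_months` are assumptions.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH employees_paid (Month, Employee, PercentOfTotal) AS (
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS v (Month, Employee, PercentOfTotal)
),
-- Build one string per month, e.g. 'Alice:0.25, Barbara:0.50, Claire:0.25'.
-- Ordering by Employee makes equal distributions produce equal strings.
signatures AS (
    SELECT Month,
           STRING_AGG(CONCAT(Employee, ':', PercentOfTotal), ', ')
               WITHIN GROUP (ORDER BY Employee) AS Distribution
    FROM employees_paid
    GROUP BY Month
),
-- Keep the earliest month for each distinct string.
first_months AS (
    SELECT MIN(Month) AS Month
    FROM signatures
    GROUP BY Distribution
)
SELECT DENSE_RANK() OVER (ORDER BY e.Month) AS ID,
       e.Employee, e.PercentOfTotal
FROM employees_paid AS e
JOIN first_months AS f ON f.Month = e.Month
ORDER BY ID, e.Employee;
```

Unlike a pivot, this does not need the employee names to be known in advance. It does assume that names never contain the ':' or ', ' separators, which could otherwise make two different groups concatenate to the same string.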
I don't think it does work. Here I've added a further two groups (month = 4 and 5 respectively) which I would consider to be distinct yet the result is the same i.e. month = 1 and 2 only:
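The rows added for months 4 and 5 are not preserved here, so the values below are hypothetical, chosen only to illustrate the kind of failure being described. They reuse percentages that already occur in months 1 and 2 but assign them differently. Because the question's query groups by `PercentOfTotal` alone, no new "earliest month" appears and both new distributions are silently dropped.

```sql
-- Months 1-3 are from the question; months 4 and 5 are made-up examples.
WITH employees_paid (Month, Employee, PercentOfTotal) AS (
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10),
        (4, 'Alice', 0.10), (4, 'Barbara', 0.25), (4, 'Claire', 0.65),
        (5, 'Alice', 0.50), (5, 'Barbara', 0.25), (5, 'Claire', 0.25)
    ) AS v (Month, Employee, PercentOfTotal)
),
-- The query from the question, unchanged apart from being a second CTE.
-- 0.25, 0.65 and 0.10 first appear in month 1, and 0.50 in month 2.
temp_ids (Month) AS (
    SELECT DISTINCT MIN(Month)
    FROM employees_paid
    GROUP BY PercentOfTotal
)
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal
FROM employees_paid EMP
JOIN temp_ids IDS ON EMP.Month = IDS.Month
GROUP BY EMP.Month, EMP.Employee, EMP.PercentOfTotal
ORDER BY EMP.Month, EMP.Employee;
-- Returns only the rows for months 1 and 2; months 4 and 5 are missing.
```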