I have a project where I need to count the number of times that a movie was played, categorizing each play as the first, second, third... time according to its date.
I was able to count the occurrences per movie and to order them using IF and COUNTIF. However, I can't come up with a formula that uses the date as well.
Also, I have to check for errors, because if the movie plays on consecutive dates, it needs to be counted as just one exhibition.
[Image: rank]
Problem: we have a list of movies in a column "Description". Next to it is a column "Date", that references the date on which a particular movie was played. This column is not in chronological order. The goal is to create another column, let us call it "Desired_Rank", that ranks each play date in chronological order per movie. However, there's a catch: consecutive play dates per movie need to be counted as one play date. It is not allowed to re-order the rows first (which would make the problem much less difficult to solve).
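To make the target concrete, here is a minimal Python sketch of the ranking rule described above. It is not the Excel approach used in this answer, just a straightforward reference implementation that could be used to check results. The function name `desired_ranks` and the sample rows are assumptions for illustration; the dates for movies "A" and "B" are the ones used in the worked examples further down.

```python
from collections import defaultdict
from datetime import date, timedelta


def desired_ranks(rows):
    """Return the per-movie exhibition rank for each (movie, play_date) row.

    Rows may be in any order and are not re-ordered; the result is aligned
    with the input. Consecutive play dates of a movie share one rank.
    Assumes dates are unique per movie.
    """
    dates_by_movie = defaultdict(list)
    for movie, day in rows:
        dates_by_movie[movie].append(day)

    rank_of = {}
    for movie, days in dates_by_movie.items():
        rank = 0
        previous = None
        for day in sorted(days):
            # A new exhibition starts unless this date directly follows the previous one.
            if previous is None or day - previous != timedelta(days=1):
                rank += 1
            rank_of[(movie, day)] = rank
            previous = day

    return [rank_of[(movie, day)] for movie, day in rows]


# Hypothetical sample data, deliberately not in chronological order.
rows = [
    ("A", date(2022, 6, 4)),
    ("B", date(2022, 6, 7)),
    ("A", date(2022, 6, 1)),
    ("B", date(2022, 6, 2)),
    ("A", date(2022, 6, 7)),
    ("A", date(2022, 6, 2)),
    ("B", date(2022, 6, 6)),
    ("A", date(2022, 6, 6)),
    ("B", date(2022, 6, 1)),
    ("B", date(2022, 6, 3)),
]

print(desired_ranks(rows))  # [2, 2, 1, 1, 3, 1, 2, 3, 1, 1]
```

Movie "A" played on 1, 2, 4, 6 and 7 June, so its exhibitions are {1, 2}, {4} and {6, 7}; movie "B" played on 1, 2, 3, 6 and 7 June, giving {1, 2, 3} and {6, 7}.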
Let's take a simplified example. Below is a table that contains movies A, B and C with a bunch of dates (some of them consecutive; all unique per movie). N.B. The dates here are sorted. This is just to help see the logic.
A regular rank for this would be simple enough. We could use:
IIUC, we are looking for the "Desired_Rank", which groups together consecutive play dates (items within the black borders).
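Solution: the mentioned constraint complicates the matter, but I've come up with the following formula to solve this one, and produce the outcomes in column "Desired_Rank":
=LET(movie,[@Description],date,[@Date],sorted,SORT(FILTER([Date],[Description]=movie),,1),match_date,MATCH(date,sorted,0),f_diff,IFERROR(INDEX(sorted,SEQUENCE(ROWS(sorted),,2))-sorted,0),sum_dupl,IFERROR(SUM(--(INDEX(f_diff,SEQUENCE(match_date-1))=1)),0),match_date-sum_dupl)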
Here's the same data from above, but now randomly ordered. Still works as desired:
Let me explain what the formula actually does. The LET wrapper is used here to assign names to intermediate calculation results. This improves readability. Let's go over all the variables in order:
movie,[@Description]: 1st, 2nd etc. movie.
date,[@Date]: 1st, 2nd etc. date.
sorted,SORT(FILTER([Date],[Description]=movie),,1): here, FILTER is used to get an array with dates per movie. SORT with sort_order 1 (= ascending) sorts this array in ascending order. I.e. for movie "A" we get {44713;44714;44716;44718;44719}.
match_date,MATCH(date,sorted,0): we use MATCH to find the position of the associated date within the sorted array. So for movie "A" with date "4-Jun-22" (= 44716) we get 3.
f_diff,IFERROR(INDEX(sorted,SEQUENCE(ROWS(sorted),,2))-sorted,0): here we calculate the forward-looking difference between the elements in our sorted array. The part INDEX(sorted,SEQUENCE(ROWS(sorted),,2)) shifts the sorted array one element forward (see INDEX and SEQUENCE). I.e. for movie "A" we are saying here: get me the entire length of rows from the sorted array, but start at 2: INDEX({44713;44714;44716;44718;44719},{2;3;4;5;6}) -> {44714;44716;44718;44719;#REF!}. The #REF! error occurs because for the last element we are stepping out of bounds (i.e. the sorted array does not in fact have an element 6). Now we subtract the original sorted array, and wrap this in IFERROR(result,0) to get a 0 for the last element. For movie "A", {44714;44716;44718;44719;#REF!}-{44713;44714;44716;44718;44719} results in {1;2;2;1;0}. All dates in our sorted array that are part of a consecutive date sequence (except the last element of such a sequence) will now have an associated 1: true here for 44713 and 44718 ("1-Jun-22", "6-Jun-22").
sum_dupl,IFERROR(SUM(--(INDEX(f_diff,SEQUENCE(match_date-1))=1)),0): let's take INDEX(f_diff,SEQUENCE(match_date-1)) first. Here we take our f_diff array and request only the part that runs up to, but does not include, the position of our associated date. Recall that match_date = 3 for movie "A" and date "4-Jun-22". So, we are asking here to get back rows {1;2} from f_diff. This gets us back: {1;2}, meaning: the 1st date is part of a consecutive date sequence (the 2nd date is too, but remember: it's the last element, so we don't need it). Next, SUM(--({1;2}=1)) -> SUM(--({TRUE;FALSE})) -> 1 (the double unary operator -- converts TRUE/FALSE to 1/0). What is this number telling us: movie "A" with date "4-Jun-22" is preceded by 1 element that is part of a consecutive date sequence, but not the last element (namely: "1-Jun-22"). Finally, we have to wrap the function in IFERROR(result,0) to fix the calculation error that will occur for the first date of each movie (here match_date = 1, leading to SEQUENCE(1-1), which is impossible). We pick 0, since the first date obviously cannot have prior consecutive date sequences.
match_date-sum_dupl: final calculation. Here we simply take the regular rank (match_date) and subtract the number for the prior consecutive date sequences that we've just calculated (sum_dupl). So, 3-1 = 2 for movie "A" and date "4-Jun-22".
Let's just get some visual sense of this breakdown:
So, to take the highlighted example: movie "B" with date "7-Jun-22". The "Regular_Rank" would be 5. f_diff gets us: {1;1;3;1;0}. We sum all 1's in this array up to, but not including "7-Jun-22" itself, so {1;1;3;1}, and get 3. That is: "1/2/6-Jun-22" are all three part of consecutive date sequences ("3-Jun-22" is a last element, so not counted). "Desired_Rank" will equal 5-3 = 2.
Hope that all makes sense. Let me know if you experience any difficulties with implementation.
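For anyone who wants to trace the arithmetic outside Excel, the formula's named steps can be mirrored in a few lines of Python. This is only a sketch for checking the logic on one movie at a time, not a replacement for the formula; the function names `forward_diff` and `rank_via_formula_steps` are made up for illustration, and dates are given as Excel serial numbers (44713 = 1-Jun-22).

```python
def forward_diff(sorted_dates):
    """f_diff: next date minus current date, with 0 for the last element.

    The trailing 0 plays the role of the IFERROR(..., 0) fallback for the
    out-of-bounds element in the Excel formula.
    """
    return [nxt - cur for cur, nxt in zip(sorted_dates, sorted_dates[1:])] + [0]


def rank_via_formula_steps(serials, target):
    """Compute match_date - sum_dupl for one movie's dates and one target date."""
    sorted_dates = sorted(serials)                  # sorted
    match_date = sorted_dates.index(target) + 1     # match_date (1-based, like MATCH)
    f_diff = forward_diff(sorted_dates)             # f_diff
    # sum_dupl: number of 1's strictly before the target's position.
    # For match_date = 1 the slice is empty, so the sum is 0.
    sum_dupl = sum(1 for diff in f_diff[: match_date - 1] if diff == 1)
    return match_date - sum_dupl


# Movie "A": 1, 2, 4, 6, 7 June 2022, in arbitrary order.
movie_a = [44716, 44713, 44719, 44714, 44718]
# Movie "B": 1, 2, 3, 6, 7 June 2022, in arbitrary order.
movie_b = [44719, 44714, 44713, 44718, 44715]

print(forward_diff(sorted(movie_a)))           # [1, 2, 2, 1, 0]
print(rank_via_formula_steps(movie_a, 44716))  # 2  (4-Jun-22: 3 - 1)
print(rank_via_formula_steps(movie_b, 44719))  # 2  (7-Jun-22: 5 - 3)
```

Slicing with `f_diff[: match_date - 1]` returns an empty list for the first date, so no explicit error handling is needed where the Excel version relies on IFERROR.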