用单个数值表示前 10 名列表的方法

发布于 2024-10-07 21:12:31 字数 349 浏览 11 评论 0原文

我有一个前 10 名列表，其中包含分数（最高分获胜）和时间戳。

时间戳用于平局得分的情况，在这种情况下，具有最低时间戳的平局得分获胜（第一个获得得分较高的人）。

排序数据集示例：

20 102906755
15 102910755
14 102890755
14 102890756
13 102890756

请注意得分 14 上的平局，时间戳较小的得分较高。

我需要以正确的顺序将分数和时间戳表示为单个 32 位值。

假设最高分是100万。

我通过减去第一个有效日期分数来减少时间戳值。

在 C 语言中如何实现这一点？

原文

I have a top 10 list with a score (highest score wins) and a timestamp.

The timestamp is used in the case of a tie score, in which case the tied score with the lowest timestamp wins (first person to achieve the score places higher).

Sorted data set example:

20 102906755
15 102910755
14 102890755
14 102890756
13 102890756

Notice the tie on score 14, the one with the smaller timestamp is placed higher.

I need to represent with proper ordering both the score and timestamp as a single 32bit value.

Assume the maximum score is 1 million.

I have reduced the timestamp value by subtracting the first valid date score.

How might this be accomplished in C?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

謌踐踏愛綪 2024-10-14 21:12:31

首先，如果您需要表示超过 2^32 个不同的分数与时间戳组合，那么游戏就结束了 - 无法完成。

考虑到这一点：

分数真正需要多少位？如果最大分数是 100 万，则需要 20 位，这样可以均匀分布，只剩下 12 位作为时间戳。这将非常严格，尤其如果可能有超过 2^12 = 4000 个列表条目与单个分数相关，尽管分数分布大概不是甚至。
时间戳真正需要多少位？
您可以丢弃时间戳中的最高有效位吗？例如，如果您知道所有时间都在 2001 年之后，那么您可以将时间戳作为 2001 年而不是 1970 年的基础，这会增加 1 位。 [编辑：你似乎已经改变了时间戳，它们不再像原来那样像 1970 年以来的秒一样]
你可以丢弃时间戳中不太重要的位吗？在您的示例数据中，您的时间戳仅相隔 1 秒，但这现实吗？如果有两行分数和时间戳都相等怎么办？想必这不是世界末日：如果 1 秒间隙是可能的，那么我想 0 秒间隙也是可能的。例如，如果将所有时间戳四舍五入到最接近的 32 秒，那么您可以获得 5 位，但代价是引入更多死联系。
对于时间戳，您可以使用每次分数进入时递增一次的值，而不是自纪元以来的实际时间（以秒为单位）吗？如果是这样，那么您可能会节省很多位。您能否为每个可能的分数使用不同的递增值，将示例数据转换为 (20, 0)、(15, 0)、(14, 0)、(14, 1)、(13, 0)？
你可以使用>32位的值吗？

如果其中任何一个的答案是好的，那么也许这是可能的。如果答案都很糟糕，那么根本不可能做你想做的事。

[编辑：考虑到下面的评论，你的新问题的答案是：

double value = (double) score - ((double)timestamp) / (((long long)1) << 33);

容易得多。一直到公元 2242 年都很好。

假设您的实现上的 double 是 64 位，这几乎是通用的。]

First thing is that if there are more than 2^32 different combinations of score with timestamp that you need to represent then it's game over - can't be done.

With that in mind:

how many bits do you really need for the score? If the max score is 1 million you need 20 bits for that, which allowing for an even distribution leaves only 12 for the timestamp. This is going to be very tight, especially if it's possible to have more than 2^12 = 4000 list entries tied on a single score, although score distribution presumably isn't even.
how many bits do you really need for the timestamp?
can you discard most-significant bits from the timestamp? For example, if you know that all times will be after 2001 then you could base the timestamp from 2001 instead of 1970, which gains you 1 bit. [Edit: you seem to have changed the timestamps, they no longer look like seconds since 1970 as they originally did]
can you discard less-significant bits from the timestamp? In your example data you have timestamps which are only 1s apart, but is this realistic? What if you have two lines where the score and the timestamp are both equal? Presumably it's not the end of the world: if a 1s gap is possible then I imagine a 0s gap is possible. Then for example, if you round all timestamps to the nearest 32s then you can gain 5 bits, at the cost of introducing more dead ties.
for the timestamp, can you use a value that's incremented once each time a score comes in, instead of an actual time in seconds since epoch? If so then you can potentially save a lot of bits. Can you use a different incrementing value for each possible score, transforming your example data into (20, 0), (15, 0), (14, 0), (14, 1), (13, 0)?
can you use a >32 bit value?

If the answers to any of those are good, then perhaps it's possible. If the answers are all bad, then it's simply not possible to do what you want.

[Edit: the answer to your new question, taking into account the comment below, is:

double value = (double) score - ((double)timestamp) / (((long long)1) << 33);

Much easier. It's good until 2242AD.

This assuming double is 64 bits on your implementation, which is almost universal.]

回复收藏 0 原文

爱的那么颓废 2024-10-14 21:12:31

假设您要使用无符号值，并且您的最大分数是 31。

您可以使用前 5 位作为分数，使用较低的 27 位作为时间戳。

请注意，这限制了两个值的限制，因此您必须考虑可能的值以及如果尝试使用该范围之外的值时将采取的措施。

存储...

composite = ~(score << 27) | timestamp;

并稍后获取值：

#define TIMESTAMP_BITS ((1 << 27) - 1)
score = composite >> 27;
timestamp = composite & TIMESTAMP_BITS;

请注意，您需要在使用分数和时间戳之前检查它们，并且可能应用掩码以确保值不会重叠。

编辑

Riderchap 提出了一个很好的观点。作为示例给出的时间戳太大，无法与 32 位整数一起使用。我的回答更多的是演示原则上如何做到这一点。

Let's say you were going to use unsigned values, and your maximum score was 31.

You could use the top 5 bits for the score, and the lower 27 for the timestamp.

Note that this constrains the limits of both values, so you must consider the possible values and what you will do if you try to use values outside of that range.

To store...

composite = ~(score << 27) | timestamp;

And to later get the values:

#define TIMESTAMP_BITS ((1 << 27) - 1)
score = composite >> 27;
timestamp = composite & TIMESTAMP_BITS;

Please note though that you want to check the score and timestamp before using them, and probably apply masks to be sure the values do not overlap.

Edit

Riderchap makes a good point. The timestamps you give as an example would be too large to use with a 32 bit integer. My answer is more of a demonstration of how to do this in principle.

回复收藏 0 原文

萌能量女王 2024-10-14 21:12:31

尽管您随后表示实际上有一个 double 可以用来存储信息，但我想我仍然可以分享一些关于使用 32 位整数执行此操作的想法。

首先，为了让这些数字按照分数顺序第一、时间顺序第二排序，您希望分数占据较高值的位置，而时间戳占据较低值的位置。为了将分数放在较高值的位置，我们必须选择将其乘以某个常数因子。无符号 32 位整数可以表示的最大数字是 4,294,967,295 ，我们的分数范围是 0 到 1,000,000。这给我们带来了 4,294 的乘数。

然后，我们就得到了时间戳占据的较低值位置 - 只需将其添加进去。这给了我们：

N = SCORE * 4294 + TIMESTAMP;

反向转换为：

SCORE = N / 4294;
TIMESTAMP = N % 4294;

但是，请注意，TIMESTAMP 允许的范围是 0 到4293 含。任何更高的值都会溢出到 N 中被 SCORE 占据的部分。如果 TIMESTAMP 只是一个秒数（从最早的分数从 4293 开始倒计时），这只能给我们从最早的分数到最新的分数的最大时间，刚刚超过 71 分钟 - 可能不够。

这只是对您可以用 32 位整数表示的不同存储桶数量的限制 - 为了能够使用此模型表示更多不同的时间戳，您需要减少可以表示的不同分数的数量。

但是，请注意，我们并不真正希望将时间戳作为时间戳 - 我们只是希望它作为分数的单调排序。作为替代方案，我们可以保留一个计数器。将计数器初始化为 4293。当新的分数出现时，将计数器的当前值存储为“时间戳”，然后递减计数器。这将允许在计数器用完之前存储 4294 个不同的高分。

作为进一步的细化，我们可以注意到我们只关心同一 SCORE 值内的排序。这提出了另一种选择：当出现新的高分时，找到该分数当前最低的 TIMESTAMP 值。将其减 1，并将其用作新分数的“时间戳”（如果这是第一次提交此 SCORE，则使用 4293 作为时间戳）。这将允许每个个人得分值有 4294 个高分。

Although you've subsquently said that you actually have a double available to store the information, I thought I might still share some thoughts on doing this with a 32 bit integer.

Firstly, in order to have these numbers sort in score order first and time order second, you want the score to occupy the higher value places and the timestamp to occupy the lower. To place the score in the higher value places, we must pick a multiply it by some constant factor. The largest number that can be represented by an unsigned 32 bit integer is 4,294,967,295 , and we we have a score range of 0 to 1,000,000. This gives us a multiplier of 4,294.

We then have the lower-value positions occupied by the timestamp - just need to add that in. This gives us:

N = SCORE * 4294 + TIMESTAMP;

The reverse conversions are:

SCORE = N / 4294;
TIMESTAMP = N % 4294;

However, note that the range this allows for the TIMESTAMP is 0 to 4293 inclusive. Anything higher would overflow into the parts of N occupied by the SCORE. If TIMESTAMP is simply a number of seconds (starting at 4293 for the earliest score and counting down), this only gives us a maximum time of just over 71 minutes from the earliest score to the latest - probably insufficient.

This is simply a limit on the number of different buckets you can represent with a 32 bit integer - to be able to represent more distinct timestamps with this model, you need to reduce the number of distinct scores you can represent.

However, note that we don't really want the timestamp as a timestamp - we just want it as a monotonic ordering on the scores. As an alternative, we can keep a counter. Initialise the counter to 4293. When a new score comes in, store it with the current value of the counter as its "timestamp", then decrement the counter. This will allow 4294 distinct highscores to be stored before the counter runs out.

As a further refinement, we can note that we only care about the ordering within the same SCORE value. This suggests another alternative: when a new high score comes in, find the current lowest TIMESTAMP value for that score. Decrement it by 1, and use that for the "timestamp" of the new score (if this is the first time that this SCORE has been submitted, use 4293 as the timestamp). This will allow 4294 high scores per individual score value.

回复收藏 0 原文