yes, i think it is going to be biased. uuencode takes 3 input bytes for each 4 output characters. since you are giving it 8 bytes, the third group is one byte short and gets filled out with padding of some (non-random) kind. that leaves the 12th character entirely determined by padding and biases the 11th too (it only gets 4 random bits instead of 6).
can you try
head -c 9 /dev/random | uuencode -m -
(with 9 instead of 8) instead and post the results? that should not have the same problem.
ps also, you will no longer need to drop the "=" padding, since 9 is a multiple of 3 and the output will not contain any.
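to make the 8 vs 9 byte difference concrete, here is a minimal sketch. it assumes python's standard `base64` module as a stand-in for `uuencode -m` (both produce standard base64), and the helper name `tail_chars` is made up for illustration. it tries every value of the last two input bytes and records which characters can show up at positions 11 and 12.

```python
import base64
from itertools import product

def tail_chars(n_bytes):
    """Return the sets of characters that can appear at output positions
    11 and 12 when base64-encoding n_bytes of input (n_bytes is 8 or 9)."""
    # Positions 11 and 12 depend only on the last two input bytes here,
    # so the leading bytes are fixed to zero and the last two are enumerated.
    prefix = bytes(n_bytes - 2)
    seen11, seen12 = set(), set()
    for a, b in product(range(256), repeat=2):
        enc = base64.b64encode(prefix + bytes([a, b])).decode("ascii")
        seen11.add(enc[10])  # 11th character (0-based index 10)
        seen12.add(enc[11])  # 12th character (0-based index 11)
    return seen11, seen12

s11_8, s12_8 = tail_chars(8)
s11_9, s12_9 = tail_chars(9)

print(len(s11_8), sorted(s12_8))  # 16 ['=']
print(len(s11_9), len(s12_9))     # 64 64
```

with 8 bytes the 11th character can only be one of 16 values and the 12th is always "=". with 9 bytes both positions can take all 64 values.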
pps it certainly appears statistically significant. you expect a natural variation of about sqrt(mean), which is (guessing) sqrt(2000) or about 45. so three deviations from the mean, +/-135, or roughly 1865-2135, should contain over 99% of letters - you are seeing something much more systematic.
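the arithmetic behind that estimate can be checked with a few lines. this is only a sketch: the mean of 2000 is the same guess as above, not a measured value, and the coverage figure assumes the counts are roughly normal. it takes sqrt(2000) exactly rather than rounding it.

```python
import math

mean = 2000  # guessed count per character (assumption, not measured data)

# For counts like these, the natural spread is about sqrt(mean).
sigma = math.sqrt(mean)

# Three-deviation band around the mean.
low = mean - 3 * sigma
high = mean + 3 * sigma

# Fraction of a normal distribution that falls within +/- 3 sigma.
coverage = math.erf(3 / math.sqrt(2))

print(round(sigma, 1), round(low), round(high), round(coverage, 4))
# 44.7 1866 2134 0.9973
```

counts far outside that band, and always at the same character positions, point to something systematic rather than noise.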
ppps neat idea.
ooops i just realised -m for uuencode forces base64 rather than the uuencode algorithm, but the same idea applies.
http://en.wikipedia.org/wiki/Uuencoding