Random number generation with awk in a BASH shell
I want to shuffle the lines (rows) of a file at random and write the result to five different files.
However, file1 through file5 always end up with exactly the same line order, so the random generation is not working as intended. I would be grateful for any advice.
#!/bin/bash
for i in $(seq 1 5)
do
awk 'BEGIN{srand();} {print rand()"\t"$0}' shuffling.txt | sort -k2 -k1 -n | cut -f2- > file$i.txt
done
Input shuffling.txt
111 1032192
111 2323476
111 1698881
111 2451712
111 2013780
111 888105
112 2331004
112 1886376
112 1189765
112 1877267
112 1772972
112 574631
Comments (3)
If you don't provide a seed to srand, it will either use the current date and time or a fixed starting seed (this may vary with the implementation). That means, for the former, if your processes run fast enough, they'll all use the same seed and generate the same sequence. And, for the latter, it won't matter how long you wait: you'll get the same sequence each time you run.
You can get around either of these by using a different seed, provided by the shell.
The number provided by $RANDOM changes in each iteration, so each run of the awk program gets a different seed.
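As a minimal sketch of that idea (not necessarily the exact command from the original answer), the loop below passes $RANDOM into awk with -v and uses it as the srand seed. The temporary directory, the shortened sample data, the printf format, and the simplified sort key (-k1,1n, sorting only on the random column) are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: give each awk run its own seed taken from Bash's $RANDOM.
# The temporary directory and the sample data are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Small stand-in for shuffling.txt.
printf '%s\n' '111 1032192' '111 2323476' '111 1698881' \
              '112 2331004' '112 1886376' '112 1189765' > shuffling.txt

for i in $(seq 1 5)
do
    # -v seed=... sets the awk variable before the BEGIN block runs.
    # A fixed-point format keeps the random key sortable with sort -n.
    awk -v seed="$RANDOM" \
        'BEGIN { srand(seed) } { printf "%.10f\t%s\n", rand(), $0 }' shuffling.txt |
        sort -k1,1n | cut -f2- > "file$i.txt"
done
```

Each output file should contain the same lines as the input, in an order that depends on the seed drawn for that iteration.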
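
Awk's pseudo-random generator is not very random, so you need to keep reseeding it. Seeding from a sub-second clock should be enough for most situations; otherwise you may want to look into Bash's ${RANDOM} or reading /dev/urandom directly. Note that date +%N prints nanoseconds (not microseconds) and is a GNU date extension. Seeding from it looks like this:
awk 'BEGIN{"date +%N"|getline rseed;srand(rseed);close("date +%N");print rand()}'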
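
The /dev/urandom route mentioned above is not shown in the answer. One way it might look, as a sketch that assumes a system providing /dev/urandom and a POSIX od, is to read four random bytes as an unsigned integer and hand that to srand. The variable name seed is illustrative.

```shell
#!/bin/bash
# Sketch: seed awk from /dev/urandom instead of the clock.
# od reads 4 bytes (-N4) and prints them as one unsigned decimal (-tu4)
# without an address column (-An); tr strips the padding whitespace.
seed=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')

# Pass the seed in with -v so that srand() does not fall back to its default.
awk -v seed="$seed" 'BEGIN { srand(seed); print rand() }'
```

Because the seed no longer depends on the time of day, two runs started within the same second should still receive different seeds in practice.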