How do I quickly print an array of integers to the console?
I have an array of integers:
a = [1,2,3,4]
When I do a.join, Ruby internally calls the to_s method 4 times, which is too slow for my needs.
What is the fastest way to output a big array of integers to the console?
I mean that
a = [1,2,3,4........,1,2,3,9]
should be printed as:
1234........1239
Comments (5)
If you want to print an integer to stdout, you need to convert it to a string first, since that's all stdout understands. If you want to print two integers to stdout, you need to convert both of them to strings first. If you want to print three integers to stdout, you need to convert all three of them to strings first. If you want to print one billion integers to stdout, you need to convert all one billion of them to strings first.
There's nothing you, we, Ruby, or really any programming language can do about that.
You could try interleaving the conversion with the I/O by doing a lazy stream implementation. You could try to do the conversion and the I/O in parallel, by doing a lazy stream implementation and separating the conversion and the I/O into two separate threads. (Be sure to use a Ruby implementation which can actually execute threads in parallel; not all of them can: MRI, YARV and Rubinius can't, for example.)
You can parallelize the conversion by converting separate chunks of the array in separate threads in parallel. You can even buy a billion-core machine and convert all billion integers at the same time in parallel.
But even then, the fact of the matter remains: every single integer needs to be converted. Whether you do that one after the other first and then print them, or one after the other interleaved with the I/O, or one after the other in parallel with the I/O, or even convert all of them at the same time on a billion-core CPU: the number of conversions needed does not magically decrease. A large number of integers means a large number of conversions. Even if you do all billion conversions on a billion-core CPU in parallel, it's still a billion conversions, i.e. a billion calls to to_s.
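The idea of interleaving the conversion with the I/O could look roughly like the sketch below. It is only an illustration: the helper name write_in_chunks and the chunk size are assumptions, not something taken from the answer, and the number of to_s calls stays exactly the same.

```ruby
require 'stringio'

# Hypothetical helper (not from the original answer): converts the array
# chunk by chunk and writes each chunk as soon as it has been converted,
# so the complete output string never has to be built in one piece.
def write_in_chunks(numbers, io = $stdout, chunk_size = 10_000)
  numbers.each_slice(chunk_size) do |chunk|
    io.write(chunk.join) # join still calls to_s once per element
  end
  io
end

# A StringIO stands in for the console here so the result can be inspected.
out = write_in_chunks([1, 2, 3, 4, 1, 2, 3, 9], StringIO.new, 3)
puts out.string # prints 12341239
```

This mainly bounds the size of the intermediate string; whether it is any faster than a single join depends on the Ruby version and the output device.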
As stated in the comments above, if Fixnum.to_s is not performing quickly enough for you, then you really need to consider whether Ruby is the correct tool for this particular task.
However, there are a couple of things you could do that may or may not be applicable to your situation.
If the building of the array happens outside the time-critical area, then build the array, or a copy of the array, with strings instead of integers. With my small test of 10000 integers this is approximately 5 times faster.
If you control both the reading and the writing process, then use Array.pack to write the output and String.unpack to read the result. This may not be quicker, as pack seems to call Fixnum.to_int even when the elements are already Integers.
I expect these figures would be different with each version of Ruby, so it is worth checking for your particular target version.
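A minimal sketch of the "strings instead of integers" suggestion, assuming the array can be prepared before the time-critical part runs. The helper name to_string_array is made up for this example.

```ruby
# Hypothetical helper: builds a copy of the array that holds strings,
# paying for to_s once per element ahead of time.
def to_string_array(numbers)
  numbers.map(&:to_s)
end

# Outside the time-critical area: do the conversions up front.
strings = to_string_array([1, 2, 3, 4, 1, 2, 3, 9])

# Inside the time-critical area: join only has to concatenate strings.
$stdout.write(strings.join)
$stdout.write("\n")
```

The conversions still happen, they are just moved to a point where their cost does not matter. How much this saves is best measured on the target Ruby version.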
The slowness in your program does not come from to_s being called 4 times, but from printing to the console. Console output is slow, and you can't really do anything about it.
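One way to check where the time actually goes is to time the conversion and the write separately. The sketch below uses Benchmark.realtime from the standard library; the helper name and the sample size are assumptions made for illustration, and the numbers it reports will vary by machine and terminal.

```ruby
require 'benchmark'
require 'stringio'

# Hypothetical helper: measures the string conversion (join) and the write
# to the given IO object separately and returns both timings in seconds.
def time_conversion_and_write(numbers, io)
  text = nil
  convert = Benchmark.realtime { text = numbers.join }
  write = Benchmark.realtime do
    io.write(text)
    io.flush
  end
  { convert: convert, write: write, bytes: text.bytesize }
end

numbers = Array.new(100_000) { |i| i % 10 }
result = time_conversion_and_write(numbers, $stdout)

# Report on stderr so the timings do not mix with the digits on stdout.
warn format("\nconvert: %.4fs, write: %.4fs", result[:convert], result[:write])
```

Passing a StringIO or a File instead of $stdout gives a rough comparison against the console.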
For single digits you can do this
If you need to speed up larger numbers, you could try memoizing the result of to_s.
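The answer's own snippet is not shown here, so the following is only one possible sketch of both ideas and not necessarily what its author wrote. DIGITS, TO_S_CACHE and the two helper names are hypothetical, and the single-digit version assumes every element is between 0 and 9.

```ruby
# Single digits: a precomputed lookup table indexed by the digit itself.
# Assumes every element is in 0..9.
DIGITS = %w[0 1 2 3 4 5 6 7 8 9].freeze

def join_digits(digits)
  digits.map { |d| DIGITS[d] }.join
end

# Larger numbers: memoize to_s in a Hash whose default block converts each
# distinct value once and stores the result. This only helps if values repeat.
TO_S_CACHE = Hash.new { |cache, n| cache[n] = n.to_s }

def join_memoized(numbers)
  numbers.map { |n| TO_S_CACHE[n] }.join
end

puts join_digits([1, 2, 3, 4])         # prints 1234
puts join_memoized([1234, 1239, 1234]) # prints 123412391234
```

Whether either variant beats a plain join depends on the data and the Ruby version, so it is worth benchmarking before relying on it.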
Unless you really need to see the numbers on the console (and it sounds like you do not), write them to a file in binary; that should be much faster.
And if that is what you need to do, you can pipe binary files, not just text, into other programs.
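A rough sketch of writing the numbers in binary with Array#pack and reading them back with String#unpack. The 'N*' directive (unsigned 32-bit, big-endian) is an assumption chosen for this example, so it only works if every value fits in 32 bits; the helper names are made up.

```ruby
require 'tempfile'

# Assumption: every value fits in an unsigned 32-bit integer.
# 'N*' packs each element as 4 bytes in big-endian order.
FORMAT = 'N*'

# Hypothetical helpers for writing and reading the packed representation.
def write_binary(numbers, io)
  io.write(numbers.pack(FORMAT))
end

def read_binary(io)
  io.read.unpack(FORMAT)
end

Tempfile.create('numbers') do |file|
  file.binmode
  write_binary([1, 2, 3, 4, 1, 2, 3, 9], file)
  file.rewind
  p read_binary(file) # prints [1, 2, 3, 4, 1, 2, 3, 9]
end
```

The reader has to use the same format string as the writer, which is why this only fits when you control both ends.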