Overhead and code speed (java.io.File array vs. java.lang.String array)
Just trying to sort out a small dilemma I'm having here.
Currently, I'm working on an application that involves gathering a list of files into memory, to be deleted. Now, at this point, I thought that a java.io.File array would perhaps take up too much memory, since the list of Files in this context could be in the hundreds of possible entries.
Rather than eat up excessive amounts of memory with a list of File objects, I figured that gathering a list of filenames and storing them as java.lang.String objects would be cheaper on memory. Now, here's my problem: given that these files are to be deleted, which of these would be cheaper?
- Storing an array of File objects rather than String objects, and calling .delete() on each one in a loop (too much memory used).
- Storing an array of String objects with the filenames, but for each iteration of the loop, creating a new File object from the current filename and calling .delete() on it (which means a new File object is created and destroyed on every iteration, possibly using too much processor power).
I want to make the program as fast as possible, so either approach has its merits, and I just want to see which of these has the least overhead. Thanks in advance!
Comments (5)
A java.io.File represents the filename information/metadata about an entry in the filesystem; it does not contain the contents of the file.
In other words, code like new File("somelarge.txt") does not load the somelarge.txt file into memory.
The only real data that each File object contains is a String path to the file (along with a transient int prefixLength). Consider the File class merely a wrapper around the String path that knows how to invoke all of the filesystem operations.
The best choice here, barring some other requirements, is the code that is the easiest to read and conveys your intent the best.
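To make the comparison concrete, here is a minimal sketch of the two options from the question side by side. The class name DeleteStrategies and the method name deleteAll are made up for illustration; only java.io.File and its delete(), getPath() and createTempFile() calls come from the JDK.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper class, not from the original post.
final class DeleteStrategies {

    // Option 1: keep File objects around and call delete() on each one.
    static int deleteAll(File[] files) {
        int deleted = 0;
        for (File f : files) {
            if (f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    // Option 2: keep only the path strings and wrap each one
    // in a short-lived File just long enough to delete it.
    static int deleteAll(String[] paths) {
        int deleted = 0;
        for (String p : paths) {
            if (new File(p).delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Constructing a File only stores the path; nothing is read from disk.
        File notLoaded = new File("somelarge.txt");
        System.out.println("path only: " + notLoaded.getPath());

        // Temp files stand in for the real files to be deleted.
        File a = File.createTempFile("sketch", ".tmp");
        File b = File.createTempFile("sketch", ".tmp");
        System.out.println("deleted via File[]:   " + deleteAll(new File[] { a }));
        System.out.println("deleted via String[]: " + deleteAll(new String[] { b.getPath() }));
    }
}
```

Both loops do the same filesystem work; the only difference is when the thin File wrapper gets allocated, so readability is a reasonable tie-breaker.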
I don't want to be rude, but let me start by invoking the "avoid premature optimization at all costs" mantra. Is your code performance sensitive? Do you have memory usage constraints? Neither hundreds of File objects nor hundreds of File object creations in a loop sounds that bad. Still, if you really feel like optimizing, use a profiler and run some benchmarks with both strategies. I would personally recommend the NetBeans Profiler.
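If you want a rough first impression before reaching for a profiler, a crude timing harness along these lines could compare the two strategies. This is only a sketch: the class and method names (DeleteTiming, createTempFiles, timeFileArray, timeStringArray) are invented here, the file count of 500 is arbitrary, and a single System.nanoTime() measurement ignores JIT warm-up and filesystem caching, so treat the numbers as indicative at best.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical timing harness, not a rigorous benchmark.
final class DeleteTiming {

    // Creates n empty temp files and returns their paths.
    static String[] createTempFiles(int n) throws IOException {
        String[] paths = new String[n];
        for (int i = 0; i < n; i++) {
            paths[i] = File.createTempFile("timing", ".tmp").getPath();
        }
        return paths;
    }

    // Strategy A: File objects are built up front; only the delete loop is timed.
    static long timeFileArray(String[] paths) {
        File[] files = new File[paths.length];
        for (int i = 0; i < paths.length; i++) {
            files[i] = new File(paths[i]);
        }
        long start = System.nanoTime();
        for (File f : files) {
            f.delete();
        }
        return System.nanoTime() - start;
    }

    // Strategy B: a new File is created inside the timed loop for every path.
    static long timeStringArray(String[] paths) {
        long start = System.nanoTime();
        for (String p : paths) {
            new File(p).delete();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int n = 500; // arbitrary; "hundreds" as in the question
        long fileNanos = timeFileArray(createTempFiles(n));
        long stringNanos = timeStringArray(createTempFiles(n));
        System.out.printf("File[]:   %d us%n", fileNanos / 1000);
        System.out.printf("String[]: %d us%n", stringNanos / 1000);
    }
}
```

In practice the delete() system calls are likely to dominate either loop, which is why a real profiler run is the better judge.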
A File is largely a wrapper for a String and consumes up to 32 bytes more than the String itself. If you have 1000 of these in a server where memory costs about $70/GB, the extra memory it consumes is worth about 0.22 cents. This is about the same as 1 second of your time if you are on minimum wage.
Unless you have a memory-limited device, you probably don't need to worry about anything that consumes less than 1 MB.
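The arithmetic behind that estimate can be written out as follows. Two things here are assumptions rather than figures from the post: 1 GB is taken as 10^9 bytes, and the minimum wage is taken as 7.25 dollars per hour purely for illustration.

```latex
\begin{align*}
1000 \times 32\ \text{B} &= 32{,}000\ \text{B} = 3.2 \times 10^{-5}\ \text{GB} \\
3.2 \times 10^{-5}\ \text{GB} \times \$70/\text{GB} &= \$0.00224 \approx 0.22\ \text{cents} \\
\frac{\$7.25/\text{h}}{3600\ \text{s/h}} &\approx \$0.0020/\text{s} \approx 0.20\ \text{cents per second}
\end{align*}
```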
Unless you are working with a seriously resource-starved system, you do not have the problem that you think you have.
Remember that a Java File object is only "an abstract representation of file and directory pathnames." As such, it represents a fixed memory cost for any file, no matter how large. If you are dealing with only hundreds of Files, you are almost certainly not approaching any sort of limit on heap space.
If you create a solution and find, through profiling and monitoring, that you are facing memory limitations, this implementation is one of the last places that you should look. It's simply not that much memory.
So, in short, you should write the code that you understand the best and will be able to maintain in the future. Simple code is your friend.
Sounds like premature optimization to me unless
Having said that, an array of String objects beats an array of File objects in terms of memory and speed. There are several reasons for this:
A File object has a number of private attributes, including but not limited to:
- a private String field attribute
- a transient prefix length field for filesystem-specific prefixes
A File object instantiation relies on a static reference to a concrete implementation of java.io.FileSystem, which the File constructor(s) call.
So, the cost of creating an array of n File instances equals
With an array of n String pathnames, the cost is just
In terms of space, the memory footprint of an array of n File objects will be:
whereas an array of n String objects will be
For simplicity and convenience, I'd go with an array of Strings. I wouldn't even have bothered to do any estimation. Simpler tends to be better most of the time. And this minutia will only matter if you are working with a very constrained device (a mobile phone, for instance).