Android - OutOfMemory when reading a text file
I'm making a dictionary app on Android. During startup, the app loads the contents of an .index file (~2 MB, 100,000+ lines).
However, when I use BufferedReader.readLine() and do something with the returned string, the app runs out of memory (OutOfMemory).
// Read file snippet
Set<String> indexes = new HashSet<String>();
FileInputStream is = new FileInputStream(indexPath);
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
String readLine;
while ((readLine = reader.readLine()) != null) {
    indexes.add(extractHeadWord(readLine));
}

// And the extractHeadWord method
private String extractHeadWord(String string) {
    // The headword is the first tab-separated field of the line
    String[] splitted = string.split("\\t");
    return splitted[0];
}
Reading the log, I found that during execution the GC cleans up objects many times (GC_EXPLICIT freed xxx objects, where xxx is a large number such as 15000 or 20000).
I also tried another way:
final int BUFFER = 50;
char[] readChar = new char[BUFFER];
//.. construct BufferedReader
while (reader.read(readChar) != -1) {
    indexes.add(new String(readChar));
    readChar = new char[BUFFER];
}
...and it ran very fast. But it was not exactly what I wanted.
Is there a solution that runs as fast as the second snippet and is as easy to use as the first?
Regards.
Comments (2)
The extractHeadWord method uses String.split. This method does not copy the character data into new strings; the pieces it returns rely on the underlying string (in your case the line object) and use indexes to mark out the "new" string. Since you are not interested in the rest of the line, you need to discard it so that it gets garbage collected; otherwise the whole line stays in memory even though you only use part of it.
Calling the String(String) constructor (the "copy constructor") discards the rest of the string.
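
A minimal sketch of that change, assuming a runtime where the strings returned by split share the parent line's character array, as this answer describes. On Java 7u6 and later, substring copies its characters, so the extra copy is harmless but no longer needed. The class name HeadWordCopy and the sample line are made up for illustration.

```java
class HeadWordCopy {
    // Returns the first tab-separated field as an independent copy,
    // so the rest of the line is not kept alive by the returned string.
    static String extractHeadWord(String line) {
        String[] splitted = line.split("\\t");
        // Copy constructor: the result holds only the headword's characters
        return new String(splitted[0]);
    }

    public static void main(String[] args) {
        // Hypothetical index line: headword, then two numeric fields
        System.out.println(extractHeadWord("apple\t1024\t56")); // prints "apple"
    }
}
```
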

What happens if your extractHeadWord does return new String(splitted[0]); instead?
It will not reduce the number of temporary objects, but it might reduce the memory footprint of the application. I don't know whether split works the same way as substring, but I guess that it does. substring creates a new view over the original data, which means that the full character array is kept in memory. Explicitly invoking new String(string) truncates the data.
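
Putting the suggestion into the original read loop might look like the sketch below. It keeps readLine() but copies only the headword into the set. Using indexOf and substring instead of split is a substitution of my own, not something either answer proposes; it avoids allocating an array and the unused fields for every line, which may help with the temporary objects mentioned above. IndexLoader and loadHeadWords are hypothetical names, and the StringReader stands in for the real .index file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

class IndexLoader {
    // Hypothetical helper: reads every line and stores a compact copy of its headword.
    static Set<String> loadHeadWords(Reader source) throws IOException {
        Set<String> indexes = new HashSet<String>();
        BufferedReader reader = new BufferedReader(source);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // The headword ends at the first tab; a line without a tab is taken whole
                int tab = line.indexOf('\t');
                String head = (tab < 0) ? line : line.substring(0, tab);
                // Copy so the set never references the full line's characters
                indexes.add(new String(head));
            }
        } finally {
            reader.close();
        }
        return indexes;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data in place of the real index file
        Set<String> words = loadHeadWords(new StringReader("apple\t0\t10\nbanana\t10\t7\n"));
        System.out.println(words.size()); // prints 2
    }
}
```

For the real file, the same method could be given new InputStreamReader(new FileInputStream(indexPath)), as in the question's first snippet.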