Should I change buffered reads to in-memory/tokenized reads for a 100,000-line file on an Android app?

Posted 2024-11-19 16:00:51


Currently I am loading a text file that contains 100,000 lines into a SortedMap using buffered reads. Should I abandon this approach and instead load the entire file into memory and then tokenize by line feeds into the SortedMap? Note, I have to parse each line to extract the key and create a per-key supporting object that I then insert into the SortedMap. The file is less than 4MB in size so that fits in line with Android's in-memory file size limitations. I am wondering if it's worth the effort to switch to the in-memory approach or if the speed-up gained just isn't worth it.

Also, would a HashMap be a lot faster than a SortedMap? I only need lookup-by-key and can live without the sorted keys if necessary, but it would be nice to have around. If there is a better structure than what I am using let me know and if you have any Android speed tips related to this issue please mention those too.
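
For reference, a minimal sketch of the buffered, line-at-a-time loading described above (the `Record` class and the tab-delimited "key\tvalue" layout are assumptions for illustration, not the asker's actual format):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;

public class LineLoader {
    // Placeholder for the per-key supporting object the question mentions.
    static class Record {
        final String payload;
        Record(String payload) { this.payload = payload; }
    }

    // Reads the stream one line at a time and inserts each entry into a TreeMap.
    static SortedMap<String, Record> load(InputStream in) throws IOException {
        SortedMap<String, Record> map = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');   // assumed delimiter
                if (tab < 0) continue;          // skip blank/malformed lines
                map.put(line.substring(0, tab),
                        new Record(line.substring(tab + 1)));
            }
        }
        return map;
    }
}
```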

-- roschler


1 Answer

随波逐流 2024-11-26 16:00:52


It's unclear to me why it would be simpler to load the entire file into memory and then tokenize. Reading a line at a time and parsing it that way is pretty simple, isn't it? While I'm all for loading things all at once when it genuinely makes things simpler, I can't see that it would be significantly easier here.
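
To make that comparison concrete, here is roughly what the load-everything-then-split variant might look like (a sketch under the same assumed tab-delimited layout, with plain `String` values to keep it short). Note that the per-line parsing work does not go away; only the I/O pattern changes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;

public class WholeFileLoader {
    static SortedMap<String, String> load(InputStream in) throws IOException {
        // Slurp the whole stream into memory first.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            bytes.write(chunk, 0, n);
        }
        String whole = new String(bytes.toByteArray(), StandardCharsets.UTF_8);

        // Then tokenize by line feeds: the same split/parse/insert work
        // as the buffered version, just done after the read.
        SortedMap<String, String> map = new TreeMap<>();
        for (String line : whole.split("\n")) {
            int tab = line.indexOf('\t');
            if (tab < 0) continue; // skip blank/malformed lines
            map.put(line.substring(0, tab), line.substring(tab + 1));
        }
        return map;
    }
}
```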

As for SortedMap vs HashMap - typically a HashMap lookup is O(1) if you don't have many hash collisions, whereas a SortedMap lookup is O(log n). How expensive are comparisons compared with hash computations in your object model? With 100,000 elements you'll have around 16-17 comparisons per lookup. Ultimately, I wouldn't want to guess which will be faster - you should test it, as with all performance options. Look at the memory usage too... I would expect a SortedMap to use less memory, but I could easily be wrong.
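
In that spirit, a rough timing harness might look like the following. This is a sketch only: the keys are synthetic, there is no JIT warm-up, and for real measurements on Android you would want `android.os.SystemClock` timing or a proper benchmark harness rather than a bare loop:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapLookupBench {
    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        Map<String, Integer> tree = new TreeMap<>();
        for (int i = 0; i < 100_000; i++) {
            // Synthetic keys; real keys may hash and compare differently.
            String key = "key-" + i;
            hash.put(key, i);
            tree.put(key, i);
        }

        long sum = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sum += hash.get("key-" + i);
        long t1 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sum += tree.get("key-" + i);
        long t2 = System.nanoTime();

        // Printing sum keeps the JIT from optimizing the lookups away.
        System.out.printf("HashMap: %d ms, TreeMap: %d ms (sum=%d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sum);
    }
}
```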
