In Java, how do I iterate over the lines of a text file from back to front?

Posted 2024-08-27 18:36:29 | 806 words | 12 views | 0 comments

Basically I need to take a text file such as:

Fred
Bernie
Henry

and be able to read them from the file in the order of

Henry
Bernie
Fred

The actual file I'm reading from is >30MB, and it would be a less than perfect solution to read the whole file, split it into an array, reverse the array and then go from there. It takes way too long. My specific goal is to find the first occurrence of a string (in this case it's "InitGame") when searching from the end of the file, and then return the position of the beginning of that line.

I did something like this in Python before. My method was to seek to 1024 bytes before the end of the file, read lines until I reached the end, then seek back another 1024 bytes from my previous starting point and, by using tell(), stop when I got to the previous starting point. So I would read those blocks backwards from the end of the file until I found the text I was looking for.

So far, I'm having a heck of a time doing this in Java. Any help would be greatly appreciated and if you live near Baltimore it may even end up with you getting some fresh baked cookies.

Thanks!

More info:

I need to search backwards because the file I am reading is a logfile for a game that I host a server for (it's the |err| server on Urban Terror. Check it out). The log file records every event that happens in the game, and my program then parses each event, processes it and acts on it (for example, it keeps track of headshots for people and also automatically kicks people who are being d-bags). I need to search back to the most recent InitGame entry so that I can instantiate all the player objects and take care of whatever else needs to be taken care of since the beginning of that game. There are hundreds of InitGame events in the file, but I want the last one. If there is a better way of doing this that doesn't require searching backwards, please let me know.

Thanks

Comments (4)

手长情犹 2024-09-03 18:36:29

You can just repeat your Python solution using RandomAccessFile and maybe a custom subclass of LineNumberReader (or just Reader) on top of it.
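
A minimal sketch of that idea, using only RandomAccessFile (no Reader subclass): it walks the file backwards in 1024-byte blocks, as in the Python approach described in the question, and returns the byte offset of the start of the last line that contains the search text. The class name BackwardLineScanner and the method lastLineStart are hypothetical, and the sketch assumes '\n' line endings and UTF-8 (or ASCII) content.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: scans a file from the end towards the start, block by block.
class BackwardLineScanner {
    private static final int BLOCK = 1024; // block size taken from the question

    /** Returns the byte offset where the last line containing needle starts, or -1. */
    static long lastLineStart(Path file, String needle) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long end = raf.length();      // exclusive end of the part not yet scanned
            byte[] carry = new byte[0];   // tail of a line whose start is still unseen
            while (end > 0) {
                int n = (int) Math.min(BLOCK, end);
                long start = end - n;
                // buf = this block followed by the carried-over partial line
                byte[] buf = new byte[n + carry.length];
                raf.seek(start);
                raf.readFully(buf, 0, n);
                System.arraycopy(carry, 0, buf, n, carry.length);

                int lineEnd = buf.length;
                for (int i = n - 1; i >= 0; i--) {
                    if (buf[i] == '\n') {
                        // a complete line occupies buf[i + 1, lineEnd)
                        if (contains(buf, i + 1, lineEnd, needle)) {
                            return start + i + 1;
                        }
                        lineEnd = i;
                    }
                }
                // whatever precedes the first newline continues in the earlier block
                carry = Arrays.copyOfRange(buf, 0, lineEnd);
                end = start;
            }
            // carry now holds the first line of the file
            return contains(carry, 0, carry.length, needle) ? 0 : -1;
        }
    }

    private static boolean contains(byte[] buf, int from, int to, String needle) {
        return new String(buf, from, to - from, StandardCharsets.UTF_8).contains(needle);
    }
}
```

Calling BackwardLineScanner.lastLineStart(path, "InitGame") gives an offset that can be passed to RandomAccessFile.seek() to start reading forward from that line. Lines are only decoded once they are complete, so a multi-byte character split across two blocks should not be a problem.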

我不吻晚风 2024-09-03 18:36:29

Linux has some great text parsing tools that may be better suited than trying to do it in Java.

卷耳 2024-09-03 18:36:29

On searching backwards, two answers come to mind. The first is to search forwards, and keep the last-found InitGame text around for the moment when you reach the end of the file (and overwrite it whenever another InitGame comes along as you are reading the file).
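
As a rough illustration of this first option, the sketch below reads the file front to back once and remembers where the most recent matching line started. The names ForwardLastMatch and lastLineStart are made up for the example, and it assumes '\n' line endings and UTF-8 text.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: single forward pass, keeping only the latest match.
class ForwardLastMatch {
    /** Returns the byte offset where the last line containing needle starts, or -1. */
    static long lastLineStart(Path file, String needle) throws IOException {
        long lastMatch = -1;   // overwritten each time another matching line is seen
        long pos = 0;          // offset of the next byte to be read
        long lineStart = 0;    // offset where the current line began
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    if (matches(line, needle)) {
                        lastMatch = lineStart;
                    }
                    line.reset();
                    lineStart = pos;
                } else {
                    line.write(b);
                }
            }
        }
        // handle a final line that has no trailing newline
        if (line.size() > 0 && matches(line, needle)) {
            lastMatch = lineStart;
        }
        return lastMatch;
    }

    private static boolean matches(ByteArrayOutputStream line, String needle) {
        return new String(line.toByteArray(), StandardCharsets.UTF_8).contains(needle);
    }
}
```

This touches every byte of the file, so for a log of several hundred MB it will likely be slower than searching from the end, but it is simple and needs no seeking.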

The second solution is to find out the file-size (using f.length()), divide that into large chunks that overlap by more than the maximum size of an InitGame snippet (to avoid problems due to splitting two chunks right on the interesting part), and start reading from the last one and progressing towards the file start (using a Reader's skip() function to jump to your desired reading position: no actual file-splitting is necessary). If you are sure that there are no funny multi-byte chars, RandomAccessFile can be useful.
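
One possible shape for the chunked search is sketched below. Note that it swaps Reader.skip() for RandomAccessFile.seek() and compares raw bytes, which sidesteps the multi-byte concern as long as the search text is encoded the same way as the file (UTF-8 is assumed here). The names ChunkedBackwardSearch, lastLineStart and the chunkSize parameter are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper: searches overlapping chunks, last chunk first.
class ChunkedBackwardSearch {
    /** Returns the byte offset where the line holding the last needle starts, or -1. */
    static long lastLineStart(Path file, String needle, int chunkSize) throws IOException {
        byte[] pat = needle.getBytes(StandardCharsets.UTF_8);
        if (pat.length == 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("needle and chunkSize must be non-empty/positive");
        }
        int overlap = pat.length - 1; // enough to catch a match straddling two chunks
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long len = raf.length();
            for (long end = len; end > 0; end -= chunkSize) {
                long start = Math.max(0, end - chunkSize);
                // read [start, end) plus the overlap into the following chunk
                byte[] buf = new byte[(int) (Math.min(len, end + overlap) - start)];
                raf.seek(start);
                raf.readFully(buf);
                int hit = lastIndexOf(buf, pat);
                if (hit >= 0) {
                    return lineStartBefore(raf, start + hit);
                }
            }
        }
        return -1;
    }

    // Naive backwards byte search; returns the last index of pat in buf, or -1.
    private static int lastIndexOf(byte[] buf, byte[] pat) {
        outer:
        for (int i = buf.length - pat.length; i >= 0; i--) {
            for (int j = 0; j < pat.length; j++) {
                if (buf[i + j] != pat[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Walks back from pos to just after the previous '\n' (or to the file start).
    private static long lineStartBefore(RandomAccessFile raf, long pos) throws IOException {
        while (pos > 0) {
            raf.seek(pos - 1);
            if (raf.read() == '\n') {
                return pos;
            }
            pos--;
        }
        return 0;
    }
}
```

A call such as ChunkedBackwardSearch.lastLineStart(path, "InitGame", 64 * 1024) would use 64 KB chunks; the chunk size is a tuning choice, not something the answer prescribes.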

The most efficient solution, of course, is to read the log-file output as it comes out, keeping a reference to the last-found InitGame. That way you will never have to read the same data twice. You can even set things up so that your Java program wakes up once every few seconds, looks at the file, and reads in the newly-added lines.
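
The polling idea could look roughly like the sketch below: an object remembers how far it has read and, each time it is polled, returns only the complete lines appended since then. LogFollower, poll and MAX_READ are hypothetical names, and the handling of a shrunken file (restart from offset 0) is an assumption about how log cleanup should be treated.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: returns lines appended to a file since the previous poll.
class LogFollower {
    private static final int MAX_READ = 1 << 20; // read at most 1 MB per poll

    private final Path file;
    private long offset; // byte offset of the first unread line

    LogFollower(Path file, long startOffset) {
        this.file = file;
        this.offset = startOffset;
    }

    /** Reads the complete lines added since the last call; a partial line is left for later. */
    List<String> poll() throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long len = raf.length();
            if (len < offset) {
                offset = 0; // assumption: the log was truncated or replaced, so start over
            }
            if (len == offset) {
                return lines; // nothing new
            }
            byte[] buf = new byte[(int) Math.min(len - offset, MAX_READ)];
            raf.seek(offset);
            raf.readFully(buf);
            int lineStart = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') {
                    // drop an optional '\r' before the newline
                    int lineEnd = (i > lineStart && buf[i - 1] == '\r') ? i - 1 : i;
                    lines.add(new String(buf, lineStart, lineEnd - lineStart, StandardCharsets.UTF_8));
                    lineStart = i + 1;
                }
            }
            offset += lineStart; // only advance past lines that ended with '\n'
        }
        return lines;
    }
}
```

A caller might construct new LogFollower(path, startOffset) with the offset of the last InitGame line and then call poll() every few seconds, for example from a ScheduledExecutorService. A single line longer than MAX_READ is not handled by this sketch.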

橘和柠 2024-09-03 18:36:29

So, TIL that I need to be more verbose when I explain exactly what I'm doing. Basically, I am writing a program that manages a game server that I run. In order for the program to be in sync with the game, it needs to find the most recent InitGame line and then read from there so that it can record all the hits, kills, connects and disconnects that it needs to from the beginning of the round. Since a logfile can be quite huge (the last time I forgot to clean one up it was more than 500MB of text), rather than searching from the front, I want to search from the back. In Java there was no built-in way to do this. After searching over a good amount of the internets, I came upon this: http://mattfleming.com/node/11. From that I took the BackwardsFileInputStream class and used it. Then, in my application, I reverse the chars. Next time I should be able to construct my own method, now that I see how it's done and have a better understanding.

So, once the program has read the logfile from the most recent InitGame, it will mimic tail -f and read the logfile as it is written.
