How do I use regular expressions to parse a file in Java?

Posted 2024-10-16 21:39:33

I'm trying to use a series of regular expressions to parse tokens from a file. I need to count newlines and be able to separate tokens that have no space between them. Unfortunately, java.util.Scanner's findWithinHorizon() method searches the entire rest of the input stream (up to the horizon) for the START of a regex match, whereas I want to match the regex starting exactly at the current file position. Specifically, I have a set of regexes and want to loop through them to see which one matches at the current position in the file, then advance the file position to just after the match, and continue. Is this possible?

Scanner's next() method seems to be useless for this because it enforces delimiters and the regex must match the entire token. I want to match from the current file position, get the matched string, and advance the file position to just after the match.

Comments (1)

初熏 2024-10-23 21:39:34

Options:

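  1. Read the whole file into memory as a String, then use a Matcher directly at the positions you want. Calling Matcher.region() followed by Matcher.lookingAt() anchors the match at a given offset instead of searching ahead.

  2. Use a FileChannel acquired from a RandomAccessFile as the input for the Scanner. You can then manipulate the position of the channel directly. Note that Scanner buffers its input, so the channel position may be ahead of what the Scanner has actually consumed.

  3. Use a FileChannel as above, but use Matcher directly for greater flexibility.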
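Option 1 can be sketched roughly as follows. This is a minimal illustration rather than the answerer's own code: the class name `SimpleLexer`, the `Token` class, and the five token rules are hypothetical placeholders. It relies on `Matcher.region()` plus `Matcher.lookingAt()`, which only succeeds when the match begins at the start of the region, and it counts newlines as it goes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleLexer {
    /** A matched token: the rule name, the matched text, and the 1-based line number. */
    public static final class Token {
        public final String type;
        public final String text;
        public final int line;

        Token(String type, String text, int line) {
            this.type = type;
            this.text = text;
            this.line = line;
        }

        @Override
        public String toString() {
            return type + "(" + text + ")@" + line;
        }
    }

    // Hypothetical token rules, tried in insertion order; replace with your own.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("NEWLINE", Pattern.compile("\\r?\\n"));
        RULES.put("SPACE", Pattern.compile("[ \\t]+"));
        RULES.put("NUMBER", Pattern.compile("\\d+"));
        RULES.put("IDENT", Pattern.compile("[A-Za-z_]\\w*"));
        RULES.put("SYMBOL", Pattern.compile("[-+*/=();]"));
    }

    public static List<Token> tokenize(CharSequence input) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;   // current offset into the input
        int line = 1;  // current line number, bumped on every NEWLINE match
        while (pos < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() succeeds only if the match starts exactly at pos;
                // the end() check guards against rules that match the empty string.
                if (m.lookingAt() && m.end() > pos) {
                    String type = rule.getKey();
                    if (type.equals("NEWLINE")) {
                        line++;
                    } else if (!type.equals("SPACE")) {
                        tokens.add(new Token(type, m.group(), line));
                    }
                    pos = m.end(); // advance to just past the match
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException(
                        "No rule matches at offset " + pos + " (line " + line + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("x=42\ny = x+1;")) {
            System.out.println(t);
        }
    }
}
```

The first rule that matches wins, so rule order matters. A longest-match policy would instead try every rule at the current offset and keep the one with the greatest `end()`.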

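An example of using a Matcher with a RandomAccessFile:

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;

// file is an open RandomAccessFile; pattern is a compiled Pattern
FileChannel fc = file.getChannel();
fc.lock(0, Long.MAX_VALUE, true); // shared lock, so it doesn't change under you; works on a read-only channel

ByteBuffer bb = ByteBuffer.allocate(BUFFER_SIZE);
fc.read(bb);
bb.flip(); // switch the buffer from being filled to being read

// Decode the bytes explicitly; bb.asCharBuffer() would reinterpret them as UTF-16
CharBuffer cb = StandardCharsets.UTF_8.decode(bb); // assumes the file is UTF-8
Matcher matcher = pattern.matcher(cb);
// etc.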
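One way the remaining steps might look is sketched below, under some assumptions: the file is small enough to fit in a single buffer, and the caller knows its charset. The class `ChannelMatchDemo` and its helpers `readAll` and `matchAt` are hypothetical names introduced only for this illustration. The sketch reads the file through a FileChannel until end of file, decodes it, and then tests a pattern at one exact offset.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChannelMatchDemo {
    /** Reads the whole file through a FileChannel and decodes it with the given charset. */
    public static CharBuffer readAll(Path path, Charset charset) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r");
             FileChannel fc = file.getChannel()) {
            // Assumption: the file is small enough for one int-sized buffer.
            ByteBuffer bb = ByteBuffer.allocate((int) fc.size());
            while (bb.hasRemaining() && fc.read(bb) != -1) {
                // read() may deliver fewer bytes than requested, so loop until full or EOF
            }
            bb.flip();
            return charset.decode(bb);
        }
    }

    /** Returns the text matched exactly at pos, or null if the pattern does not match there. */
    public static String matchAt(Pattern pattern, CharSequence text, int pos) {
        Matcher m = pattern.matcher(text);
        m.region(pos, text.length());
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("tokens", ".txt");
        Files.write(tmp, "abc123".getBytes(StandardCharsets.UTF_8));
        CharBuffer text = readAll(tmp, StandardCharsets.UTF_8);
        Pattern digits = Pattern.compile("\\d+");
        System.out.println(matchAt(digits, text, 0)); // no digits at offset 0
        System.out.println(matchAt(digits, text, 3)); // digits begin at offset 3
        Files.delete(tmp);
    }
}
```

For files too large to hold in memory, the buffer would have to be refilled in chunks, and a match that straddles a chunk boundary needs extra care; `Matcher.hitEnd()` can signal that more input might change the result.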