Java's PushbackReader and EOF
I am writing a parser in Java and trying to leverage Java's PushbackReader. My parser may need to backtrack if it guessed incorrectly - but once the reader reaches EOF, that fails.
Let's say I am parsing a quoted string and looking for the closing quote. If any of my parser plugins can't finish, it tries to leave the reader in its original state and pass it to the next plugin. That is, I generally push the chars back onto the buffer and let the next element try to parse the buffer.
Unfortunately, if I 'read' all the way to the last character ... and then read the EOF, the PushbackReader will not allow me to push anything back onto it. Consequently, my parsing can't complete since those chars are lost!
Do I need to write my own reader for this type of string processing?
EDIT: Furthermore, when I read the last character (the one just before EOF), I can't "unread" that char either. Is there a standard workaround for this, short of creating my own stack or buffer implementation?
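A minimal sketch for reproducing the situation described above, under assumptions that are not in the question: the input comes from a StringReader, and the class and method names (PushbackCapacityDemo, canRewindAfterEof) are made up for illustration. One thing it helps rule out is buffer capacity. The single-argument PushbackReader constructor allocates a one-character pushback buffer, and unread throws an IOException once that buffer is full. Whether this is the cause of the failure in the question is not known.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical demo class; not part of the original question.
class PushbackCapacityDemo {

    /**
     * Reads the input until read() returns -1, then tries to push every
     * consumed character back. Returns true if the pushback succeeds and the
     * input can be re-read unchanged, false if the pushback buffer overflows.
     */
    static boolean canRewindAfterEof(String input, int bufferSize) throws IOException {
        PushbackReader reader = new PushbackReader(new StringReader(input), bufferSize);
        StringBuilder consumed = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            consumed.append((char) c);
        }
        try {
            // Pushes the whole array back; its first char is the next one read.
            reader.unread(consumed.toString().toCharArray());
        } catch (IOException overflow) {
            return false; // more chars than the pushback buffer can hold
        }
        StringBuilder again = new StringBuilder();
        while ((c = reader.read()) != -1) {
            again.append((char) c);
        }
        return again.toString().contentEquals(consumed);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canRewindAfterEof("\"abc", 1));  // buffer of one char
        System.out.println(canRewindAfterEof("\"abc", 16)); // room for the whole input
    }
}
```

In this sketch the only thing that changes between the two calls is the buffer size passed to the two-argument constructor, so a parser that backtracks over whole tokens would need a buffer at least as large as its longest lookahead.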
Comments (2)
I appreciated this question and answer because it saved me time looking into a solution that wasn't suited to what I needed, which was a PushbackLineReader.
I was parsing a log file and needed a block of data with pattern one and pattern two in it. So I needed to read an entire line at a time, and once I had read one line past the line that had pattern two, I needed to push back the line just read.
I decided to write my own class to do that, which was simple enough given that I only had one line to push back.
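The class itself is not included in the comment. A minimal sketch of what a one-line pushback reader could look like follows; everything in it is an assumption for illustration, including the method names readLine and unreadLine, the readBlock helper, and the sample log text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction; the commenter's actual class is not shown.
class PushbackLineReader {
    private final BufferedReader in;
    private String pushedBack; // null when no line is waiting to be re-read

    PushbackLineReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    /** Returns the pushed-back line if there is one, else the next line, or null at end of input. */
    String readLine() throws IOException {
        if (pushedBack != null) {
            String line = pushedBack;
            pushedBack = null;
            return line;
        }
        return in.readLine();
    }

    /** Stores a single line to be returned by the next readLine() call. */
    void unreadLine(String line) {
        if (line == null) {
            throw new IllegalArgumentException("cannot push back end of input");
        }
        if (pushedBack != null) {
            throw new IllegalStateException("a line is already pushed back");
        }
        pushedBack = line;
    }

    /** Collects lines through the one containing "pattern two", then gives back the line read past it. */
    static List<String> readBlock(PushbackLineReader reader) throws IOException {
        List<String> block = new ArrayList<>();
        boolean sawEnd = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (sawEnd) {
                reader.unreadLine(line); // one line too far: push it back
                break;
            }
            block.add(line);
            sawEnd = line.contains("pattern two");
        }
        return block;
    }

    public static void main(String[] args) throws IOException {
        // Made-up log text for the demo.
        String log = "pattern one\ndata\npattern two\nnext entry\n";
        PushbackLineReader reader = new PushbackLineReader(new StringReader(log));
        System.out.println(readBlock(reader));
        System.out.println(reader.readLine());
    }
}
```

Holding a single String rather than a stack keeps the class small, and rejecting null in unreadLine avoids pushing back the end-of-input marker.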
Be careful never to unread the EOF marker. The buffer inside PushbackReader is of type char[], so the integer -1 gets converted to the char 0xFFFF, which will then be the next character returned from the read method. For example, when parsing a quoted string, always include a check for -1 in addition to the closing quote character and handle it as a failure case, for example by throwing an IOException.
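The advice above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: the class QuotedStringParser and its methods are hypothetical, escape sequences are ignored, and the caller is assumed to have created the PushbackReader with a pushback buffer large enough for the longest token.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical example class; names are illustrative only.
class QuotedStringParser {

    /**
     * Parses a double-quoted string and returns its contents.
     * On failure, the consumed characters are pushed back (never the -1)
     * and an IOException is thrown so another parser can try.
     */
    static String parseQuoted(PushbackReader reader) throws IOException {
        StringBuilder consumed = new StringBuilder();
        int c = reader.read();
        if (c == -1) {
            throw new IOException("unexpected end of input"); // nothing to unread
        }
        if (c != '"') {
            reader.unread(c);
            throw new IOException("expected opening quote");
        }
        consumed.append((char) c);
        while (true) {
            c = reader.read();
            if (c == -1) {
                // Restore only the real characters; the EOF marker is not unread.
                reader.unread(consumed.toString().toCharArray());
                throw new IOException("unterminated quoted string");
            }
            consumed.append((char) c);
            if (c == '"') {
                return consumed.substring(1, consumed.length() - 1);
            }
        }
    }

    /** Demonstrates the pitfall: unreading -1 stores the char 0xFFFF in the buffer. */
    static int readAfterUnreadingEof() throws IOException {
        PushbackReader reader = new PushbackReader(new StringReader(""));
        int eof = reader.read(); // -1, the input is empty
        reader.unread(eof);      // stored as (char) -1
        return reader.read();    // no longer -1
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Integer.toHexString(readAfterUnreadingEof()));
        PushbackReader reader = new PushbackReader(new StringReader("\"hello\" world"), 64);
        System.out.println(parseQuoted(reader));
    }
}
```

Keeping the consumed characters in a StringBuilder means the failure path can restore exactly what was read, without the -1 ever reaching unread.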