Java Scanner headache
I have a text file which looks like:
name1
1 0 1 0 1
0 1 1 1 0
0 0 0 0 0
name2
1 0 1 0 1
0 0 1 1 0
0 0 0 0 1
i.e., a plaintext label followed by a few rows with 1/0 separated by spaces. The number of rows of 1/0 is variable, but each row between any two particular labels should have the same number of 1/0s (though might potentially not).
How do I grab each name+rows chunk with a scanner? Is there any elegant way to enforce the consistency on the number of rows (and provide some sort of feedback if they aren't consistent)?
I'm thinking there might be a convenient way with clever delimiter specification, but I can't seem to get that working.
Comments (3)
I would do it the simple way. Grab each line as a String and feed it through, say, a regular expression that matches the 1-or-0-followed-by-space pattern. If it matches, treat it as a row. If not, treat it as a plaintext label. Check row and column size consistency after the fact by checking that every label's array of data matches the size of the first label's array of data.
EDIT: I wasn't aware of the Scanner class, although it sounds handy. I think the essential idea should still be roughly the same: use the Scanner to parse your input, and handle the question of the sizes yourself.
Also, in theory, you could produce a regular expression that would match the label and the entire array, although I don't know if you can produce one that will guarantee that it only matches sets of lines with the same number of values in each row. But then, to set up more automated checking, you'd probably need to construct a second regular expression that exactly matches the array size of the first entry, and use it for all the others. I think this is a case where the cure is worse than the disease.
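A minimal sketch of that idea, not a definitive implementation: read lines with a Scanner, treat anything that matches a 0/1 row pattern as data, and treat anything else as a label. The class name BlockParser, the ROW pattern, and the choice to throw IllegalArgumentException on a mismatch are assumptions made for illustration. It compares each row against the first row of the same block (what the question asks for), and it would misread a label that itself consists only of 0s and 1s.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper: groups 0/1 rows under the label that precedes them.
class BlockParser {
    // Assumption: a data row is 0/1 tokens separated by single spaces.
    private static final Pattern ROW = Pattern.compile("[01]( [01])*");

    /** Maps each label to its rows; throws if rows in one block differ in width. */
    static Map<String, List<int[]>> parse(Scanner in) {
        Map<String, List<int[]>> blocks = new LinkedHashMap<>();
        String label = null;
        int lineNo = 0;
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            lineNo++;
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            if (!ROW.matcher(line).matches()) {
                // Not a 0/1 row, so it starts a new block.
                label = line;
                blocks.put(label, new ArrayList<>());
                continue;
            }
            if (label == null) {
                throw new IllegalArgumentException(
                        "line " + lineNo + ": data row before any label");
            }
            String[] tokens = line.split(" ");
            List<int[]> rows = blocks.get(label);
            // Feedback on inconsistency: compare with the block's first row.
            if (!rows.isEmpty() && rows.get(0).length != tokens.length) {
                throw new IllegalArgumentException(
                        "line " + lineNo + " (" + label + "): expected "
                        + rows.get(0).length + " values but found " + tokens.length);
            }
            int[] row = new int[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                row[i] = Integer.parseInt(tokens[i]);
            }
            rows.add(row);
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sample = "name1\n1 0 1 0 1\n0 1 1 1 0\nname2\n1 0 1 0 1\n";
        Map<String, List<int[]>> blocks = parse(new Scanner(sample));
        for (Map.Entry<String, List<int[]>> e : blocks.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " rows");
        }
    }
}
```

To read the real input, build the Scanner from a File instead of a String, for example new Scanner(new File("data.txt")), where the file name is only a placeholder.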
Even better, after a helpful answer to another question (thanks Bart):
You would need to open the file and loop through every line with readLine() until you hit the end of the file.
-- I assumed you are checking consistency as you traverse the file. If you want to store the information and use it later, I would consider using some type of data structure.
As you traverse the file, you can test each line with a simple regex to see whether it is a label name. If it is not, split the line on ' ' (the space character), which gives you an array of values. Then compare the array's length against the size you expect.
Basic pseudocode:
You could also add another loop if you don't know the size you expect for each row and put some logic in to find the most common size and then figure out what doesn't match. I am unsure of how complicated your consistency checking needs to be.
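A rough Java sketch of the loop described above, including the extra pass that picks the most common row size. The names RowSizeChecker and findMismatches, the "[01 ]+" test for data rows, and the message format are assumptions made for illustration. It uses the most common size across the whole file, and a tie between two sizes is resolved arbitrarily.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports rows whose width differs from the most common width.
class RowSizeChecker {
    static List<String> findMismatches(BufferedReader reader) {
        List<String> rowLabels = new ArrayList<>();   // label in effect for each data row
        List<Integer> rowLines = new ArrayList<>();   // 1-based line number of each data row
        List<Integer> rowWidths = new ArrayList<>();  // number of values in each data row
        Map<Integer, Integer> widthCounts = new HashMap<>();
        String label = "(no label)";
        int lineNo = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                // Assumption: a line made only of 0, 1 and spaces is data.
                if (!line.matches("[01 ]+")) {
                    label = line;
                    continue;
                }
                int width = line.split(" +").length;
                rowLabels.add(label);
                rowLines.add(lineNo);
                rowWidths.add(width);
                widthCounts.merge(width, 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Second loop: find the most common width.
        int expected = -1;
        int best = 0;
        for (Map.Entry<Integer, Integer> e : widthCounts.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                expected = e.getKey();
            }
        }
        // Third loop: report every row that does not match it.
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < rowWidths.size(); i++) {
            if (rowWidths.get(i) != expected) {
                problems.add("line " + rowLines.get(i) + " (" + rowLabels.get(i)
                        + "): expected " + expected + " values, found " + rowWidths.get(i));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String sample = "name1\n1 0 1 0 1\n0 1 1 1 0\nname2\n1 0 1 0\n0 0 0 0 1\n";
        for (String problem : findMismatches(new BufferedReader(new StringReader(sample)))) {
            System.out.println(problem);
        }
    }
}
```

For a real file, wrap a FileReader in the BufferedReader instead of the StringReader used in the demo.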