Error when using StringTokenizer on a multi-line text file
I'm trying to read a text file and split it into individual words using the StringTokenizer utility in Java.
The text file looks like this:
a 2000
4
b 3000
c 4000
d 5000
What I'm trying to do is get each individual character from the text file and store it in an ArrayList. At the end, I try to print every element of the ArrayList.
Here is my code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.StringTokenizer;

public static void main(String[] args) {
    String fileSpecified = args[0];
    fileSpecified = fileSpecified.concat(".txt");
    String line;
    System.out.println ("file Specified = " + fileSpecified);
    ArrayList <String> words = new ArrayList<String> ();
    try {
        FileReader fr = new FileReader (fileSpecified);
        BufferedReader br = new BufferedReader (fr);
        line = br.readLine();
        StringTokenizer token;
        while ((line = br.readLine()) != null) {
            token = new StringTokenizer (line);
            words.add(token.nextToken());
        }
    } catch (IOException e) {
        System.out.println (e.getMessage());
    }
    for (int i = 0; i < words.size(); i++) {
        System.out.println ("words = " + words.get(i));
    }
}
The error message I get is this:
Exception in thread "main" java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken<Unknown Source>
at getWords.main<getWords.java:32>
where 'getWords' is the name of my Java file.
Thank you.
Comments (4)
a) You always have to check StringTokenizer.hasMoreTokens() first. Throwing NoSuchElementException is the documented behaviour when no more tokens are available.
b) Don't create a new tokenizer for every line unless your file is too large to fit into memory. Read the entire file into a String and let the tokenizer work on that.
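As a rough illustration of point b), the sketch below tokenizes a whole text in one pass. It relies on the fact that StringTokenizer's default delimiters include the newline character, so blank lines simply produce no tokens. The class name WholeTextTokens, the method tokenize, and the sample text (with a blank second line) are assumptions made for this example, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class WholeTextTokens {

    // Hypothetical helper: collects every whitespace-separated token of the text.
    // The default delimiters are space, tab, newline, carriage return and form feed.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        // Check hasMoreTokens() before every nextToken() call.
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // Assumed sample content, standing in for the file read into one String.
        String text = "a 2000\n\nb 3000\nc 4000\nd 5000\n";
        for (String word : tokenize(text)) {
            System.out.println("words = " + word);
        }
    }
}
```

On Java 11 or later, the text could be obtained with Files.readString(Path.of(fileSpecified)) instead of the hard-coded sample.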
Your general approach seems sound, but you have a basic problem in your code.
Your parser is most likely failing on the second line of your input file. That line is blank, so when you call
words.add(token.nextToken());
you get an error, because there are no tokens. Calling nextToken() only once per line also means you'll only ever get the first token on each line. You should iterate over the tokens like this:
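The answer's own snippet is not preserved at this point. A minimal sketch of such a loop, assuming a hypothetical helper addTokens and made-up sample lines, might look like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class LineTokens {

    // Hypothetical helper: adds every token of one line to words.
    // A blank line has no tokens, so the loop body never runs for it.
    static void addTokens(String line, List<String> words) {
        StringTokenizer token = new StringTokenizer(line);
        while (token.hasMoreTokens()) {
            words.add(token.nextToken());
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        // Assumed sample lines, including a blank one.
        String[] lines = {"a 2000", "", "b 3000"};
        for (String line : lines) {
            addTokens(line, words);
        }
        System.out.println(words); // prints [a, 2000, b, 3000]
    }
}
```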
You can find a more general example in the javadocs here:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html
This problem occurs because you don't test whether there is a next token before trying to get it. You should always check that hasMoreTokens() returns true before calling nextToken(). But you have other bugs:
You need to use the hasMoreTokens() method. I also addressed various coding standard issues in your code, as pointed out by JB Nizet.
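The revised code that accompanied this answer is not preserved here. One possible reworking of the question's program is sketched below, under a few assumptions: the class is renamed GetWords, the helper readWords is hypothetical, and Java 7 or later is available for try-with-resources. Besides the hasMoreTokens() check, it drops the extra br.readLine() call before the loop, which in the question's code discards the first line of the file, and it closes the reader when done.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class GetWords {

    // Hypothetical helper: reads all lines and collects every token on each line.
    // Read failures are rethrown unchecked so callers need no throws clause.
    static List<String> readWords(BufferedReader reader) {
        List<String> words = new ArrayList<>();
        try {
            String line;
            // No readLine() call before the loop, so the first line is kept.
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    words.add(tokenizer.nextToken());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java GetWords <file name without .txt>");
            return;
        }
        String fileSpecified = args[0].concat(".txt");
        System.out.println("file Specified = " + fileSpecified);
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader(fileSpecified))) {
            for (String word : readWords(br)) {
                System.out.println("words = " + word);
            }
        } catch (IOException | UncheckedIOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```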