String tokenizer
Can anybody help me understand how this string tokenizer works by adding some comments into the code? I would very much appreciate any help thanks!
public String[] split(String toSplit, char delim, boolean ignoreEmpty) {
    StringBuffer buffer = new StringBuffer();
    Stack stringStack = new Stack();
    for (int i = 0; i < toSplit.length(); i++) {
        if (toSplit.charAt(i) != delim) {
            buffer.append((char) toSplit.charAt(i));
        } else {
            if (buffer.toString().trim().length() == 0 && ignoreEmpty) {
            } else {
                stringStack.addElement(buffer.toString());
            }
            buffer = new StringBuffer();
        }
    }
    if (buffer.length() != 0) {
        stringStack.addElement(buffer.toString());
    }
    String[] split = new String[stringStack.size()];
    for (int i = 0; i < split.length; i++) {
        split[split.length - 1 - i] = (String) stringStack.pop();
    }
    stringStack = null;
    buffer = null;
    // System.out.println("There are " + split.length + " Words");
    return split;
}
Not the best-written method in the world! But comments below. Overall, what it does is split a string into "words", using the character delim to delimit them. If ignoreEmpty is true, then empty words are not counted (i.e. two consecutive delimiters act as one).
You could write a far more efficient one using the String.split method, translating the delimiter into a suitable regular expression (ending with + if ignoreEmpty is true).
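A minimal sketch of the String.split idea described above, under a few assumptions: the class name RegexSplitSketch is made up for illustration, the delimiter is quoted with Pattern.quote so that characters such as '.' or '|' are treated literally, and the "+" is applied to a non-capturing group around the quoted delimiter. It is not a drop-in replacement for the original method.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class RegexSplitSketch {
    // Hypothetical helper: same signature as the method in the question,
    // but delegating the work to String.split.
    static String[] split(String toSplit, char delim, boolean ignoreEmpty) {
        // Quote the delimiter so regex metacharacters are matched literally.
        String regex = Pattern.quote(String.valueOf(delim));
        if (ignoreEmpty) {
            // One or more consecutive delimiters act as a single separator.
            regex = "(?:" + regex + ")+";
        }
        return toSplit.split(regex);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a,,b,c", ',', true)));  // [a, b, c]
        System.out.println(Arrays.toString(split("a,,b,c", ',', false))); // [a, , b, c]
    }
}
```

The behaviour differs from the original in two ways worth knowing about: String.split still returns a leading empty string when the input starts with a delimiter, and it does not treat whitespace-only tokens as empty, whereas the original calls trim() before deciding.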
Is this directly from String.Split or is it something else? Because it seems to me there's a bug in the code (empty result added if left over at the end even when IgnoreEmpty is true)?
This code loops through a string, splits it into words by looking for a delimiter, and returns a string array containing all the words found.
In C# you could write the same code as:
Ok, before proceeding to the answer, I should point out that there are multiple issues with this code. Here goes:
Issues:
- java.util.StringTokenizer
- A new StringBuffer is allocated for each token! It should simply reset the length of the StringBuffer
- The ifs inside the loop can be written more efficiently, or at least more readably
Probably all of these issues would be resolved by simply using the built-in java.util.StringTokenizer.
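As a rough illustration of the points above, here is a commented sketch of how the method could be tidied up. It is not the original poster's code: the class name CommentedSplit and the helper addToken are hypothetical, a List replaces the Stack, the buffer is reset with setLength(0) instead of being reallocated, and the trailing token now goes through the same ignoreEmpty check as the others (an assumption about the intended behaviour). The viaTokenizer method shows the java.util.StringTokenizer route, which always skips empty tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CommentedSplit {
    static String[] split(String toSplit, char delim, boolean ignoreEmpty) {
        StringBuilder buffer = new StringBuilder(); // characters of the token being built
        List<String> tokens = new ArrayList<>();    // finished tokens, in input order
        for (int i = 0; i < toSplit.length(); i++) {
            char c = toSplit.charAt(i);
            if (c != delim) {
                buffer.append(c);                   // still inside a token
            } else {
                addToken(tokens, buffer, ignoreEmpty); // delimiter ends the token
            }
        }
        // Flush whatever follows the last delimiter, if anything.
        if (buffer.length() != 0) {
            addToken(tokens, buffer, ignoreEmpty);
        }
        return tokens.toArray(new String[0]);
    }

    // Hypothetical helper: stores the buffered token unless it is blank and
    // ignoreEmpty is set, then empties the buffer for reuse.
    private static void addToken(List<String> tokens, StringBuilder buffer, boolean ignoreEmpty) {
        String token = buffer.toString();
        if (!(ignoreEmpty && token.trim().isEmpty())) {
            tokens.add(token);
        }
        buffer.setLength(0); // reset instead of allocating a new buffer
    }

    // The built-in alternative: StringTokenizer never returns empty tokens.
    static String[] viaTokenizer(String toSplit, char delim) {
        StringTokenizer st = new StringTokenizer(toSplit, String.valueOf(delim));
        String[] out = new String[st.countTokens()];
        for (int i = 0; i < out.length; i++) {
            out[i] = st.nextToken();
        }
        return out;
    }
}
```

Collecting into a List in input order also removes the need for the reverse-filling loop that the Stack version requires.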
This piece of code splits a string into substrings based on a given delimiter. For example, the string:
would get returned as this array of strings:
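To make the input/output relationship concrete, here is a small runnable sketch. The input string is a made-up illustration, not the example from the answer, and the class name SplitDemo is hypothetical; the split method carries the same logic as the one in the question, with generics added so it compiles without warnings.

```java
import java.util.Arrays;
import java.util.Stack;

class SplitDemo {
    // Same logic as the method in the question.
    static String[] split(String toSplit, char delim, boolean ignoreEmpty) {
        StringBuffer buffer = new StringBuffer();
        Stack<String> stringStack = new Stack<>();
        for (int i = 0; i < toSplit.length(); i++) {
            if (toSplit.charAt(i) != delim) {
                buffer.append(toSplit.charAt(i));
            } else {
                // Keep the token unless it is blank and blanks are ignored.
                if (!(buffer.toString().trim().length() == 0 && ignoreEmpty)) {
                    stringStack.addElement(buffer.toString());
                }
                buffer = new StringBuffer();
            }
        }
        if (buffer.length() != 0) {
            stringStack.addElement(buffer.toString());
        }
        // Popping yields tokens last-first, so fill the array from the end.
        String[] split = new String[stringStack.size()];
        for (int i = 0; i < split.length; i++) {
            split[split.length - 1 - i] = stringStack.pop();
        }
        return split;
    }

    public static void main(String[] args) {
        // Note the two consecutive spaces between "two" and "three".
        System.out.println(Arrays.toString(split("one two  three", ' ', true)));  // [one, two, three]
        System.out.println(Arrays.toString(split("one two  three", ' ', false))); // [one, two, , three]
    }
}
```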