Replacing StringTokenizer with String.split(..)

Posted on 2024-11-05 23:15:38

Is it possible to build a regex for use with Java's Pattern.split(..) method that reproduces the behaviour of StringTokenizer("...", "...", true)?

That is, the input should be split into an alternating sequence of the predefined delimiter characters and the arbitrary strings that run between them.
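
For reference, the behaviour in question can be seen with a small sketch like the one below. The class and method names (ReturnDelimsDemo, tokenize) are made up for illustration; only StringTokenizer and its three-argument constructor come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {

    // Collects everything StringTokenizer yields when returnDelims is true.
    static List<String> tokenize(String input, String delimiterChars) {
        StringTokenizer tokenizer = new StringTokenizer(input, delimiterChars, true);
        List<String> parts = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            parts.add(tokenizer.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Each delimiter character comes back as its own one-character token,
        // so "a b\tc" yields: "a", " ", "b", "\t", "c".
        System.out.println(tokenize("a b\tc", " \t").size()); // prints 5
    }
}
```

Note that adjacent delimiters are returned one by one and no empty tokens appear between them, which any replacement would have to match.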

The JRE reference says that StringTokenizer should be considered deprecated (it calls it a legacy class whose use is discouraged in new code) and that String.split(..) can be used instead. So it is apparently considered possible.

The reason I want to use split is that regular expressions are often highly optimized. StringTokenizer, for example, is quite slow on the Android platform's VM, while regex patterns seem to be executed by optimized native code there.

Comments (3)

贵在坚持 2024-11-12 23:15:38

Considering that the documentation for split doesn't specify this behavior, and that its only optional parameter limits how large the resulting array can be: no, you can't.

Also, the only other class I can think of that might have this feature, Scanner, doesn't have it either. So I think the easiest option is to keep using StringTokenizer, even if it's deprecated. That is better than writing your own class: while that shouldn't be too hard (quite trivial, really), I can think of better ways to spend one's time.
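
One caveat worth adding here: split throws away whatever the pattern consumes, but a pattern made only of zero-width lookarounds consumes nothing, so the delimiters stay in the result as their own elements. The following is a minimal sketch under some assumptions, not a drop-in replacement: the delimiters are the single characters space and tab, the class and method names are hypothetical, and the handling of a leading delimiter relies on the Java 8+ rule that a zero-width match at the start of the input produces no empty leading element.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LookaroundSplitDemo {

    // Assumption: the delimiters are single characters (space and tab here).
    // The pattern matches the empty position right after a delimiter (lookbehind)
    // or right before one (lookahead), so no character is ever consumed.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[ \t])|(?=[ \t])");

    static String[] splitKeepingDelimiters(String input) {
        return BOUNDARY.split(input);
    }

    public static void main(String[] args) {
        // Expected elements for this input: "a", " ", "b", a tab, "c".
        System.out.println(Arrays.toString(splitKeepingDelimiters("a b\tc")));
    }
}
```

One difference from StringTokenizer remains: for an empty input, split returns a one-element array containing the empty string, whereas the tokenizer yields no tokens at all.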

白云不回头 2024-11-12 23:15:38

A regex Pattern can help you:

Pattern p = Pattern.compile("(.*?)(\\s+)");
// put the boundary regex inside the second pair of brackets (where the \\s+ now is);
// it must not be able to match the empty string, or the loop below never advances
Matcher m = p.matcher(string);
int endindex = 0;
while (m.find()) {
    // m.group(1) is the text before the delimiter (possibly empty)
    // m.group(2) is the delimiter that was matched
    endindex = m.end();
}
// the remainder of the string is then string.substring(endindex)
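
To make the fragment above concrete, here is a self-contained sketch of the same Matcher loop. It is a variation rather than the exact code above: the pattern matches only the delimiter, and the text between delimiters is taken with substring. The names MatcherTokenizerDemo and tokenize are hypothetical, and the delimiter regex is assumed never to match the empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherTokenizerDemo {

    // Assumption: delimiterRegex never matches the empty string.
    static List<String> tokenize(String input, String delimiterRegex) {
        Matcher m = Pattern.compile(delimiterRegex).matcher(input);
        List<String> parts = new ArrayList<>();
        int end = 0; // end of the previous delimiter match
        while (m.find()) {
            if (m.start() > end) {
                parts.add(input.substring(end, m.start())); // text before the delimiter
            }
            parts.add(m.group()); // the delimiter itself
            end = m.end();
        }
        if (end < input.length()) {
            parts.add(input.substring(end)); // text after the last delimiter
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a b\tc", "[ \t]").size()); // prints 5
    }
}
```

Empty stretches between adjacent delimiters are skipped, which is how StringTokenizer behaves when it returns delimiters.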

反话 2024-11-12 23:15:38

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Splitter {

    private final String string;
    private final Pattern pattern;

    // delimiters is a regex; it must not be able to match the empty string
    public Splitter(String s, String delimiters) {
        this.string = s;
        this.pattern = Pattern.compile(delimiters);
    }

    // Returns the tokens and the delimiters between them, in input order.
    public String[] split() {
        // limit -1 keeps trailing empty strings, so strs always has
        // exactly one more element than delims
        String[] strs = pattern.split(string, -1);
        String[] delims = delimiters();
        assert strs.length == delims.length + 1;
        List<String> output = new ArrayList<String>();
        for (int i = 0; i < strs.length; i++) {
            // skip empty tokens (e.g. between adjacent delimiters), as StringTokenizer does
            if (!strs[i].isEmpty()) {
                output.add(strs[i]);
            }
            if (i < delims.length) {
                output.add(delims[i]);
            }
        }
        return output.toArray(new String[0]);
    }

    // Collects every delimiter match; a fresh Matcher makes repeated calls safe.
    private String[] delimiters() {
        List<String> delims = new ArrayList<String>();
        Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            delims.add(matcher.group());
        }
        return delims.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Splitter s = new Splitter("a b\tc", "[ \t]");
        String[] tokensanddelims = s.split();
        assert tokensanddelims.length == 5;
        System.out.print(tokensanddelims[0].equals("a"));
        System.out.print(tokensanddelims[1].equals(" "));
        System.out.print(tokensanddelims[2].equals("b"));
        System.out.print(tokensanddelims[3].equals("\t"));
        System.out.print(tokensanddelims[4].equals("c"));
    }
}