Get substrings between tags

Posted 2024-11-25 21:37:49

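I've read a few questions about parsing a string by tags, but none of them answers my specific problem. The problem: I have one long line of text, and I need to split it into several strings based on a tag. For example: I find [tag], then read the text up to the next [tag] and store it in a new string. Then I read the text up to the next occurrence of the same [tag] and store that in another new string, and so on.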

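Example: [tag] Lorem Ipsum [tag] is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. [tag] It has [tag] survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.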

From this text I want three strings: "Lorem Ipsum", "It has", and the text between them.


另类 2024-12-02 21:37:49

String txt = "[tag] Lorem Ipsum [tag] is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. [tag] It has [tag] survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";

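int index = -1; // position of the most recent [tag], or -1 before the first one is found
while (true)
{
    int i = txt.indexOf("[tag]", index + 1);
    if (i == -1) break; // no more tags
    if (index != -1)
    {
        // Print the text between the previous tag and this one (5 is the length of "[tag]").
        System.out.println(txt.substring(index + 5, i));
    }
    index = i;
}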
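
The same loop can be wrapped in a reusable method that collects the segments instead of printing them. The following is only a sketch: the class name `TagSegments`, the method `between`, and the shortened sample text are made up for illustration, and it assumes that text after the last tag should be ignored.

```java
import java.util.ArrayList;
import java.util.List;

class TagSegments {
    // Returns every piece of text that sits between two consecutive occurrences of tag.
    static List<String> between(String text, String tag) {
        List<String> segments = new ArrayList<>();
        if (tag.isEmpty()) {
            return segments; // an empty tag would match everywhere; treat it as "no tags"
        }
        int open = text.indexOf(tag);
        while (open != -1) {
            int close = text.indexOf(tag, open + tag.length());
            if (close == -1) {
                break; // text after the last tag has no closing tag, so it is skipped
            }
            segments.add(text.substring(open + tag.length(), close));
            open = close; // the closing tag doubles as the next opening tag
        }
        return segments;
    }

    public static void main(String[] args) {
        String txt = "[tag] Lorem Ipsum [tag] is simply dummy text. [tag] It has [tag] survived.";
        // Prints 'Lorem Ipsum', 'is simply dummy text.' and 'It has', one per line.
        for (String s : between(txt, "[tag]")) {
            System.out.println("'" + s.trim() + "'");
        }
    }
}
```

Using `tag.length()` instead of the literal 5 lets the same method work with a tag of any length.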



牵强ㄟ 2024-12-02 21:37:49

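Regular expressions to the rescue! The lookahead (?=...) leaves the closing [tag] unconsumed, so it can also open the next match, and group(1) returns the text without the tags.

import java.util.LinkedList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// str holds the input text
LinkedList<String> matches = new LinkedList<String>();
Pattern pattern = Pattern.compile("\\[tag\\](.*?)(?=\\[tag\\])");
Matcher matcher = pattern.matcher(str);

while (matcher.find())
    matches.add(matcher.group(1));

Alternatively, you could just go through the String manually.

int start = str.indexOf("[tag]");

while (start != -1) {
    int end = str.indexOf("[tag]", start + 5); // 5 is the length of "[tag]"
    if (end == -1) break;                      // no closing tag left
    System.out.println(str.substring(start + 5, end));
    start = end;                               // the closing tag opens the next segment
}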
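
Whether the regular expression consumes the closing tag or only looks ahead for it changes the result, which the sketch below tries to show. The class `TagRegex` and its methods are hypothetical names chosen for this example; `Pattern.quote` is used so the tag does not need manual escaping, and `Pattern.DOTALL` is an added assumption so that segments may span line breaks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagRegex {
    // Collects capture group 1 of every match of regex in text.
    static List<String> captures(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    // Consumes both tags, so tags are used up in pairs: 1st with 2nd, 3rd with 4th, ...
    static List<String> paired(String text, String tag) {
        String t = Pattern.quote(tag);
        return captures(t + "(.*?)" + t, text);
    }

    // Only looks ahead for the closing tag, so it can open the next match as well.
    static List<String> consecutive(String text, String tag) {
        String t = Pattern.quote(tag);
        return captures(t + "(.*?)(?=" + t + ")", text);
    }

    public static void main(String[] args) {
        String txt = "[tag]a[tag]b[tag]c[tag]d";
        System.out.println(paired(txt, "[tag]"));      // [a, c]
        System.out.println(consecutive(txt, "[tag]")); // [a, b, c]
    }
}
```

For the three strings asked for in the question, the lookahead variant is the one that also returns the text between the second and third tag.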



_失温 2024-12-02 21:37:49

Use the split method of the String class. It expects a regular expression as its parameter, so the square brackets have to be escaped:

String allText = "some[tag]text[tag]separated[tag]by tags";
String[] textBetweenTags = allText.split("\\[tag\\]");
for (int i = 0; i < textBetweenTags.length; i++) {
    System.out.println(textBetweenTags[i]);
}
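
Note that split also returns the text before the first tag (an empty string when the input starts with a tag) and the text after the last tag. If only the text strictly between tags is wanted, one possible approach is to drop the first and last pieces. The sketch below assumes a non-empty tag; `TagSplit` and `between` are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TagSplit {
    // Returns only the pieces that have a tag on both sides.
    static List<String> between(String text, String tag) {
        // Limit -1 keeps trailing empty strings, so a text ending with the tag
        // still produces a final (empty) piece that can be dropped safely.
        String[] pieces = text.split(Pattern.quote(tag), -1);
        if (pieces.length < 3) {
            return new ArrayList<>(); // fewer than two tags: nothing is enclosed
        }
        return new ArrayList<>(Arrays.asList(pieces).subList(1, pieces.length - 1));
    }

    public static void main(String[] args) {
        String txt = "[tag] Lorem Ipsum [tag] is simply dummy text. [tag] It has [tag] survived.";
        // Prints the three enclosed pieces, each with its surrounding spaces.
        for (String s : between(txt, "[tag]")) {
            System.out.println("'" + s + "'");
        }
    }
}
```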

