Problem matching tokens in a line when using Scanner in Java

Posted 2024-07-29 23:47:15

I need to match certain things from lines of an input text. The lines look like this:

 to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}

I am using the Scanner class to read each line of the text, and I have written the following code. However, something is not working: the pattern "to" does not match the line, although it should, because the line contains "to". (I have also tried patterns other than "to", and nothing matches.)

 Scanner scanner = new Scanner(file);
 while(scanner.hasNext()) {
      String line = scanner.nextLine();
      System.out.println("line: " + line);
      Pattern p_pos = Pattern.compile("to");
      Matcher m_pos = p_pos.matcher(line);
      String match = m_pos.group(0);
      System.out.println("match: " + match);
      boolean b_pos = m_pos.matches();
      if(b_pos) {
          System.out.println(match);
      }
 }

Output:

line:    to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}
Exception in thread "main" java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:485)
    at lady.PhrasesFromFile.readFile(PhrasesFromFile.java:31)
    at lady.PhrasesFromFile.main(PhrasesFromFile.java:17)

I have one more question: how can I process the line so that I store everything from the beginning of the line up to the first "/" symbol? I couldn't find any method for that in the API. Is it possible? I basically want to walk through the line from left to right, store pieces of it in different variables, and then use the values of these variables. Since I do not know how many tokens there are before the first "/" symbol, I cannot simply call next() a fixed number of times.

Thank you in advance.

Comments (2)

一瞬间的火花 2024-08-05 23:47:15

.matches() tries to match the entire input string. Use .find() if you want to match a portion of the input string, or .lookingAt() if you want to match the beginning of the input string.

http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Matcher.html
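The difference between the three methods can be seen on the sample line from the question. This is only a minimal sketch: the class name `MatcherModes` and its helper methods are made up for illustration, and the Greek sigma is written as a Unicode escape so the source compiles regardless of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherModes {
    // Sample line from the question (note the leading space).
    static final String LINE = " to be/ \u03A3 _ \u03A3  [1pos, 1neg] {0=1, 2=1}";

    // matches(): true only if the ENTIRE input matches the pattern.
    static boolean wholeLine(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // lookingAt(): the match must start at index 0 but need not reach the end.
    static boolean atStart(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // find(): returns the first matching substring anywhere in the input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(wholeLine("to", LINE));    // false: the line is more than just "to"
        System.out.println(atStart("to", LINE));      // false: the line starts with a space
        System.out.println(atStart("\\s*to", LINE));  // true
        System.out.println(firstMatch("to", LINE));   // to
    }
}
```

For the loop in the question, `find()` is most likely the method that was intended.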

Also, if you expand your pattern to include capturing groups (see a general regex reference for more details on how groups work), you can use the .group() function after a successful match to retrieve the substring matched by a particular group within the pattern. Calling .group() before any match has succeeded throws IllegalStateException, which is the exception shown in the question.
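As one possible way to apply this to the second part of the question, a capturing group can collect everything from the start of the line up to the first "/". The class name `BeforeSlash`, the method `head`, and the pattern itself are assumptions made for this sketch, not something taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BeforeSlash {
    // Group 1 captures every character from the start of the line
    // up to (but not including) the first '/'.
    private static final Pattern HEAD = Pattern.compile("^([^/]*)/");

    // Returns the trimmed text before the first '/', or null if the line has no '/'.
    static String head(String line) {
        Matcher m = HEAD.matcher(line);
        if (!m.find()) {
            return null; // group() may only be called after a successful match
        }
        return m.group(1).trim();
    }

    public static void main(String[] args) {
        String line = " to be/ \u03A3 _ \u03A3  [1pos, 1neg] {0=1, 2=1}";
        System.out.println(head(line)); // to be
    }
}
```

If no regex is wanted at all, `line.indexOf('/')` together with `substring` would give the same prefix.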

他是夢罘是命 2024-08-05 23:47:15

You could extract the part you need for the tokens by using:

Matcher m = Pattern.compile("(to\\s+.*?)/").matcher(line);
String tokenSection = m.find() ? m.group(1) : null;  // find() returns a boolean, so group(1) is called on the Matcher

and then looping over that to extract the tokens using

Pattern.compile("\\w+").matcher(tokenSection).find();

Obviously, you wouldn't plug the above pieces of code right in.
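Put together, the two steps might look roughly like the following. This is a sketch under the assumption that the section of interest always starts with "to" as in the sample line; the class name `TokenLoop` and the method `tokensBeforeSlash` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenLoop {
    // Step 1 pattern: capture the section from "to" up to the first '/'.
    private static final Pattern SECTION = Pattern.compile("(to\\s+.*?)/");
    // Step 2 pattern: one token is a run of word characters.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Returns the word tokens found before the first '/', or an empty list
    // if the line does not contain the expected section.
    static List<String> tokensBeforeSlash(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher section = SECTION.matcher(line);
        if (!section.find()) {
            return tokens;
        }
        Matcher word = WORD.matcher(section.group(1));
        while (word.find()) {          // each call advances to the next token
            tokens.add(word.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = " to be/ \u03A3 _ \u03A3  [1pos, 1neg] {0=1, 2=1}";
        System.out.println(tokensBeforeSlash(line)); // [to, be]
    }
}
```

Because the tokens end up in a list, the number of tokens before the "/" does not need to be known in advance.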
