Creating a Java program to search a file for a specific word
I am just learning the language and was wondering what a more experienced Java programmer would do in the following situation.
I would like to create a Java program that searches a specified file for all instances of a specific word.
How would you go about this? Does the Java API come with a class that provides file-scanning capabilities, or would I have to write my own class to do this?
Thanks for any input,
Dom.
Comments (3)
The Java API does offer the java.util.Scanner class, which will allow you to scan across an input file. Depending on how you intend to use it, however, this might not be the best idea. Is the file very large? Are you searching only one file, or are you trying to keep a database of many files and search for files within that? In the latter case, you might want to use a more fleshed-out engine such as Lucene.
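
For the simple single-file case, a minimal sketch of the Scanner approach might look like the following. The class name WordCount and the method countWord are made up for illustration, and the sketch assumes that a "word" is any run of letters, digits, or underscores and that matching should ignore case.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class WordCount {
    // Counts tokens equal to `word` (ignoring case) in the scanner's input.
    // Assumption: words are separated by one or more non-word characters.
    static int countWord(Scanner in, String word) {
        in.useDelimiter("\\W+");
        int count = 0;
        while (in.hasNext()) {
            if (in.next().equalsIgnoreCase(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length < 2) {
            System.out.println("usage: java WordCount <file> <word>");
            return;
        }
        // try-with-resources closes the scanner and the underlying file
        try (Scanner in = new Scanner(new File(args[0]))) {
            System.out.println(countWord(in, args[1]));
        }
    }
}
```

Because the scanner reads the file token by token, this does not need to hold the whole file in memory at once.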
Unless the file is very large, I would
To find all the text between occurrences of your word, you can use split() and use the lengths of the resulting strings to determine each position.
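
The split() idea can be sketched roughly as follows, assuming the file is small enough to read into a single String and that the search word is non-empty. The names SplitSearch and positions are hypothetical. Note that this matches the word as a plain substring and is case-sensitive, so "java" is also found inside "javascript".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SplitSearch {
    // Returns the start index of every occurrence of `word` in `text`.
    static List<Integer> positions(String text, String word) {
        // Pattern.quote makes split() treat the word literally;
        // the limit -1 keeps trailing empty pieces so no match is lost.
        String[] parts = text.split(Pattern.quote(word), -1);
        List<Integer> result = new ArrayList<>();
        int offset = 0;
        // Every piece except the last one is followed by one occurrence.
        for (int i = 0; i < parts.length - 1; i++) {
            offset += parts[i].length();
            result.add(offset);
            offset += word.length();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java SplitSearch <file> <word>");
            return;
        }
        // Assumption: the file is UTF-8 text and fits in memory.
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(positions(text, args[1]));
    }
}
```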
As others have pointed out, you could use the Scanner class.
I put your question in a file, data.txt, and ran the following program:
The output is:
The pattern searched for, (?i)\bjava\b, means the following:
(?i) turns on the case-insensitive switch
\b means a word boundary
java is the string searched for
\b is a word boundary again
If the search term comes from the user, or may for some other reason contain special characters, I suggest you put \Q and \E around the string, as they quote all characters in between (and if you're really picky, make sure the input doesn't contain \E itself).
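
A minimal sketch of this kind of search is shown below. It is not necessarily the program this answer ran; the class RegexWordSearch and the method findMatches are hypothetical names. It assumes the search term starts and ends with a word character, since \b would otherwise not behave as a word boundary. Instead of adding \Q and \E by hand, the sketch uses Pattern.quote, which produces a literal pattern for the string and also copes with a term that itself contains \E.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class RegexWordSearch {
    // Returns every whole-word, case-insensitive match of `term`
    // in the scanner's input, in the order the matches appear.
    static List<String> findMatches(Scanner in, String term) {
        // Equivalent in spirit to (?i)\b\Qterm\E\b
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        List<String> matches = new ArrayList<>();
        String hit;
        // A horizon of 0 means "search the rest of the input";
        // findWithinHorizon returns null when nothing more is found.
        while ((hit = in.findWithinHorizon(pattern, 0)) != null) {
            matches.add(hit);
        }
        return matches;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length < 2) {
            System.out.println("usage: java RegexWordSearch <file> <term>");
            return;
        }
        try (Scanner in = new Scanner(new File(args[0]))) {
            List<String> matches = findMatches(in, args[1]);
            System.out.println(matches.size() + " match(es): " + matches);
        }
    }
}
```

With the word boundaries in place, a term such as java matches "Java" and "java" but not the "java" inside "javascript".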