How to get the filename if a particular regexp is found in a file, in Perl
In a directory containing many files, if a particular regexp (or format) is found in a file, I want to get that file's name.
Example:
- If the word "rama" is found in a file called ramayana, print the filename "ramayana".
- If a file called table contains a particular format, something like "(TEXT - NUMBERS)", print the filename "table".
In the shell this was pretty easy, something like:
grep "mytext" * | cut -d':' -f1 | uniq
But how do I do it in Perl?
Please suggest any particular CPAN module that helps with this.
Thanks
Comments (2)
Your examples sound way more complicated than your shell one-liner. This is equivalent to your one-liner in Perl:
-n opens the files and reads their content line by line.
-l adds a newline to print (for convenience).
If the text is found, we close the file handle to avoid printing multiple matches. It will be opened again if there are more files to process.
My first attempt was:
but that would print a filename several times if the pattern matched more than once in a file. To stop showing the dupes I tried to use 'last' to break out of the implicit loop, but it didn't seem to work. So the top example puts the filename into a hash whenever it finds a match (if (/PATTERNTOMATCH...) and then, in the END block (which is at the beginning!), prints the keys of the hash (to remove duplicate filenames).
It's a bit horrid, I'm afraid. I'd stick with the CPAN module mentioned by DavidO above.