How to match string order between two documents in Perl?
I'm having a problem writing a Perl program that matches the words in two documents. Let's say there are documents A and B.
I want to delete the words in document A that are not in document B.
Example 1:
A: I eat pizza
B: She go to the market and eat pizza
Result: eat pizza
Example 2:
A: eat pizza
B: pizza eat
Result: pizza
(Word order is relevant, so "eat" is deleted.)
I use Perl for the system, and there are not many sentences in each document, so I don't think I will use SQL.
The program is a subprogram for automatic essay grading for the Indonesian language (Bahasa).
Thanks, and sorry if my question is a bit confusing. I'm really new to 'this world' :)
OK, I'm without access at the moment, so this is not guaranteed to be 100% correct or even to compile, but it should provide enough guidance:
Solution 1: (word order does not matter)
This should create a new file "A_new" that contains only those words of A that are also in B.
This has a slight bug: it replaces any run of multiple whitespace characters in file A with a single space, so words separated by several spaces come out separated by exactly one.
It can be fixed, but doing so would be really annoying, so I didn't bother unless you absolutely require that whitespace be preserved 100% correctly.
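A minimal sketch of the hash-lookup idea behind Solution 1, not the answerer's original listing. The sub name `filter_words` is made up for illustration, and it works on strings rather than on the files A and B; it also assumes words are whitespace-separated and compared case-sensitively.

```perl
use strict;
use warnings;

# Keep only the words of $text_a that occur anywhere in $text_b.
# Word order in B is ignored. split ' ' collapses runs of whitespace,
# which is the whitespace caveat described above.
sub filter_words {
    my ($text_a, $text_b) = @_;
    my %in_b = map { $_ => 1 } split ' ', $text_b;   # set of B's words
    return join ' ', grep { $in_b{$_} } split ' ', $text_a;
}

print filter_words('I eat pizza', 'She go to the market and eat pizza'), "\n";  # eat pizza
print filter_words('eat pizza', 'pizza eat'), "\n";                             # eat pizza
```

To work on files, one option is to read B once to build the hash, then run each line of A through the same filter and print the result to A_new. Note that the second call keeps both words, which is why the asker's second example needs an order-aware approach.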
Solution 2: (word order matters, BUT the words from file A can be printed with no regard for preserving whitespace at all)
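One possible reading of "word order matters", sketched under assumptions: walk B's words in order and keep a word only if it appears in A at or after the position of the previous match. This is not the answerer's original code; the sub name `ordered_common_words` is hypothetical, and the greedy scan was chosen because it reproduces both examples from the question. It is not a longest-common-subsequence search, which may be a better fit if the longest possible match is wanted.

```perl
use strict;
use warnings;

# Walk B's words in order, keeping a cursor into A's word list.
# A word is kept only when it occurs in A at or after the cursor,
# so matched words appear in the same relative order in both texts.
sub ordered_common_words {
    my ($text_a, $text_b) = @_;
    my @a = split ' ', $text_a;
    my @kept;
    my $cursor = 0;
    for my $word (split ' ', $text_b) {
        for my $i ($cursor .. $#a) {
            if ($a[$i] eq $word) {
                push @kept, $word;
                $cursor = $i + 1;   # later matches must come after this one
                last;
            }
        }
    }
    return join ' ', @kept;
}

print ordered_common_words('I eat pizza', 'She go to the market and eat pizza'), "\n";  # eat pizza
print ordered_common_words('eat pizza', 'pizza eat'), "\n";                             # pizza
```

The inner loop rescans A from the cursor for every word of B, which is fine for the small documents described in the question.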
Solution 3 (why do we need Perl again? :) )
You can do this trivially in the shell without Perl (or via a system() call or backticks in a parent Perl script)
To call this from Perl:
But see my last comment on why this may be considered "bad Perl", at least if you do it in a loop over very many files and care about performance.