Are regexec and regcomp more efficient than doing strncmp myself?
I have a string like this:
I am down in the town seeing a crown="larry" with a cherry="red"
I want to write a program that asks the user what she wants. If she requests the string that has "larry" as the crown and "red" as the cherry, I need to return that string.
Okay, I am oversimplifying the problem here. There can be many such strings, and I need to parse through them and return all that match.
Question: is using regcomp and regexec more efficient than breaking down the string and doing strncmp?
PS: It seems that regexec would need to do some sort of comparison internally, and those comparisons would have been designed to be very efficient.
Comments (2)
I think strncmp() is simply the wrong tool for the job; if you'd said strstr(), there might have been room for discussion. You can't use strncmp() easily because you first have to find a position at which to start comparing.
If you used strstr(), you'd be looking for strings such as crown="larry" and cherry="red".
If you use a regex, you have to compile it and then run it. If you are searching for the two strings, you need two regexes, unless you want to write a contorted regex. I think that for simple comparisons where you need both of the strings above in either order, you might find two calls to strstr() quicker than one or two regexes.
It is worth measuring the difference, though. It may depend on the implementation of strstr(); some are very good. So, run measurements on the platforms you are concerned with, and choose whichever works better for you.
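As a rough illustration of the two-strstr() idea above, here is a minimal sketch. The function name contains_all and the way the needles are passed in are assumptions made for this example, not something from the question; it only checks that every literal substring occurs somewhere in the line, in any order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `line` contains every one of the
 * `n` needles, in any order. Each needle is a literal substring such as
 * crown="larry" (written as a C string literal with escaped quotes). */
bool contains_all(const char *line, const char *const needles[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strstr(line, needles[i]) == NULL)
            return false;   /* one needle is missing, so the line does not match */
    }
    return true;
}
```

Calling it with the needles "crown=\"larry\"" and "cherry=\"red\"" on each candidate line keeps only the lines that contain both. Note that a plain substring search would also accept a longer key ending in "crown", so a stricter check may be needed if such keys can occur.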

Since you are probably compiling a new regex each time you call regexec(), that will probably be a bit slower than using strncmp() to check for the keyword, e.g. "crown=", and then checking whether the value is "\"larry\"".
I assume you could build a system that parses the keywords and values beforehand and keeps some kind of list, dictionary, or similar structure pointing to the string, or vice versa (each string is associated with a set of keyword="value" combinations). That could be done once, and would make the work during the search easier.
But I don't know enough about your goals and your existing code to know whether that makes sense for your situation.
In other words, you would have to profile this to be sure, but I guess that strncmp() would perform better than the regcomp() and regexec() combination. Regular expressions are, of course, far more flexible, but I don't think you need that here.
Addition
Assuming that '=' is not a character that occurs very often in your lines, you can of course use strchr() to find each occurrence of '=' in the string, and then check whether the next character is '"'. Then you can scan backward to see whether the key matches. strchr() is very likely a lot faster than strncmp().
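To make the strchr() idea in the addition concrete, here is a minimal sketch. The function name has_pair and its parameters are assumptions for illustration. It also assumes that a key is preceded by a space or starts the line, and that values contain no embedded double quotes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `line` contains key="value".
 * It jumps from '=' to '=' with strchr() and only then compares text. */
bool has_pair(const char *line, const char *key, const char *value)
{
    size_t klen = strlen(key);
    size_t vlen = strlen(value);

    for (const char *eq = strchr(line, '='); eq != NULL; eq = strchr(eq + 1, '=')) {
        /* The value must open with a double quote right after '='. */
        if (eq[1] != '"')
            continue;
        /* Scan backward: the klen characters before '=' must be the key. */
        if ((size_t)(eq - line) < klen)
            continue;
        const char *kstart = eq - klen;
        if (strncmp(kstart, key, klen) != 0)
            continue;
        /* Assumed rule: the key starts the line or follows a space,
         * so "rown" does not match inside "crown". */
        if (kstart != line && kstart[-1] != ' ')
            continue;
        /* Compare the value and require the closing quote after it. */
        const char *v = eq + 2;
        if (strncmp(v, value, vlen) == 0 && v[vlen] == '"')
            return true;
    }
    return false;
}
```

A line then matches the request when has_pair(line, "crown", "larry") and has_pair(line, "cherry", "red") are both true. Whether this actually beats strstr() or a precompiled regex still depends on the data and the platform, so it is worth profiling.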