如何将数组中的行与掩码数组相匹配?
我有这样的数组:
array('1224*', '543*', '321*' ...)
其中包含大约 17,00 个“掩码”或前缀。
我有第二个数组:
array('123456789', '123456788', '987654321' ....)
其中包含大约 250,000 个数字。
现在,如何使用掩码/前缀数组有效地匹配第二个数组中的每个数字?
[编辑]
第一个数组仅包含前缀,每个条目末尾只有一个 *
。
I have array like this:
array('1224*', '543*', '321*' ...)
which contains about 17,00 "masks" or prefixes.
I have a second array:
array('123456789', '123456788', '987654321' ....)
which contain about 250,000 numbers.
Now, how can I efficiently match every number from the second array using the array of masks/prefixes?
[EDIT]
The first array contains only prefixes and every entry has only one *
at the end.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
嗯,这是一个解决方案:
初步步骤:
*
的部分。搜索:
number
匹配的条目(二分查找)。first
和last
之间搜索(二分搜索)。这应该是
O(k*n*log(n))
,其中n
是平均数字长度(以数字为单位),k
是数字数字。基本上,这是一个一维 Radix 树,为了获得最佳性能,您应该实现它,但它可以是相当难。
Well, here's a solution:
Prelimary steps:
*
's.Searching:
number
(binary search).first
andlast
(binary search).This should be
O(k*n*log(n))
wheren
is the average number length (in digits) andk
the number of numbers.Basically this is a 1 dimensional Radix tree, for optimal performance you should implement it, but it can be quite hard.
我的两分钱......
为了提高性能,首先对两个数组进行排序并在内部
foreach
循环中添加 exit 子句可能是个好主意。顺便说一句,
startWith
函数来自 SO 中的这个出色的解决方案: PHP 中的startsWith() 和endsWith() 函数My two cents....
To improve performance it might be a good idea to sort both arrays first and to add an exit clause in the inner
foreach
loop.By the way, the
startWith
-function is from this great solution in SO: startsWith() and endsWith() functions in PHP另一种选择是在循环中使用 preg_grep :
不知道这有多有效,只是将其作为替代方案提供。
Another option would to be use preg_grep in a loop:
No idea how efficient this would be, just offering it up as an alternative.
尽管正则表达式并不以速度快而闻名,但我想知道如果将模式简化为其最简洁的形式并且仅调用一次,则
preg_grep()
的执行效果如何(不在循环中)。通过去除被较短掩模覆盖的较长掩模,图案将大大缩小。会减少多少?当然,我不能肯定地说,但是有17,000个口罩,肯定会有相当多的冗余。
代码:(演示)
输出:
Although regex is not famous for being fast, I'd like to know how well
preg_grep()
can perform if the pattern is boiled down to its leanest form and only called once (not in a loop).By removing longer masks which are covered by shorter masks, the pattern will be greatly reduced. How much will the reduction be? of course, I cannot say for sure, but with 17,000 masks, there are sure to be a fair amount of redundancy.
Code: (Demo)
Output:
查看 PHP 函数 array_intersect_key。
Check out the PHP function array_intersect_key.