Find a value and its corresponding key from an array in a text file in PHP
I have a sizeable txt file (3.5 MB) structured like so:
sweep#1 expanse#1 0.375
loftiness#1 highness#2 0.375
lockstep#1 0.25
laziness#2 0.25
treponema#1 0.25
rhizopodan#1 rhizopod#1 0.25
plumy#3 feathery#3 feathered#1 -0.125
ruffled#2 frilly#1 frilled#1 -0.125
fringed#2 -0.125
inflamed#3 -0.125
inlaid#1 -0.125
Each word is followed by a #, an integer and then its "score." There are tab breaks in between the word and score. As of right now, the textfile is loaded as a string using file_get_contents().
From an array of strings made up of individual, lower-case, character-stripped words, I need to look up each value, find its corresponding score and add it to a running total.
I imagine I would need some form of regex to first find the word, continue to the next \t and then add the integer to a running total. What's the best way of going about this?
Comments (2)
Yes, there are probably better ways of doing this. But this is oh-so-simple:
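The code that followed this sentence is not preserved here. As a stand-in, here is a minimal sketch of one simple approach, which is not necessarily what the answerer posted: parse the text once into a word => score lookup array, then sum the scores of the words you need. The sample data, the variable names, and the choice to keep the first score seen for a word that appears with several sense numbers are all assumptions made for illustration.

```php
<?php
// Hypothetical sample standing in for the contents of the 3.5 MB file
// (in practice this would come from file_get_contents()).
$text = "sweep#1 expanse#1\t0.375\n"
      . "lockstep#1\t0.25\n"
      . "plumy#3 feathery#3 feathered#1\t-0.125\n";

// Build a word => score lookup table in a single pass over the lines.
$scores = [];
foreach (explode("\n", $text) as $line) {
    $line = trim($line);
    if ($line === '') {
        continue;
    }
    // Split on any whitespace: the last field is the score,
    // everything before it is a "word#n" entry.
    $fields = preg_split('/\s+/', $line);
    $score  = (float) array_pop($fields);
    foreach ($fields as $entry) {
        $word = strstr($entry, '#', true); // text before the '#'
        if ($word === false) {
            continue; // entry without '#', skip it
        }
        // Assumption: if a word occurs more than once, keep the first score.
        if (!isset($scores[$word])) {
            $scores[$word] = $score;
        }
    }
}

// Look up each word and add its score to a running total.
$words = ['sweep', 'feathery', 'missing'];
$total = 0.0;
foreach ($words as $w) {
    $total += $scores[$w] ?? 0.0; // unknown words contribute nothing (PHP 7+)
}
echo $total, "\n";
```

Building the array costs one pass over the file, after which each lookup is a plain array access, so this tends to suit the case where many words have to be scored against the same file.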
If you just need to find a word, then it's as simple as:
^ looks for the start of a line in /m mode, and # and \t are literal separators, while \d+ matches decimal digits. The result group [1] will be your float number. The $word needs escaping (preg_quote) in case it contains a / forward slash itself. To search for multiple words in one go, implode them as an alternatives list $word1|$word2|$word3, add a capture group, and use preg_match_all instead.
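A rough sketch of the multi-word variant described above might look like the following. The exact pattern is an assumption, since it is reconstructed only from the pieces the answer names (^, /m, #, \d+, \t, a capture group); the sample data and variable names are hypothetical. The pattern also allows further word#n entries between the matched word and the tab, and accepts a leading minus sign on the score.

```php
<?php
// Hypothetical sample standing in for the file contents.
$textfile = "sweep#1 expanse#1\t0.375\n"
          . "lockstep#1\t0.25\n"
          . "fringed#2\t-0.125\n";

$words = ['lockstep', 'fringed', 'sweep'];

// Escape every word; passing '/' as the second argument to preg_quote
// also escapes the pattern delimiter.
$alternatives = implode('|', array_map(
    function ($w) {
        return preg_quote($w, '/');
    },
    $words
));

// Group 1: the word at the start of a line (/m makes ^ match per line).
// Group 2: the score after the tab, optionally negative.
$pattern = '/^(' . $alternatives . ')#\d+[^\t\n]*\t(-?\d+(?:\.\d+)?)/m';

preg_match_all($pattern, $textfile, $matches, PREG_SET_ORDER);

// Add every matched score to the running total.
$total = 0.0;
foreach ($matches as $m) {
    $total += (float) $m[2];
}
echo $total, "\n";
```

Because the pattern is anchored with ^, it only finds a word that is the first entry on its line; expanse in the sample would not be matched. With a long word list the alternation can also get slow on a file of this size, in which case parsing the file into an array once may be the simpler route.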