Please help decipher this Lisp excerpt
(let ((g (* 2 (or (gethash word good) 0)))
      (b (or (gethash word bad) 0)))
  (unless (< (+ g b) 5)
    (max .01
         (min .99 (float (/ (min 1 (/ b nbad))
                            (+ (min 1 (/ g ngood))
                               (min 1 (/ b nbad)))))))))
What is the problem? It is almost plain English:
Let g be the value of word in the hash table good (or 0 if it is not present there) times 2, and let b be the value of word in the hash table bad (or 0 if it is not present there).
With this in mind, and provided that the sum of g and b is not smaller than 5, return the maximum of either 0.01 or the minimum of either 0.99 or b/nbad divided by the sum of b/nbad and g/ngood (as a float value, with each of those individual quotients capped at 1).
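For readers more comfortable outside Lisp, the reading above can be sketched in Python. This is only an illustrative rendering, not the original author's code: the function name word_score is made up here, good and bad are assumed to be plain dicts standing in for the hash tables, ngood and nbad are assumed to be positive numbers, and None stands in for the NIL that unless returns when its test is true.

```python
def word_score(word, good, bad, ngood, nbad):
    """Illustrative Python rendering of the Lisp expression (hypothetical name)."""
    g = 2 * good.get(word, 0)   # (* 2 (or (gethash word good) 0))
    b = bad.get(word, 0)        # (or (gethash word bad) 0)
    if g + b < 5:               # (unless (< (+ g b) 5) ...) yields NIL here
        return None
    bad_ratio = min(1, b / nbad)     # (min 1 (/ b nbad))
    good_ratio = min(1, g / ngood)   # (min 1 (/ g ngood))
    # (max .01 (min .99 ...)) clamps the quotient to [0.01, 0.99]
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


# Made-up sample data: a word seen 3 times in "good" and once in "bad".
print(word_score("hello", {"hello": 3}, {"hello": 1}, 10, 10))
```

One difference worth keeping in mind: Lisp computes the quotients as exact rationals and converts once with float, while this sketch uses Python floats throughout, so the last digits may differ slightly.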
It looks like it is trying to calculate a score based on the values stored for word in the hash tables good and bad.
If the word does not exist in a hash table it is given a value of 0; the value from the good table is weighted by 2 (doubled).
If the sum of the two values is at least 5, the score is calculated (the portion inside unless); otherwise the form returns NIL.
I'm not sure what ngood and nbad are, but the n suggests to me that they are probably counts. The result is clamped between 0.01 and 0.99. Each ratio in the score calculation is capped at 1, so the denominator is at most 2; when both ratios reach that cap the score is 0.5.
Based on the tags you've used, I would guess (and it is just a guess) that it is trying to calculate a weighting for word based on some kind of frequency counting of the word in good versus bad email.
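To check the threshold and the bounds concretely, here is a small Python probe. It is a sketch under assumptions: score_parts is a hypothetical helper that takes the raw table values directly, the corpus sizes of 10 are invented for the example, and None again stands in for NIL.

```python
def score_parts(good_count, bad_count, ngood, nbad):
    """Return (denominator, clamped score), or None below the threshold."""
    g, b = 2 * good_count, bad_count
    if g + b < 5:
        return None
    good_ratio = min(1, g / ngood)
    bad_ratio = min(1, b / nbad)
    denom = good_ratio + bad_ratio
    return denom, max(0.01, min(0.99, bad_ratio / denom))


NGOOD = NBAD = 10  # hypothetical corpus sizes

print(score_parts(1, 1, NGOOD, NBAD))    # 2*1 + 1 = 3 < 5, so no score at all
print(score_parts(50, 50, NGOOD, NBAD))  # both ratios capped at 1: denominator 2, score 0.5
print(score_parts(0, 8, NGOOD, NBAD))    # only in bad: raw quotient 1.0, clamped to 0.99
print(score_parts(8, 0, NGOOD, NBAD))    # only in good: raw quotient 0.0, clamped to 0.01
```

The four cases show that 5 is a threshold on the combined counts rather than a limit on the score, and that 0.5 is the value reached when both ratios are capped, not a lower bound.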