Writing a bookmarklet that counts word frequency

Posted on 2024-09-14 18:19:33

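I want to create a bookmarklet that counts all the text on a web page and then displays the results, from most frequent to least frequent, in an absolutely positioned div.

Every Google search I've done turns up discussions of counting the total number of words in a form, a textarea, or a div with a known id. That's not what I want. I want the number of times each word (each \w+ match) appears on the entire page.

I know enough JavaScript to know that I don't know how to do this.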

佞臣 2024-09-21 18:19:33

Something like this should work:

function countWordFrequency() {
  var freq = {};
  // Traverse the DOM looking for text nodes.
  recurseTextNodes(function (textNode) {
    // Strip punctuation, then split the remaining text on whitespace.
    var words = textNode.data.replace(/[^\w\s]/g, '').split(/\s+/),
        len = words.length;
    // Count the word frequency.
    for (var i = 0; i < len; i++) {
      var word = words[i];
      // split() yields empty strings for leading/trailing whitespace; skip them.
      if (word === '') continue;
      // A plain truthiness test (if (freq[word])) breaks on words such as
      // "constructor", which every object inherits, so check the type instead.
      if (typeof freq[word] === 'number') {
        freq[word] += 1;
      } else {
        freq[word] = 1;
      }
    }
  });
  return freq;
}
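
The function above calls a recurseTextNodes helper that is never defined. One possible way to fill it in is sketched below; treat it as an assumption rather than what the answer had in mind. The optional root parameter and the list of skipped elements are additions made here for illustration.

```javascript
// Hypothetical helper: the original answer leaves recurseTextNodes undefined.
// The optional `root` parameter is an assumption added here so the function
// can be pointed at a subtree; it falls back to document.body in a browser.
function recurseTextNodes(callback, root) {
  var TEXT_NODE = 3;     // same value as Node.TEXT_NODE
  var ELEMENT_NODE = 1;  // same value as Node.ELEMENT_NODE
  // Elements whose text is code or fallback markup rather than page content.
  var skip = { SCRIPT: true, STYLE: true, NOSCRIPT: true };

  function walk(node) {
    if (node.nodeType === TEXT_NODE) {
      callback(node);
      return;
    }
    if (node.nodeType === ELEMENT_NODE &&
        skip[String(node.nodeName).toUpperCase()] === true) {
      return;
    }
    // Recurse into children in document order.
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      walk(children[i]);
    }
  }

  walk(root || document.body);
}
```

Called with only a callback, as countWordFrequency does, it walks the whole document.body. It does not descend into iframes, and it still visits text that CSS hides.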

This solution is probably too simplistic in how it strips punctuation and splits words (for example, "don't" becomes "dont", and "Word" and "word" are counted separately), but it should demonstrate the idea. The recurseTextNodes function is left as an exercise for the reader =). How you package this routine as a bookmarklet, and especially how you display the results to the user, also needs some thought, but I'll assume you already have some idea of how to do that.
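
For the display step the question asks about, here is a minimal sketch under a few assumptions: sortByFrequency and showWordFrequency are hypothetical names, not part of the answer, and the styling values are arbitrary. The first function orders the counts from most to least frequent; the second puts them in an absolutely positioned div.

```javascript
// Hypothetical helper: turn the { word: count } map into [word, count] pairs
// sorted from most to least frequent, with ties broken alphabetically.
function sortByFrequency(freq) {
  var pairs = [];
  for (var word in freq) {
    if (Object.prototype.hasOwnProperty.call(freq, word)) {
      pairs.push([word, freq[word]]);
    }
  }
  pairs.sort(function (a, b) {
    if (b[1] !== a[1]) return b[1] - a[1];
    return a[0] < b[0] ? -1 : (a[0] > b[0] ? 1 : 0);
  });
  return pairs;
}

// Hypothetical helper: show the sorted pairs in an absolutely positioned div.
// Browser only; the styling values are arbitrary choices.
function showWordFrequency(pairs) {
  var box = document.createElement('div');
  box.style.cssText = 'position:absolute;top:10px;right:10px;z-index:2147483647;' +
    'max-height:80vh;overflow:auto;padding:8px;background:#fff;color:#000;' +
    'border:1px solid #000;font:12px monospace;white-space:pre;';
  var lines = [];
  for (var i = 0; i < pairs.length; i++) {
    lines.push(pairs[i][1] + '  ' + pairs[i][0]);
  }
  // Use a text node rather than innerHTML so page words are never parsed as markup.
  box.appendChild(document.createTextNode(lines.join('\n')));
  // Click the box to dismiss it.
  box.onclick = function () { box.parentNode.removeChild(box); };
  document.body.appendChild(box);
}
```

To use this as a bookmarklet, one common approach is to put these functions, countWordFrequency, and a recurseTextNodes implementation inside a single javascript:(function(){ ... })(); URL that ends by calling showWordFrequency(sortByFrequency(countWordFrequency())). The count runs before the div is added, so the results box does not count its own text.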
