Array of the top occurring words in a file separated by newlines

Posted on 2024-11-27 03:55:25

How would I get an array of the top occurring words out of a file separated by newlines (\n)?

Example file:

person
person
dog
cat
person
lemon
orange
person
cat
dog
dog

The words in the file are in no particular order.

How would I make it behave like the following?

echo $top[0]; //output: person
echo $top[1]; //output: dog
etc...

Thanks in advance!

Comments (2)

洛阳烟雨空心柳 2024-12-04 03:55:25
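// Strip the trailing "\n" from each line and skip blank lines,
// otherwise "dog\n" and a final "dog" are counted as different words.
$lines = file("theFile.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
var_dump(array_count_values($lines));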

http://php.net/array_count_values

Demo: http://ideone.com/zd82W

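array_count_values() keeps the words in order of first appearance, not by count. To get the word which occurs the most, sort the counts in descending order first and then read the first key:

$arr = array("person", "person", "cat", "dog", "cat");
$newArr = array_count_values($arr);
arsort($newArr); // highest count first, words kept as keys
// "person" and "cat" tie at 2; PHP 8+ sorts stably, so the first-seen word wins.
echo key($newArr); // "person"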

Demo: http://ideone.com/A0WPa
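
Putting these pieces together for the $top array the question asks for, a minimal sketch might look like the following. The helper name topWords is hypothetical (it is not from the answer above), the sample file is written to a temporary path only so the example runs on its own, and it assumes one word per line on PHP 7.4 or later.

```php
<?php
// Hypothetical helper: returns the words of a newline-separated file,
// most frequent first.
function topWords(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // file could not be read
    }
    // trim() also removes a stray "\r"; drop lines that end up empty.
    $words = array_filter(array_map('trim', $lines), fn($w) => $w !== '');
    $counts = array_count_values($words);
    arsort($counts); // highest count first, words kept as keys
    // Numeric-looking words become integer keys, so cast back to strings.
    return array_map('strval', array_keys($counts));
}

// Sample data from the question, written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'words');
$sample = ['person', 'person', 'dog', 'cat', 'person', 'lemon',
           'orange', 'person', 'cat', 'dog', 'dog'];
file_put_contents($path, implode("\n", $sample) . "\n");

$top = topWords($path);
unlink($path);

echo $top[0], "\n"; // person
echo $top[1], "\n"; // dog
```

Words with equal counts (here lemon and orange) have no meaningful order among themselves; only the ranking by count should be relied on.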

南烟 2024-12-04 03:55:25

I would probably use something like this:

  • read the file line by line, adding 1 to an array entry each time a word is seen, so that the array holds the number of occurrences of each word
  • sort that array.

Not really tested, but something like this should work, I suppose:

(should work better than array_count_values() on the result of file() if your file is big: no need to load the whole file into memory)

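$words = array();

$f = fopen('your_file', 'r');
// Compare against false so a final line of just "0" does not end the loop early.
while (($line = fgets($f)) !== false) {
    $word = trim($line);
    if ($word === '') {
        continue; // skip blank lines
    }
    if (isset($words[$word])) {
        $words[$word]++;
    }
    else {
        $words[$word] = 1;
    }
}
fclose($f);

// Sort by count, highest first, keeping the words as keys.
arsort($words);

Now, the first key in the $words array is the most used word, and the corresponding value is the number of times it has been seen in your file.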
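
For the large-file case, the same line-by-line loop can be wrapped so that only the N most frequent words come back. This is a sketch under assumptions: the function name topWordsStreaming and its $limit parameter are hypothetical, and the temporary sample file is there only to make the example runnable on its own.

```php
<?php
// Hypothetical helper: counts words while streaming the file, then
// returns the $limit most frequent ones as word => count.
function topWordsStreaming(string $path, int $limit = 10): array
{
    $f = fopen($path, 'r');
    if ($f === false) {
        return []; // file could not be opened
    }
    $counts = [];
    while (($line = fgets($f)) !== false) {
        $word = trim($line);
        if ($word === '') {
            continue; // skip blank lines
        }
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }
    fclose($f);

    arsort($counts); // highest count first, words kept as keys
    // preserve_keys = true keeps numeric-looking words as keys too.
    return array_slice($counts, 0, $limit, true);
}

// Sample data from the question, written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'words');
$sample = ['person', 'person', 'dog', 'cat', 'person', 'lemon',
           'orange', 'person', 'cat', 'dog', 'dog'];
file_put_contents($path, implode("\n", $sample) . "\n");

$topTwo = topWordsStreaming($path, 2);
unlink($path);

print_r($topTwo); // person => 4, dog => 3
```

Only one line is held in memory at a time, but the counts array still grows with the number of distinct words, so this helps most when the file is long and the vocabulary is small.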
