Using PHP to determine the top keywords in a MySQL table field

Posted on 2024-12-16 22:36:51

I have a table in MySQL with several different fields; one of them contains a description that could be a couple of paragraphs long.

I am trying to figure out a way to have PHP automatically go through these description fields and create a list of the top keywords used. I am looking for the top keywords for the entire table, not for each post individually.

I know this is a fairly resource-heavy operation, and it wouldn't be run very often anyway.

But I'd like to get a list like this:

some x 121
most x 110
frequent x 90
words x 50

That way I could see which words are used most in the description field. Any idea where to start?

Comments (3)

孤君无依 2024-12-23 22:36:51

You can run your query, then:

  1. Loop through the records and append the descriptions together into one big happy string.

  2. Then explode() that string by ' ' into an array.

  3. Get an array of word counts using array_count_values().

  4. Re-sort it in descending order with arsort().

Update

Sample code too:

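$string = '';
// $result_set is a placeholder for the rows returned by your query
foreach ($result_set as $one_row)
{
    // trailing space keeps the last word of one row from fusing with the next row's first word
    $string .= $one_row['text'] . ' ';
}

$data = explode(' ', trim($string));   // split into words
$data = array_count_values($data);     // word => number of occurrences
arsort($data);                         // most frequent first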
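
One thing to watch with a plain explode(' ', ...) is that "Words", "words" and "words," all end up as different keys. Here is a rough sketch of the same steps with lowercasing and punctuation stripping added, printing the "word x count" list from the question. The function name topKeywords() and the sample rows are made up for illustration, and the text is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: count words across many descriptions and keep the top $limit.
function topKeywords(iterable $descriptions, int $limit = 10): array
{
    $counts = [];
    foreach ($descriptions as $text) {
        // Lowercase, then split on anything that is not a letter, digit or apostrophe.
        // preg_split() returns false on invalid UTF-8, hence the "?: []" fallback.
        $words = preg_split("/[^\p{L}\p{N}']+/u", strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
        foreach (array_count_values($words) as $word => $n) {
            $counts[$word] = ($counts[$word] ?? 0) + $n;
        }
    }
    arsort($counts);                              // highest count first, keys preserved
    return array_slice($counts, 0, $limit, true); // keep only the top $limit entries
}

// Sample data standing in for the description column.
$rows = ['Some words, some more words.', 'Most frequent words'];
foreach (topKeywords($rows, 3) as $word => $n) {
    echo "$word x $n\n";
}
```

Counting row by row like this also avoids building one huge string in memory, which may matter if the table is large.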
小嗲 2024-12-23 22:36:51

If you have control over the database, one way would be to add triggers to this table that maintain another table holding all keywords and their counts.

The insert trigger would go through new.description and increment the count of every keyword found.

The delete trigger would do the same for old.description, but decrement the counts.

The update trigger would do what delete and insert do together, i.e. decrement every keyword found in old.description and increment every keyword found in new.description.

Once you have written and tested these triggers, dump all the data and re-import it so that the triggers do the work for all existing rows.
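
The triggers themselves would be written in SQL, but the bookkeeping they have to do can be sketched in PHP to show the idea. This is only an illustration under assumptions: tokenize(), keywordDeltas() and applyDeltas() are invented names, the keyword table is simulated by a plain array, and words are split on anything that is not an ASCII letter or digit.

```php
<?php
// Hypothetical helper: split a description into lowercase words.
function tokenize(string $text): array
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Net change per keyword when a description goes from $old to $new.
// Insert: pass '' as $old. Delete: pass '' as $new. Update: pass both.
function keywordDeltas(string $old, string $new): array
{
    $deltas = [];
    foreach (tokenize($old) as $word) {
        $deltas[$word] = ($deltas[$word] ?? 0) - 1; // the delete side decrements
    }
    foreach (tokenize($new) as $word) {
        $deltas[$word] = ($deltas[$word] ?? 0) + 1; // the insert side increments
    }
    return array_filter($deltas);                   // drop words whose net change is 0
}

// Apply the deltas to an in-memory stand-in for the keyword table.
function applyDeltas(array $keywordTable, array $deltas): array
{
    foreach ($deltas as $word => $delta) {
        $keywordTable[$word] = ($keywordTable[$word] ?? 0) + $delta;
        if ($keywordTable[$word] <= 0) {
            unset($keywordTable[$word]);            // no occurrences left
        }
    }
    return $keywordTable;
}

// An update from one description to another only touches the words that changed.
print_r(keywordDeltas('red fast car', 'red slow car car'));
```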

把梦留给海 2024-12-23 22:36:51

There are a few ways you can do this. I'm guessing you don't want to count every word, as words like and, if, it etc. will all be meaningless.

Also, how many rows are we talking about?

A simple solution is to create an array called words and loop through each row.
Explode the paragraph using " ", which gives you each word. You may also wish to run strtolower() first if case is an issue.

Loop through the words and use array_key_exists() to see whether the word already has a key. If not, create it
with a value of one; otherwise increment the value by one.

This will give you counts of each word.
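
Roughly, the loop described above could look like the sketch below. The $rows array stands in for whatever your query returns, and the 'description' column name is an assumption.

```php
<?php
// Stand-in for rows fetched from the table; the 'description' key is an assumption.
$rows = [
    ['description' => 'PHP and MySQL'],
    ['description' => 'php and more PHP'],
];

$words = [];
foreach ($rows as $row) {
    // Lowercase first so "PHP" and "php" count as the same word.
    foreach (explode(' ', strtolower($row['description'])) as $word) {
        if ($word === '') {
            continue;                  // consecutive spaces yield empty strings
        }
        if (!array_key_exists($word, $words)) {
            $words[$word] = 1;         // first time this word is seen
        } else {
            $words[$word]++;           // seen before: add one
        }
    }
}
arsort($words);                        // most frequent first

foreach ($words as $word => $count) {
    echo "$word x $count\n";
}
```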

If this is for searching a large database, it would be worthwhile adding keywords to a separate table on insert.
One way I think this could work well is to add the 5 most frequently used words, excluding those in an exclude list (and, it, or, a, i etc.), and also add any word that already appears in the keyword table.
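
As a sketch of the "5 most frequent words minus an exclude list" part only (matching against the existing keyword table is left out), something like this might do. keywordsForPost() is a made-up name and the stop-word list is just the examples from above.

```php
<?php
// Hypothetical helper: the $limit most frequent words in one description,
// skipping anything in $stopWords.
function keywordsForPost(string $description, array $stopWords, int $limit = 5): array
{
    // Lowercase and split on anything that is not an ASCII letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($description), -1, PREG_SPLIT_NO_EMPTY);
    $words = array_diff($words, $stopWords);  // drop excluded words
    $counts = array_count_values($words);     // word => occurrences
    arsort($counts);                          // most frequent first
    return array_keys(array_slice($counts, 0, $limit, true));
}

$stopWords = ['and', 'it', 'or', 'a', 'i'];   // extend as needed
print_r(keywordsForPost('I like PHP and I like MySQL, PHP or not', $stopWords));
```

The returned words are what you would then write to the separate keyword table when the row is inserted.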

There are issues with this, though. Take this response: it doesn't mention PHP, SQL or query, which are what the post relates to. Maybe it would be worth having tags/keywords added on insert.
