Extracting keywords and multi-word keywords from text - PHP
I was wondering if anyone knows the best way to pull out the top recurring keywords/phrases from a block of text in PHP.
I want to build my own tag cloud for an application I'm working on. The main tricky part would be pulling out 'multi-word' keywords such as "White House" and recognising them as one phrase rather than as two separate words.
There must be a bunch of scripts out there for this purpose, but I just can't seem to find any!
Appreciate your help!
Comments (1)
Here's a little chunk I used - it parses a comma-delimited string, and prints the size accordingly:
PHP
HTML
It's just something I've used, but I thought I'd share; maybe it helps you.
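For anyone reading without the snippet in front of them, here is a rough sketch of the idea described (parse a comma-delimited string, print each tag at a size based on how often it appears). This is not the poster's actual script: the function name `tagCloudHtml`, the pixel range, and the linear scaling are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds tag-cloud HTML from one comma-delimited string.
// Repeated tags count as extra weight; $minPx/$maxPx are assumed defaults.
function tagCloudHtml(string $csv, int $minPx = 12, int $maxPx = 32): string
{
    // Split on commas, trim whitespace, drop empty entries.
    $tags = array_filter(array_map('trim', explode(',', $csv)), 'strlen');
    if (!$tags) {
        return '';
    }

    // Count how often each tag appears (case-insensitive).
    $counts = array_count_values(array_map('strtolower', $tags));

    $low = min($counts);
    $high = max($counts);
    $spread = max(1, $high - $low); // avoid division by zero when all counts match

    $parts = [];
    foreach ($counts as $tag => $count) {
        // Linear scale: least frequent tag gets $minPx, most frequent gets $maxPx.
        $size = (int) round($minPx + ($count - $low) * ($maxPx - $minPx) / $spread);
        $parts[] = sprintf(
            '<span style="font-size: %dpx">%s</span>',
            $size,
            htmlspecialchars((string) $tag)
        );
    }
    return implode(' ', $parts);
}

echo tagCloudHtml('php, html, php, css, php, html'), "\n";
```

With that input, `php` (three occurrences) is printed at 32px, `html` at 22px and `css` at 12px. If every tag has the same count, they all come out at the minimum size.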
Edit: For two-word tags, you could just do something like "White-House" and then remove the dash when you're echoing. Just another thought.
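A minimal sketch of that dash trick, assuming the tags are stored in a comma-delimited string with dashes joining the words of a phrase. The sample data and the `displayTag` helper are made up for illustration.

```php
<?php
// Hypothetical stored tags: multi-word tags use a dash instead of a space,
// so each phrase survives as a single token when the string is split.
$stored = 'White-House, politics, White-House, Supreme-Court';

// Hypothetical helper: swap the dash for a space only at output time.
function displayTag(string $tag): string
{
    return htmlspecialchars(str_replace('-', ' ', trim($tag)));
}

// Print each distinct tag once, in display form.
foreach (array_unique(array_map('trim', explode(',', $stored))) as $tag) {
    echo displayTag($tag), "\n";
}
```

One caveat: any tag that legitimately contains a dash (say, "e-mail") would also lose it when echoed, so a less common separator such as an underscore might be a safer choice.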