Keyword/phrase density from a MySQL database
I have a LAMP setup, with the MySQL database essentially being a catalog of products. Since the database changes frequently as new products are added, it's cumbersome to manually maintain a list of keywords and popular phrases. The need for a keyword/phrase list is twofold: (1) for Google AdWords and other marketing initiatives, and (2) for link structure on my site.
I've been using the Zend Lucene port as the backbone for all searching on my site. Is it possible to do things like determine keyword density and/or phrase density using Lucene? What about another search engine?
For further clarity of what I'm looking for, let's say I have a catalog of laptops. I might have various models of Dell Inspiron, Dell Latitude, Macbook, Gateway, Lenovo, and Acer laptops. For a keyword density report, I'd like to see that the words "laptop" and "notebook" are popular, as well as perhaps "Dell Inspiron" or "Dell Inspiron laptops" or "Lenovo laptops."
Can anyone recommend something to get started? I'm sort of eyeing search engines like Lucene, Sphinx, and Solr, since they already index the data, but I don't know if I'm going down the wrong path.
Thanks!
Comments (1)
Lucene is capable of giving you a list of (keyword, frequency) pairs. See this question, or this blog post.
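To make the (keyword, frequency) idea concrete, here is a minimal sketch in plain Python of what such a report boils down to: counting one-, two-, and three-word phrases across product names. It does not use Lucene's API. The `products` list, the stopword set, and the function names are all made up for illustration; in a real setup the strings would come from rows in the MySQL catalog or from the terms the index already stores.

```python
import re
from collections import Counter

# Assumed stopword list for illustration; tune it to the catalog.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(texts, max_n=3):
    """Count every phrase of 1..max_n consecutive words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip phrases that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return counts


def density_report(texts, max_n=3, top=10):
    """Return (phrase, count, count / total words) for the most common phrases."""
    total_words = sum(len(tokenize(t)) for t in texts)
    if total_words == 0:
        return []
    counts = ngram_counts(texts, max_n)
    return [(p, c, c / total_words) for p, c in counts.most_common(top)]


# Hypothetical product names standing in for rows from the catalog table.
products = [
    "Dell Inspiron 15 laptop",
    "Dell Inspiron 17 laptop",
    "Dell Latitude notebook",
    "Lenovo ThinkPad laptop",
]

if __name__ == "__main__":
    for phrase, count, density in density_report(products):
        print(f"{phrase:25s} {count:3d} {density:.2%}")
```

Density here is simply the phrase count divided by the total number of words, which is one common definition but not the only one. For a large catalog, pulling term frequencies from the existing index is likely cheaper than rescanning every row like this.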