标签云算法的建议
我有一个 MSSQL 2005 表:
[Companies](
[CompanyID] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](128),
[Description] [nvarchar](256),
[Keywords] [nvarchar](256)
)
我想为这家公司生成一个标签云。但我已将所有关键字保存在一列中,并用逗号分隔。有关如何通过最常用的关键字生成标签云的任何建议。可能有数百万家公司,每个公司大约有 10 个关键字。
谢谢。
I have a MSSQL 2005 table:
[Companies](
[CompanyID] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](128),
[Description] [nvarchar](256),
[Keywords] [nvarchar](256)
)
I want to generate a tag cloud for this companies. But I've saved all keywords in one column separated by commas. Any suggestions for how to generate tag cloud by most used keywords. There could be millions of companies approx ten keywords per company.
Thank you.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
第一步:将关键词拆分成适当的关系(表)。
步骤 2:将公司和标签之间的多对多关系映射到单独的表中,就像所有多对多关系一样:
步骤 3:使用简单的 GROUP BY 查询来生成“云”(以“云”为例) cloud' 表示最常见的 100 个标签):
第 4 步:缓存结果,因为它很少更改且计算成本很高。
Step 1: separate the keywords into a proper relation (table).
Step 2: Map the many-to-many relation between companies and tags into a separate table, like all many-to-many relations:
Step 3: Use a simple GROUP BY query to generate the 'cloud' (by example taking the 'cloud' to mean the most common 100 tags):
Step 4: cache the result as it changes seldom and it computes expensively.
我更愿意看到您的设计按照Remus的建议进行标准化,但是如果您无法更改设计...
您可以使用解析函数(我将使用的示例取自 此处),解析您的关键字并对其进行计数。
I'd much rather see your design normalized as suggested by Remus, but if you're at a point where you can't change your design...
You can use a parsing function (the example I'll use is taken from here), to parse your keywords and count them.
莱姆斯和乔都是正确的,但正如乔所说,如果你没有选择,那么你就必须忍受它。我想我可以通过使用 XML 数据类型为您提供一个简单的解决方案。您已经可以通过执行此查询轻松查看已解析的列
,现在知道您可以做到这一点,您所要做的就是对它们进行分组并进行计数,但是您不能对 XML 方法进行分组,因此我的建议是创建上面查询的视图
并无论如何
,这里是完整的文章 -->http://anyrest.wordpress.com/2010/08/13/converting-parsing-delimited-string-column-in-sql-to-rows/
Both Remus and Joe are correct but yes as what Joe said if you dont have a choice then you have to live with it. I think I can offer you an easy solution by using an XML Data Type. You can already easily view the parsed column by doing this query
now knowing that you can do that, all you have to do is to group them and count, but you cannot group XML methods so my suggestion is create a view of the query above
and perform your count on that view
Anyways here is the full article -->http://anyrest.wordpress.com/2010/08/13/converting-parsing-delimited-string-column-in-sql-to-rows/