"Recommends" table structure

Posted on 2024-10-19 12:13:47

You know how people look for movies or TV shows that are similar or recommended?

http://www.anime-planet.com/anime/devil-may-cry

See how the bottom of the page lists recommended anime, and how the recommendations are interlinked: if you link A to B, then A also shows up on B's page. The links do not chain, though:
A -to- B
B -to- C
C -not- A

My question is: how are these entries best handled?

Listings_Table

  • list_id

  • list_title

  • list_content

Recommends_Table

  • list_id_A

  • list_id_B

I think this method would cause a lot of duplicates, though, and the queries would be a bit messy too. Any advice is appreciated.

Comments (2)

寂寞清仓 2024-10-26 12:13:47

What you are referring to is a relatively simple recommendation engine. It would work fine for a small dataset where you assign the recommendations manually (A points to B, B points to C), but it's not a very scalable approach. Once you have more than a trivial number of products, it becomes too unwieldy to maintain (in my opinion).

You may find that something a little more sophisticated serves you better. Take a look at how people use something like Google's Prediction API (http://code.google.com/apis/predict/docs/samples.html#demos) to do this very sort of prediction. In that case you wouldn't store the actual links, but rather which users liked what, and then use that information to build your recommendations.
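To make the "store which users liked what" idea concrete, here is a minimal sketch in plain Python. It is not the Prediction API; it just counts how often two listings are liked by the same user. The `likes` data and the `recommend` function are made up for illustration.

```python
from collections import Counter

# Hypothetical data: which listings each user liked (user -> set of list_ids).
likes = {
    "alice": {1, 2, 3},
    "bob": {2, 3},
    "carol": {2, 9},
}

def recommend(list_id, likes, top_n=3):
    """Rank other listings by how many users liked them together with list_id."""
    counts = Counter()
    for liked in likes.values():
        if list_id in liked:
            # Every other listing this user liked gets one vote.
            counts.update(liked - {list_id})
    return [item for item, _ in counts.most_common(top_n)]

result = recommend(2, likes)
```

Here listing 3 ranks first for listing 2, because two users liked both. No A-to-B links are stored anywhere; they fall out of the like data.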

While the Prediction API isn't a perfect solution for all people, it will give you a pretty easy way to build out a recommendations portion of your site without having to learn machine learning techniques in depth.

As for your table structure above, you wouldn't need to duplicate data if you did do it like that. Instead, I would suggest looking for the element you're on in both columns of the recommends_table.

For example, if you have the following records in "Recommendations_table":

list_id_A, list_id_B
1, 2
9, 12
2, 3

You could grab everything related to "2" by using a query that unions, such as:

select list_id_A from recommendations_table where list_id_B = 2
union
select list_id_B from recommendations_table where list_id_A = 2
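
As a quick way to try this out, here is a sketch that runs the same query against an in-memory SQLite database loaded with the three sample rows. The `related` helper is a name made up for this example, and SQLite is only an assumption; the query itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recommendations_table (list_id_A INTEGER, list_id_B INTEGER)"
)
conn.executemany(
    "INSERT INTO recommendations_table VALUES (?, ?)",
    [(1, 2), (9, 12), (2, 3)],
)

def related(conn, list_id):
    """Return every id linked to list_id, whichever column it was stored in."""
    rows = conn.execute(
        """
        SELECT list_id_A FROM recommendations_table WHERE list_id_B = ?
        UNION
        SELECT list_id_B FROM recommendations_table WHERE list_id_A = ?
        """,
        (list_id, list_id),
    ).fetchall()
    return sorted(row[0] for row in rows)

print(related(conn, 2))  # [1, 3]
```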

Note that UNION (unlike UNION ALL) already returns only unique results, so no extra SQL is needed for that. But in the end, how you populate that information is likely to make more of a difference than anything.

If you wanted to go a step further and use a different technology, such as a NoSQL data store like Cassandra, you could have a column family called recommendations, and your key would be the movie you are viewing. The subsequent column names would then be the recommended movie ids. In that case you would have something like this for the structure:

Key, columns...
Movie A, 4, 5, 67, 1, 9, 3
Movie B, 3, 4, 1

In that case you would pull all the column names for a particular key and that would be your recommendation list.
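
As a rough picture of that lookup, here is a plain-Python stand-in. It is not Cassandra driver code; a dict plays the role of the column family, and `recommendation_list` is a hypothetical helper.

```python
# Stand-in for a "recommendations" column family: row key -> column names.
recommendations = {
    "Movie A": [4, 5, 67, 1, 9, 3],
    "Movie B": [3, 4, 1],
}

def recommendation_list(key):
    """Return all column names (recommended movie ids) stored under a row key."""
    return list(recommendations.get(key, []))

movie_b = recommendation_list("Movie B")
```

The whole recommendation list comes back from a single key lookup, with no union across two columns.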

All of this is really kind of academic without knowing how you plan to populate the data.

夢归不見 2024-10-26 12:13:47

You won't have any duplicates if (list_id_A, list_id_B) is the primary key of Recommends_Table. Also, if you want the links to be two-way, then when inserting a new row into Recommends_Table, say (A, B), you'll also have to insert (B, A). Triggers would help in this case.
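
A minimal sketch of the two-way variant, assuming SQLite (trigger syntax differs between databases). The trigger name `mirror_recommendation` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Recommends_Table (
        list_id_A INTEGER NOT NULL,
        list_id_B INTEGER NOT NULL,
        PRIMARY KEY (list_id_A, list_id_B)
    );

    -- After (A, B) is inserted, also store (B, A); skip it if already present.
    CREATE TRIGGER mirror_recommendation
    AFTER INSERT ON Recommends_Table
    BEGIN
        INSERT OR IGNORE INTO Recommends_Table (list_id_A, list_id_B)
        VALUES (NEW.list_id_B, NEW.list_id_A);
    END;
    """
)

conn.execute("INSERT INTO Recommends_Table VALUES (1, 2)")
rows = conn.execute(
    "SELECT list_id_A, list_id_B FROM Recommends_Table ORDER BY 1, 2"
).fetchall()
print(rows)  # [(1, 2), (2, 1)]
```

With both directions stored, the recommendations for a listing come from a single-column lookup. The primary key rejects a second (1, 2).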

Alternatively, you could insert only (A, B) or only (B, A) and use the query that dmcnelis suggested:

select list_id_A from recommendations_table where list_id_B = 2
union
select list_id_B from recommendations_table where list_id_A = 2

I think the alternative solution is better, because you'll have less data to store in Recommends_Table. But in this case, if you already have an (A, B) row in the table, it would be useless to also insert (B, A). To prevent this, you can again use triggers.
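
One way such a trigger could look, again assuming SQLite. The trigger name `skip_reverse_pair` is made up, and silently skipping the row (rather than raising an error) is a choice made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Recommends_Table (
        list_id_A INTEGER NOT NULL,
        list_id_B INTEGER NOT NULL,
        PRIMARY KEY (list_id_A, list_id_B)
    );

    -- Silently skip (B, A) when (A, B) is already stored.
    CREATE TRIGGER skip_reverse_pair
    BEFORE INSERT ON Recommends_Table
    WHEN EXISTS (
        SELECT 1 FROM Recommends_Table
        WHERE list_id_A = NEW.list_id_B AND list_id_B = NEW.list_id_A
    )
    BEGIN
        SELECT RAISE(IGNORE);
    END;
    """
)

conn.execute("INSERT INTO Recommends_Table VALUES (1, 2)")
conn.execute("INSERT INTO Recommends_Table VALUES (2, 1)")  # skipped by the trigger
pairs = conn.execute(
    "SELECT list_id_A, list_id_B FROM Recommends_Table"
).fetchall()
print(pairs)  # [(1, 2)]
```

Each pair is stored once, and reads go through the union query shown earlier.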
