Question about database structure and many functions, à la Reddit
I have a question about how a bookmarking site like Reddit would manage all the 'votes' a user has logged.
For example, when I (User_ID_292929) vote up a post (Post_ID_282828), logically somewhere in a database it says that User_ID_292929 has voted up Post_ID_282828.
But how would that be structured in the DB? Would the table that handles user profiles have a field full of comma-separated values that gets exploded and checked to see whether the posts on the page being loaded have been voted up?
I'm not looking for a long answer, but rather an example program or documentation on a similar structure.
Thanks
Comments (5)
Assuming a user can only vote on a particular post once, you could create a new table (let's call it users_vote_posts) with 2 columns (user_id and post_id). Set user_id and post_id together as a composite primary key.
Using your example, let's say a user (User_ID_292929) votes up a post (Post_ID_282828). The table would then hold a single row with user_id 292929 and post_id 282828.
If there is more than one type of vote (either up OR down, for example), then you could add another column that defines the type of vote (let's call it vote_type).
Each row of the table would then hold user_id, post_id and vote_type.
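A minimal sketch of such a table, using Python's built-in sqlite3 module so it runs without a database server. The table and column names come from the answer above; the encoding of vote_type (1 for up, -1 for down) and the helper cast_vote are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users_vote_posts (
        user_id   INTEGER NOT NULL,
        post_id   INTEGER NOT NULL,
        -- assumed encoding: 1 = up vote, -1 = down vote
        vote_type INTEGER NOT NULL CHECK (vote_type IN (1, -1)),
        PRIMARY KEY (user_id, post_id)
    )
    """
)


def cast_vote(user_id, post_id, vote_type):
    """Record a vote; return False if this user already voted on this post."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users_vote_posts (user_id, post_id, vote_type) "
                "VALUES (?, ?, ?)",
                (user_id, post_id, vote_type),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejects a second vote on the same post.
        return False


first = cast_vote(292929, 282828, 1)
second = cast_vote(292929, 282828, 1)
rows = conn.execute(
    "SELECT user_id, post_id, vote_type FROM users_vote_posts"
).fetchall()
print(first, second, rows)
```

Because the pair (user_id, post_id) is the primary key, the database itself enforces the one-vote-per-post rule, so the application does not have to check first.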
The simplest way is to have a table with one column to track the user who voted, and another column with the id of the thing they voted on. You may also have a third column specifying the type of thing they voted on, if the id wasn't unique across all types.
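As a rough illustration of the three-column variant, the sketch below keys each vote on the user, the type of thing, and the thing's id. The names votes, thing_type, thing_id and has_voted are hypothetical, as are the type labels "post" and "comment".

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE votes (
        user_id    INTEGER NOT NULL,
        thing_type TEXT    NOT NULL,  -- hypothetical labels such as 'post' or 'comment'
        thing_id   INTEGER NOT NULL,
        PRIMARY KEY (user_id, thing_type, thing_id)
    )
    """
)

# The same numeric id can appear for two different kinds of thing.
with db:
    db.executemany(
        "INSERT INTO votes (user_id, thing_type, thing_id) VALUES (?, ?, ?)",
        [(292929, "post", 282828), (292929, "comment", 282828)],
    )


def has_voted(user_id, thing_type, thing_id):
    """Return True if the user has a vote recorded for this exact thing."""
    row = db.execute(
        "SELECT 1 FROM votes "
        "WHERE user_id = ? AND thing_type = ? AND thing_id = ?",
        (user_id, thing_type, thing_id),
    ).fetchone()
    return row is not None
```

Including thing_type in the key is what lets post 282828 and comment 282828 coexist without colliding.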
This is what is called a multivalued attribute. When this happens, you use a separate table that holds the information you need. So, you can have userid and postid. Together they would be the primary key for that table: the pair (userid, postid) is unique, so the database cannot contain duplicate votes. If you need more information about a post or a user, you can always use a join in the query to get it.
Also, because the table is small, you can cache it for faster access. Sites like Reddit use caching and clustering extensively.
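To illustrate, here is a small sketch with Python's sqlite3 module. The columns userid and postid follow the answer; the table names posts and votes, the title column, the sample rows, and both helper functions are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (postid INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE votes (
        userid INTEGER NOT NULL,
        postid INTEGER NOT NULL REFERENCES posts (postid),
        PRIMARY KEY (userid, postid)
    );
    INSERT INTO posts VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
    INSERT INTO votes VALUES (292929, 1), (292929, 3), (7, 2);
    """
)


def voted_on_page(userid, page_post_ids):
    """Return the subset of the page's post ids this user has voted on."""
    if not page_post_ids:
        return set()
    placeholders = ",".join("?" for _ in page_post_ids)
    rows = conn.execute(
        f"SELECT postid FROM votes WHERE userid = ? AND postid IN ({placeholders})",
        [userid, *page_post_ids],
    )
    return {postid for (postid,) in rows}


def voted_titles(userid):
    """Join votes to posts to fetch more information about the voted posts."""
    rows = conn.execute(
        "SELECT p.title FROM votes v JOIN posts p ON p.postid = v.postid "
        "WHERE v.userid = ? ORDER BY p.postid",
        (userid,),
    )
    return [title for (title,) in rows]
```

One query per page load is enough to mark every post on the page as voted or not, which replaces exploding a comma-separated field in the user profile.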
I tackled this problem for a website I was making.
Like Reddit, a user could be logged in and see 20+ stories on the home page.
JOINing a Vote table against the User and Story tables would not be terribly efficient for finding out whether the currently logged-in user had voted on each story.
I took a hybrid approach of:
1) Making a 'Vote' table (id, userid, storyid)
2) Adding a 'Voted_Cache' column to 'Story' table that was a comma separated list (CSV) of User IDs that have voted on the story.
Now when I load 20 articles on the home page, I can check to see if the current userid exists in the story.Voted_Cache column, instead of needing to do a JOIN to the Vote table.
The 'Vote' table is the authoritative record of which stories have been voted on, and the Voted_Cache column can be rebuilt from this table if necessary.
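The hybrid approach above could be sketched roughly as follows, again with Python's sqlite3 module. The Vote and Story tables and the Voted_Cache column come from the answer; the UNIQUE constraint, the title column, and the helpers rebuild_cache, add_vote, has_voted and stories_for_home_page are assumptions added for illustration, not the author's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Story (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        Voted_Cache TEXT NOT NULL DEFAULT ''   -- CSV of user ids
    );
    CREATE TABLE Vote (
        id      INTEGER PRIMARY KEY,
        userid  INTEGER NOT NULL,
        storyid INTEGER NOT NULL REFERENCES Story (id),
        UNIQUE (userid, storyid)               -- assumption: one vote per story
    );
    INSERT INTO Story (id, title) VALUES (1, 'Story one'), (2, 'Story two');
    """
)


def rebuild_cache(storyid):
    """Regenerate Voted_Cache for one story from the authoritative Vote table."""
    conn.execute(
        "UPDATE Story SET Voted_Cache = COALESCE("
        "(SELECT group_concat(userid) FROM Vote WHERE storyid = ?), '') "
        "WHERE id = ?",
        (storyid, storyid),
    )


def add_vote(userid, storyid):
    """Insert the vote and refresh the cache in a single transaction."""
    with conn:
        conn.execute(
            "INSERT INTO Vote (userid, storyid) VALUES (?, ?)", (userid, storyid)
        )
        rebuild_cache(storyid)


def has_voted(userid, voted_cache):
    """Compare whole CSV fields so that user 29 does not match '292929'."""
    return str(userid) in voted_cache.split(",")


def stories_for_home_page(userid):
    """Load stories and flag the voted ones without a JOIN to Vote."""
    rows = conn.execute("SELECT id, title, Voted_Cache FROM Story ORDER BY id")
    return [(sid, title, has_voted(userid, cache)) for sid, title, cache in rows]


add_vote(292929, 1)
add_vote(29, 1)
page = stories_for_home_page(292929)
```

Splitting on commas before comparing matters: a plain substring test would report that user 2929 had voted whenever user 292929 had.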
A typical design pattern for this type of problem is to create an association table for the users who vote. The association table could look as simple as
assoc_user_vote - table name
id - primary key
userid
voteid
Each record in the assoc_user_vote table has a unique id (probably auto-incremented or seeded) and contains a user id and a vote id. userid and voteid reference the primary keys of their respective tables.
This pattern supports many votes by a specific user and follows data normalization best practices. http://en.wikipedia.org/wiki/Database_normalization
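A hedged sketch of this association table with Python's sqlite3 module. Only assoc_user_vote and its three columns come from the answer; the users and votes tables, the postid column on votes, the sample rows, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE users (userid INTEGER PRIMARY KEY);
    CREATE TABLE votes (voteid INTEGER PRIMARY KEY, postid INTEGER NOT NULL);
    CREATE TABLE assoc_user_vote (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        userid INTEGER NOT NULL REFERENCES users (userid),
        voteid INTEGER NOT NULL REFERENCES votes (voteid)
    );
    INSERT INTO users VALUES (292929);
    INSERT INTO votes VALUES (1, 282828), (2, 282829);
    INSERT INTO assoc_user_vote (userid, voteid) VALUES (292929, 1), (292929, 2);
    """
)


def votes_by_user(userid):
    """Return every vote id linked to this user, in insertion order."""
    rows = conn.execute(
        "SELECT voteid FROM assoc_user_vote WHERE userid = ? ORDER BY id",
        (userid,),
    )
    return [voteid for (voteid,) in rows]


def link_is_rejected(userid, voteid):
    """Try to link a user to a vote; True if a foreign key blocks the insert."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO assoc_user_vote (userid, voteid) VALUES (?, ?)",
                (userid, voteid),
            )
    except sqlite3.IntegrityError:
        return True
    return False
```

Note that a surrogate id alone does not stop the same (userid, voteid) pair from being inserted twice; a UNIQUE constraint on the pair would be needed for that.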