Need MySQL advice - how to structure a specific MySQL database/schema - example inside

Posted 2024-12-11 22:27:26

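I need some advice on how to structure a database. I'll give a general example of how it should work.

The site will have thousands of users - [one table with the users and their details, I guess - fine]

Say that every day an admin posts some questions/facts, and users pick an answer, e.g. 1, 2, or 3. Each user can pick one answer per question.
[I guess one table holding the questions - or could the questions be split into separate tables by month or year?]

Every user who makes a choice on a particular question (say question/fact #54) has that answer stored. So the simple idea is a new table to store it, e.g. user 1, question #54, answer 1.

But with thousands of users, imagine 30-40 questions/facts per day * thousands of answers * days/years, and so on. I guess that would be too slow.

Another idea is one table per user, but I think that's really bad.

Imagine I need to retrieve the answer history and use that data quickly in other modules. Over time, a table with millions of entries, growing without limit, will be hard to search, right?

The tables will be written rarely and read a lot, because the whole site depends on reads: the users with the most successful answers; the history of all your answers on your profile; the top users with correct answers in each question category (I forgot to mention that questions have different categories - so maybe a new table per category? There will be about 5-10 categories); and monthly and yearly statistics. Past years are kept only for historical purposes and won't be read much on profiles (so maybe the category tables should also be split per year?). It's all statistics about each individual user.

So my question is: how do you think I should structure this?

Thanks in advance,

I'm open to more ideas. I also forgot to ask: PHP + MySQL, or ASPX + MSSQL?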


源来凯始玺欢你 2024-12-18 22:27:26

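What I would consider doing is having a questions table and a responses table.
Each question has its own unique ID, and that ID also appears in the responses table.

An example layout for QUESTIONS would be: QUESTION_ID, TEXT, RESPONSE_VALUES, and an example layout for RESPONSES would be: QUESTION_ID, USER_ID, RESPONSE_ID.

This kind of link between tables is called a "foreign key".
You may also want to brush up on "one-to-many" relationships.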
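As a rough illustration of that layout, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely so it can run without a server; SQLite is a stand-in for MySQL here, and the column types, the sample question text, and the composite primary key (which enforces "one answer per user per question" from the original post) are my assumptions rather than part of the answer above.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.executescript("""
CREATE TABLE questions (
    question_id     INTEGER PRIMARY KEY,
    text            TEXT NOT NULL,
    response_values TEXT NOT NULL            -- allowed choices, e.g. '1,2,3'
);
CREATE TABLE responses (
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    user_id     INTEGER NOT NULL,
    response_id INTEGER NOT NULL,
    PRIMARY KEY (question_id, user_id)       -- assumption: one answer per user per question
);
""")

conn.execute("INSERT INTO questions VALUES (54, 'Example question', '1,2,3')")
conn.execute("INSERT INTO responses VALUES (54, 1, 1)")  # user 1, question #54, answer 1


def try_insert(row):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO responses VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


second_answer_accepted = try_insert((54, 1, 2))  # same user answering question 54 again
orphan_accepted = try_insert((999, 1, 1))        # question 999 does not exist
stored_rows = conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0]
```

Both extra inserts are rejected: the first by the primary key, the second by the foreign key, so the one-to-many link from a question to its responses stays consistent.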

開玄 2024-12-18 22:27:26

In general, a well-designed database is roughly equally fast when querying a table with millions of records as it is when querying a table with thousands of records - as long as you can access data using an index.

On the other hand, the cost of maintenance goes up very quickly once you start optimizing for performance before you need to.

So, I'd recommend designing your database to be easy for developers to understand, writing performance tests, and optimizing only when you really, really need to.

As for your specific question:

table USERS
user_id (primary key)
name
...

table QUESTIONS
question_id (primary key)
question_date
question_text

table ANSWERS
answer_id (primary key)
question_id (foreign key to questions)
answer_text

table USER_ANSWERS
user_id (foreign key to users)
answer_id (foreign key to answers)
is_correct_flag

Create indexes on all the keys, and (probably) question_date; if you need to search by username, create an index on that column too.
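To make the outline concrete, the following is one possible way to write it out as DDL. It is only a sketch: it runs against SQLite through Python's `sqlite3` module as a stand-in for MySQL, and the column types, `NOT NULL` choices, and `idx_...` index names are assumptions of mine, not something the schema above prescribes.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE questions (
    question_id   INTEGER PRIMARY KEY,
    question_date TEXT NOT NULL,             -- ISO date string, e.g. '2024-12-11'
    question_text TEXT NOT NULL
);
CREATE TABLE answers (
    answer_id   INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    answer_text TEXT NOT NULL
);
CREATE TABLE user_answers (
    user_id         INTEGER NOT NULL REFERENCES users(user_id),
    answer_id       INTEGER NOT NULL REFERENCES answers(answer_id),
    is_correct_flag INTEGER NOT NULL         -- 1 = correct, 0 = incorrect
);

-- Indexes on the foreign keys, on question_date, and on the user name.
CREATE INDEX idx_answers_question    ON answers(question_id);
CREATE INDEX idx_user_answers_user   ON user_answers(user_id);
CREATE INDEX idx_user_answers_answer ON user_answers(answer_id);
CREATE INDEX idx_questions_date      ON questions(question_date);
CREATE INDEX idx_users_name          ON users(name);
""")

# List the explicitly created indexes to confirm the script ran as intended.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
)
print(index_names)
```

The primary keys get their own indexes automatically, so only the foreign keys and the two search columns need explicit `CREATE INDEX` statements here.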

Now write your data access queries in SQL - they don't have to be exactly right, just good enough to let you test. Then use a test data generator to populate your tables - I've used DBMonster in the past. Put twice as much data into the database as you expect to ever need.

Now execute your data access queries, and measure the response time. Do this several times, in different sequences - caching etc. on the database can create misleading results. I've found it useful to encapsulate these using a unit testing framework such as PHPUnit - that way, you can re-run the same tests several times.
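The populate-then-measure loop described above might look roughly like this. It is a sketch under several assumptions: SQLite (via Python's `sqlite3`) stands in for MySQL, random rows stand in for a real test data generator, and the table sizes, the two sample queries, and the helper `time_query` are all hypothetical. Real numbers should come from your own queries on your own server.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible test data

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_answers (
    user_id         INTEGER NOT NULL,
    answer_id       INTEGER NOT NULL,
    is_correct_flag INTEGER NOT NULL
);
CREATE INDEX idx_user_answers_user ON user_answers(user_id);
""")

# Hypothetical volumes; scale these to twice what you expect in production.
N_USERS, N_QUESTIONS = 500, 100
rows = [
    # Each question owns three answer ids; every user picks one of them at random.
    (user_id, question_id * 3 + random.randint(1, 3), random.randint(0, 1))
    for user_id in range(1, N_USERS + 1)
    for question_id in range(1, N_QUESTIONS + 1)
]
conn.executemany("INSERT INTO user_answers VALUES (?, ?, ?)", rows)
conn.commit()

QUERIES = {
    "history_for_user": (
        "SELECT answer_id, is_correct_flag FROM user_answers WHERE user_id = ?",
        (42,),
    ),
    "top_users": (
        "SELECT user_id, SUM(is_correct_flag) AS score FROM user_answers "
        "GROUP BY user_id ORDER BY score DESC LIMIT 10",
        (),
    ),
}


def time_query(sql, params, repeats=5):
    """Run one query several times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        timings.append(time.perf_counter() - start)
    return timings


# Shuffle the order so caching effects are not always credited to the same query.
names = list(QUERIES)
random.shuffle(names)
results = {name: time_query(*QUERIES[name]) for name in names}

for name, timings in results.items():
    print(f"{name}: best {min(timings) * 1000:.3f} ms, worst {max(timings) * 1000:.3f} ms")
```

Looking at the best and worst run separately, rather than a single average, makes it easier to spot when the first execution was slow only because nothing was cached yet.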

If you're lucky, you won't have any performance issues at all. If you're not, use EXPLAIN to optimize the queries. If that doesn't work, consider getting better hardware. If that doesn't work, create pre-computed "reporting tables", which aggregate commonly requested data into a simple, flat structure and are updated either in a batch, or when data changes.
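To show what the EXPLAIN step is looking for, here is a small sketch that compares the query plan before and after adding an index. Note the assumption: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module, because that runs without a server. MySQL's `EXPLAIN` reports the same kind of information (whether an index is used or the whole table is scanned) but in a different output format. The query and index name are hypothetical.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_answers (
        user_id         INTEGER NOT NULL,
        answer_id       INTEGER NOT NULL,
        is_correct_flag INTEGER NOT NULL
    )
""")

HISTORY_SQL = "SELECT answer_id, is_correct_flag FROM user_answers WHERE user_id = ?"


def query_plan(sql, params):
    """Return the human-readable steps of SQLite's plan for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


plan_before = query_plan(HISTORY_SQL, (42,))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_user_answers_user ON user_answers(user_id)")
plan_after = query_plan(HISTORY_SQL, (42,))    # index available: index search

print("before:", plan_before)
print("after: ", plan_after)
```

A plan that scans the whole table for a query you run on every page view is usually the first thing worth fixing, and often an index on the filtered column is enough.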

For instance, if you have to report on user scores over a period, you might create a table along the lines of

table USER_SCORE_PERIOD_REPORT
user_id
username
period
score

I like to stick to a naming convention to make sure these "reporting tables" are clearly identified, and not mistaken for regular "transactional" tables.
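A batch rebuild of such a reporting table could be sketched as follows. The assumptions are mine: SQLite (via Python's `sqlite3`) stands in for MySQL, the period is a calendar month in `YYYY-MM` form, the score is simply the number of correct answers, and the sample rows and the `rebuild_report` helper are hypothetical.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users        (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE questions    (question_id INTEGER PRIMARY KEY, question_date TEXT NOT NULL);
CREATE TABLE answers      (answer_id INTEGER PRIMARY KEY, question_id INTEGER NOT NULL);
CREATE TABLE user_answers (user_id INTEGER NOT NULL, answer_id INTEGER NOT NULL,
                           is_correct_flag INTEGER NOT NULL);

CREATE TABLE user_score_period_report (
    user_id  INTEGER NOT NULL,
    username TEXT    NOT NULL,
    period   TEXT    NOT NULL,               -- assumption: calendar month, 'YYYY-MM'
    score    INTEGER NOT NULL,               -- assumption: count of correct answers
    PRIMARY KEY (user_id, period)
);

INSERT INTO users        VALUES (1, 'alice'), (2, 'bob');
INSERT INTO questions    VALUES (1, '2024-11-30'), (2, '2024-12-01'), (3, '2024-12-02');
INSERT INTO answers      VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO user_answers VALUES (1, 10, 1), (1, 20, 1), (1, 30, 0), (2, 20, 1), (2, 30, 1);
""")


def rebuild_report(conn):
    """Recompute the whole report in one transaction (the batch update variant)."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM user_score_period_report")
        conn.execute("""
            INSERT INTO user_score_period_report (user_id, username, period, score)
            SELECT u.user_id,
                   u.name,
                   strftime('%Y-%m', q.question_date),
                   SUM(ua.is_correct_flag)
            FROM user_answers ua
            JOIN users     u ON u.user_id     = ua.user_id
            JOIN answers   a ON a.answer_id   = ua.answer_id
            JOIN questions q ON q.question_id = a.question_id
            GROUP BY u.user_id, u.name, strftime('%Y-%m', q.question_date)
        """)


rebuild_report(conn)
report = conn.execute(
    "SELECT user_id, username, period, score "
    "FROM user_score_period_report ORDER BY user_id, period"
).fetchall()
print(report)
```

Profile and leaderboard pages can then read from the flat report table with a single indexed lookup instead of repeating the three-way join on every request.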

But really, only do this if you know you have a performance problem - this solution creates more things that can break, with more opportunities for bugs.
