MySQL server design and scalability

Posted on 2024-12-31 23:56:05


I have a simple design for a DB used via SQL:

Table1 - recipeTable: key, value1..N, recipeId
Table2 - recipeInstructions: key, instId, text, recipeIdFK (each instruction has an entry)
Table3 - recipeIngredients: key, ingId, text, recipeIdFK (each ingredient has an entry)

The question is: when doing queries on the DB (joining information from all 3 tables), at what point will the table sizes become a problem (e.g. searches taking too much time)?

Should I do some clustering - like one table for ids 1..10000, another for the next range, and so on - or does SQL handle that in its own way?

The table size can get up to 100,000 rows and even more.

Yoav
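
For concreteness, the access pattern being asked about - joining information from all three tables for a recipe - might look like the sketch below. The table and column names follow the schema above; the query itself is an illustration added here, not part of the original question.

-- Hypothetical join across the three tables for one recipe.
SELECT r.recipeId,
       ri.text AS instruction,
       rg.text AS ingredient
FROM   recipeTable        r
JOIN   recipeInstructions ri ON ri.recipeIdFK = r.recipeId
JOIN   recipeIngredients  rg ON rg.recipeIdFK = r.recipeId
WHERE  r.recipeId = 42;

In practice the instructions and ingredients would usually be fetched in two separate queries to avoid multiplying rows, but the single statement shows the join paths involved.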


Comments (2)

蓝天白云 2025-01-07 23:56:05


Firstly, I think you may be missing a table for "ingredients".

Recipe
-------
RecipeID PK
Title
...

RecipeInstruction
------
RecipeInstructionID PK
RecipeID FK
sequence
description

Ingredient
---------
IngredientID
Name
....

RecipeIngredient
------------
RecipeID FK
IngredientID FK
quantity
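
Spelled out as MySQL DDL, the proposed schema might look roughly like the sketch below. The answer only lists column names, so the types, sizes and AUTO_INCREMENT choices here are illustrative assumptions.

-- Rough MySQL DDL for the schema sketched above; column types are assumptions.
CREATE TABLE Recipe (
    RecipeID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    Title    VARCHAR(200) NOT NULL,
    PRIMARY KEY (RecipeID)
);

CREATE TABLE RecipeInstruction (
    RecipeInstructionID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    RecipeID            INT UNSIGNED NOT NULL,
    `sequence`          INT NOT NULL,
    description         TEXT NOT NULL,
    PRIMARY KEY (RecipeInstructionID),
    FOREIGN KEY (RecipeID) REFERENCES Recipe (RecipeID)
);

CREATE TABLE Ingredient (
    IngredientID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    Name         VARCHAR(100) NOT NULL,
    PRIMARY KEY (IngredientID)
);

CREATE TABLE RecipeIngredient (
    RecipeID     INT UNSIGNED NOT NULL,
    IngredientID INT UNSIGNED NOT NULL,
    quantity     VARCHAR(50),
    PRIMARY KEY (RecipeID, IngredientID),
    FOREIGN KEY (RecipeID)     REFERENCES Recipe (RecipeID),
    FOREIGN KEY (IngredientID) REFERENCES Ingredient (IngredientID)
);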

Secondly, the best way to work on scalability questions is to try them out. Design your database, populate with test data (I've used DBMonster in the past), and then work on the queries you need to support. Go for perhaps twice as much data as you think you'll ever generate in real life. The fact you're using a DAL doesn't matter all that much - the DAL just generates the queries for you, and if you get the basics right, you can tune that in the future.
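
If a generator like DBMonster isn't to hand, one low-tech way to bulk up a test table (an illustration, not part of the answer) is to seed a few rows and keep doubling them:

-- Seed a handful of rows...
INSERT INTO Recipe (Title) VALUES ('Pancakes'), ('Soup'), ('Salad');

-- ...then repeat this statement; each run roughly doubles the row count.
-- Around 15-16 repetitions takes 3 seed rows to roughly 100,000-200,000 rows.
INSERT INTO Recipe (Title)
SELECT CONCAT(Title, ' copy') FROM Recipe;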

For your app, I'm guessing you'll want to run queries like:

  • get all recipes with ingredient x
  • get all recipes in category x
  • get all recipes without ingredient x
  • get all recipes that take less than x minutes
  • get all recipes with ingredients (x, y, z)
  • get all recipes with difficulty x

As well as combinations of the above. See if your queries work; if they do, leave well enough alone. Seriously, don't "optimize" just in case. If the queries don't work, tune them. Learn about indexing, and free text search.
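
As one concrete example of the shape these take (names assume the schema above), "get all recipes with ingredient x" and "get all recipes with ingredients (x, y, z)" are both join-and-filter queries:

-- All recipes that use a given ingredient.
SELECT r.RecipeID, r.Title
FROM   Recipe r
JOIN   RecipeIngredient ri ON ri.RecipeID = r.RecipeID
JOIN   Ingredient i        ON i.IngredientID = ri.IngredientID
WHERE  i.Name = 'banana';

-- All recipes that contain every ingredient in a given set.
SELECT r.RecipeID, r.Title
FROM   Recipe r
JOIN   RecipeIngredient ri ON ri.RecipeID = r.RecipeID
JOIN   Ingredient i        ON i.IngredientID = ri.IngredientID
WHERE  i.Name IN ('flour', 'egg', 'milk')
GROUP  BY r.RecipeID, r.Title
HAVING COUNT(DISTINCT i.Name) = 3;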

If that doesn't work, think about buying bigger hardware. In the long run, it's cheaper than exotic design solutions (like "one table for records up to 10000" - just imagine the data access logic for that...).

In very general terms, if your queries can use an index, they'll perform perfectly adequately joining several tables across many millions of records on modern, affordable hardware. It's safe to assume that all your "ID" columns will be indexed, and that searching by index will be fast.
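
With the schema above, InnoDB indexes the primary keys automatically and will index a declared foreign key column if no suitable index exists, but it is still worth making the join indexes explicit and checking them with EXPLAIN (the index names below are made up for illustration):

-- Explicit secondary indexes on the join/filter columns.
CREATE INDEX idx_instruction_recipe          ON RecipeInstruction (RecipeID);
CREATE INDEX idx_recipeingredient_ingredient ON RecipeIngredient (IngredientID);

-- Verify that a query actually uses them.
EXPLAIN
SELECT r.RecipeID, r.Title
FROM   Recipe r
JOIN   RecipeIngredient ri ON ri.RecipeID = r.RecipeID
WHERE  ri.IngredientID = 7;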

What's likely to be slow is searching on text values if you have to support wildcards (e.g. "where ingredient like '%banana%'"). That's where I'd put my energy - MySQL has support for free-text searching, which is pretty quick.
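
A minimal sketch of that, assuming the Ingredient table above (FULLTEXT indexes are supported on MyISAM and, in current MySQL versions, on InnoDB as well):

-- Replace LIKE '%banana%' scans with a full-text index.
ALTER TABLE Ingredient ADD FULLTEXT INDEX ft_ingredient_name (Name);

SELECT IngredientID, Name
FROM   Ingredient
WHERE  MATCH (Name) AGAINST ('banana' IN NATURAL LANGUAGE MODE);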

看海 2025-01-07 23:56:05


Should I do some clustering - like table for ids 1..10000 etc

No. Splitting your data into separate tables is a really dumb idea. OTOH, having one table distributed across multiple disks (or even multiple servers) is a good idea for performance. On MySQL this is usually referred to as partitioning. But there are also good reasons for using RAID-1.
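
For reference, partitioning in MySQL is declared on the table itself; a minimal sketch, reusing the Recipe columns from the first answer (the partition count is arbitrary, and the partitioning key must be part of the primary key):

-- Spread one logical table across several physical partitions.
CREATE TABLE RecipePartitioned (
    RecipeID INT UNSIGNED NOT NULL,
    Title    VARCHAR(200) NOT NULL,
    PRIMARY KEY (RecipeID)
)
PARTITION BY HASH (RecipeID)
PARTITIONS 4;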

As to whether your current approach is scalable... that's like asking how long a piece of string is. But if you're talking about keyword search, then forget about using MySQL FULLTEXT indexes and do it properly. It may work OK with your current data set / query requirements - but it's very limited in functionality and scalability.

can get up to 100,000

(I'm just taking a break from working on a database with 80,000,000 rows running on a single SATA disk, with no performance issues.)
