Optimizing MySQL queries with PHP

Posted on 2024-12-12 01:20:46

Let's assume I have the following query:

SELECT address
FROM addresses a, names n
WHERE a.address_id = n.address_id
GROUP BY n.address_id
HAVING COUNT(*) >= 10

If the two tables were large enough (imagine the whole US population in these two tables), then running EXPLAIN on this SELECT would report Using temporary; Using filesort, which is usually not good.
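
For reference, the check described above can be reproduced by prefixing the query with EXPLAIN. This is only a sketch using the same table and column names as the query; the flags mentioned appear in the Extra column of the result, and the actual plan will depend on the MySQL version, the indexes, and the data.

```sql
-- Ask MySQL for the execution plan instead of running the query.
-- Look at the Extra column for "Using temporary" / "Using filesort".
EXPLAIN
SELECT address
FROM addresses a, names n
WHERE a.address_id = n.address_id
GROUP BY n.address_id
HAVING COUNT(*) >= 10;
```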

If we have a DB with many concurrent INSERTs and SELECTs (like this), would delegating the GROUP BY a.address_id HAVING COUNT(*) >= 10 part to PHP be a good way to minimise DB resource usage? What would be the most efficient way (in terms of computing power) to code this?

EDIT: It seems the consensus is that offloading to PHP is the wrong move. How, then, could I improve the query (let's assume indexes have been created properly)? More specifically, how do I prevent the DB from creating a temporary table?

Comments (2)

独闯女儿国 2024-12-19 01:20:46

So your plan to minimize resources is to suck all the data out of the database and have PHP process it, causing extreme memory usage?

Don't do client-side processing if at all possible - databases are DESIGNED for this sort of heavy work.

朮生 2024-12-19 01:20:46

Offloading this to PHP is probably the opposite of the direction you want to go. If you must do this on a single machine, then the database is likely the most efficient place to do it. If you have a bunch of PHP machines and only a single DB server, then offloading might make sense, but more likely you'll just clobber the IO capacity of the DB. You'll probably get a bigger win by setting up a replica and running your read queries there. Depending on your ratio of SELECT to INSERT queries, you might want to consider keeping a tally table (worthwhile when there are many more SELECTs than INSERTs). The more latency you can allow for your results, the more options you have. If you can allow 5 minutes of latency, then you might start considering a distributed batch processing system like Hadoop rather than a database.
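
To make the tally table idea concrete, here is a minimal sketch in MySQL syntax. The table address_name_counts, its columns, the index name, and the trigger names are all hypothetical; the INT type for address_id and the assumption that names.address_id is never NULL are guesses about a schema the question does not show. It also ignores UPDATEs that move a name from one address to another.

```sql
-- Hypothetical tally table: one row per address with its current name count.
-- Assumes address_id is an INT and names.address_id is NOT NULL.
CREATE TABLE address_name_counts (
    address_id INT NOT NULL PRIMARY KEY,
    name_count INT NOT NULL DEFAULT 0,
    KEY idx_name_count (name_count)
);

-- One-time backfill from the existing rows in names.
INSERT INTO address_name_counts (address_id, name_count)
SELECT address_id, COUNT(*)
FROM names
GROUP BY address_id;

-- Keep the tally current whenever a name is inserted.
CREATE TRIGGER names_after_insert
AFTER INSERT ON names
FOR EACH ROW
    INSERT INTO address_name_counts (address_id, name_count)
    VALUES (NEW.address_id, 1)
    ON DUPLICATE KEY UPDATE name_count = name_count + 1;

-- Mirror trigger for deletes.
CREATE TRIGGER names_after_delete
AFTER DELETE ON names
FOR EACH ROW
    UPDATE address_name_counts
    SET name_count = name_count - 1
    WHERE address_id = OLD.address_id;

-- The original question, answered without GROUP BY / HAVING.
SELECT a.address
FROM address_name_counts c
JOIN addresses a ON a.address_id = c.address_id
WHERE c.name_count >= 10;
```

The trade-off is the one hinted at above: every INSERT into names now also writes a tally row, and many concurrent inserts for the same address will contend on that row, so this is likely to pay off only when reads clearly outnumber writes.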
